6238 Commits

Author SHA1 Message Date
Adin Scannell 6b558bb433 Drop nodejs test that started spontaneously failing.
It is unclear exactly what happened in the DNS response that has caused
this test to start breaking. However, since this is unrelated to any code
change, this can be attributed to a non-hermetic or broken test case.

See master failure:
https://buildkite.com/gvisor/pipeline/builds/10462#ae46ee7c-855c-4efe-8165-f0c694557cf9

This may be related to https://github.com/nodejs/node/issues/28790, where
older versions of node are not parsing this field correctly? However, we
would like to retain other tests from the same older version of node.

For posterity, the current serial field appears as:

; <<>> DiG 9.17.19-1-Debian <<>> nodejs.org -t SOA +multiline
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56131
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nodejs.org.            IN SOA

;; ANSWER SECTION:
nodejs.org.             3402 IN SOA meera.ns.cloudflare.com. dns.cloudflare.com. (
                                2264470260 ; serial
                                10000      ; refresh (2 hours 46 minutes 40 seconds)
                                2400       ; retry (40 minutes)
                                604800     ; expire (1 week)
                                3600       ; minimum (1 hour)
                                )

;; Query time: 59 msec
;; SERVER: 127.0.0.1#53(127.0.0.1) (UDP)
;; WHEN: Thu Dec 09 10:35:57 PST 2021
;; MSG SIZE  rcvd: 102

PiperOrigin-RevId: 415308624
2021-12-09 11:17:16 -08:00
Adin Scannell dedb7e6ca1 Align Context API with kernel internals.
This change adapts the existing context to use more suitable non-channel-based
methods. This is a requisite for migrating the kernel internals to a
sleeper-based notification mechanism.

The last uses of amutex outside those migrated as part of this change were
dropped in a previous change. Since amutex depends on the channel-based
implementation, this package is also deleted as part of this change.

PiperOrigin-RevId: 415189675
2021-12-08 23:51:37 -08:00
gVisor bot ba86510559 Internal change.
PiperOrigin-RevId: 415015027
2021-12-08 08:59:36 -08:00
Fabricio Voznika 9768009a79 Don't eat error from epoll_ctl EPOLL_CTL_ADD
Docker maps stdin to `/dev/null`, which doesn't support epoll. Host FD
was ignoring the error and reporting success for the epoll_ctl call from
the container, giving the false impression that epoll would be notified.

This required plumbing failure to all waiter.Waitable.EventRegister
callers and implementers.

Closes #6795

PiperOrigin-RevId: 414797621
2021-12-07 12:36:00 -08:00
Ayush Ranjan d62190f8b5 Fix lisafs bug which tramples dentry UID on remote revalidation.
PiperOrigin-RevId: 414483232
2021-12-06 10:35:24 -08:00
Adin Scannell 3f2ffc9f0c Allow reading for mixed atomic semantics.
This relaxes constraints on mixed atomic / lock protected fields. We
explicitly allow reads in this case, since this should be safe.

PiperOrigin-RevId: 414476414
2021-12-06 10:12:13 -08:00
Zeling Feng 969fb6fb84 Update expectations for generic_dgram_socket_send_recv_test
prepare for removing the docker packetimpact runner entirely

Updates #6835

PiperOrigin-RevId: 414065450
2021-12-03 18:28:03 -08:00
Zeling Feng 791b9428a6 Use net.InterfaceByName to retrieve runsc DUT's device ID.
PiperOrigin-RevId: 413941742
2021-12-03 08:31:34 -08:00
Zach Koopmans 74536aba2c [benchmarks] Don't run vfs1 benchmarks anymore.
PiperOrigin-RevId: 413860206
2021-12-02 23:58:36 -08:00
Ayush Ranjan 15ecf9aaaa Move HostConnectedEndpoint and SCMConnectedEndpoint to transport package.
This is needed so that the connectioned endpoint in the transport package can
use this endpoint to implement host FD based bound endpoints.

I had to simplify some other dependencies to make this possible.
- Removed uniqueid's dependency on transport package completely.
- Removed SCMConnectedEndpoint and HostConnectedEndpoint's dependency on
  control package so they could be moved to transport. control already depends
  on transport.
- The scmRights struct from fsimpl/host/control.go had to be moved into transport
  so that HostConnectedEndpoint could be implemented. But scmRights.Fill()
  could not be moved because it inherently depends on making
  vfs.FileDescriptions, which depends on vfs, which in turn depends on transport.
  So the scmRights -> vfs.FD conversion now happens in the syscall package.

PiperOrigin-RevId: 413839350
2021-12-02 21:16:58 -08:00
gVisor bot 88676421c2 Merge pull request #6857 from tanjianfeng:blog-perf
PiperOrigin-RevId: 413809335
2021-12-02 17:34:01 -08:00
Zeling Feng 40355372f9 Allow DUT binaries to define their own flags
Updates #6835

PiperOrigin-RevId: 413772074
2021-12-02 14:35:15 -08:00
Ghanan Gowripalan c0d0937bb0 Support NATing ICMPv4 Echo packets
Updates #5915.

PiperOrigin-RevId: 413738470
2021-12-02 12:11:59 -08:00
Andrei Vagin b2f8b495ad cgroup/cpuset: handle the offset argument of write methods properly
offset is an offset in a file, so it makes no sense to
drop the first "offset" bytes from the buffer.

Reported-by: syzbot+b9610cff22c10d9bead4@syzkaller.appspotmail.com
PiperOrigin-RevId: 413553935
2021-12-01 17:58:07 -08:00
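The distinction made in b2f8b495ad, between a file offset and an index into the write buffer, can be shown with a small sketch. The `controlFile` type and its `Write` method are hypothetical and not the gVisor cgroupfs code; in particular, treating every write as a whole-value write regardless of offset is an assumption made for illustration.

```go
package main

import "fmt"

// controlFile is a hypothetical stand-in for a cgroup control file
// (for example cpuset.cpus) that stores a single value.
type controlFile struct {
	value string
}

// Write handles a pwrite-style call. offset describes a position in
// the file; it is not an index into src.
func (f *controlFile) Write(src []byte, offset int64) (int, error) {
	// The buggy pattern would be:
	//   src = src[offset:]
	// which discards valid payload bytes (and can panic when offset
	// exceeds len(src)).
	//
	// Assumption: the whole buffer is the new value, whatever the offset.
	_ = offset
	f.value = string(src)
	return len(src), nil
}

func main() {
	f := &controlFile{}
	n, err := f.Write([]byte("0-3"), 2)
	fmt.Println(n, err, f.value)
}
```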
Ghanan Gowripalan 054229a46b Extract NAT types
...to be shared across tests.

PiperOrigin-RevId: 413552100
2021-12-01 17:45:36 -08:00
Nayana Bidari 8777a4f8c6 Increment spurious recovery metric only for RTO.
PiperOrigin-RevId: 413504208
2021-12-01 14:00:10 -08:00
Ghanan Gowripalan 4aec33aac6 Explicitly allow new connections to be created
This change is to prepare for later changes which may determine if a
packet is sent in response to an original packet so that the reply
packet does not create a new connection.

PiperOrigin-RevId: 413501477
2021-12-01 13:50:25 -08:00
Ian Lewis 72e222cd66 Handle empty values for XDG_RUNTIME_DIR properly.
Fixes #6849

PiperOrigin-RevId: 413322019
2021-11-30 22:18:11 -08:00
gVisor bot a68a8e716b Internal change.
PiperOrigin-RevId: 413267107
2021-11-30 16:05:24 -08:00
gVisor bot 68bb74d77a Create a usage command that outputs the read wait times from the sentry.
PiperOrigin-RevId: 413167721
2021-11-30 09:20:41 -08:00
Jianfeng Tan 576c9e2874 blog: Running gVisor in Production at Scale in Ant
A blog about how Ant Group runs gVisor in production at scale.

Signed-off-by: Jianfeng Tan <henry.tjf@antgroup.com>
Signed-off-by: Yong He <chenglang.hy@antgroup.com>
2021-11-30 19:45:29 +08:00
Ian Lewis afd549d79c Check for support for xgetbv
Fixes #5738

PiperOrigin-RevId: 413054140
2021-11-29 21:26:45 -08:00
Ayush Ranjan 91d4826f5b Add support for flexible filename limits for tmpfs.
PiperOrigin-RevId: 413038601
2021-11-29 19:27:46 -08:00
gVisor bot fa4e2fff8a Merge pull request #6821 from dqminh:feature/cgroupv2
PiperOrigin-RevId: 413032543
2021-11-29 18:37:42 -08:00
Andrei Vagin 32205d7a94 tools/install_containerd.sh: compare strings properly
2021-11-26 10:14:09 +00:00
Andrei Vagin 102a00ff5e Add an empty line between header and package
2021-11-26 10:14:09 +00:00
Andrei Vagin 70ae38ab76 buildkite: CgroupDriver has to be cgroupfs
2021-11-26 10:14:09 +00:00
Andrei Vagin 1f9d74e1bb Set native.cgroupdriver=cgroupfs for docker
2021-11-26 10:14:09 +00:00
Andrei Vagin 11700f409f buildkite: set DEBIAN_FRONTEND=noninteractive
2021-11-26 10:14:08 +00:00
Andrei Vagin 60456ecc50 buildkite: runsc tests on cgroupv2
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2021-11-26 10:14:08 +00:00
Daniel Dao caf6f8d152 cgroupv2: fix CPUQuota parsing
CPUQuota can return "max PERIOD", in this case, we detect "max"
and return `-1, nil`, which for the current usecase of detecting
cpu-num from quota should be sufficient.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-11-26 10:14:08 +00:00
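The "max PERIOD" handling described in caf6f8d152 can be sketched as follows. `parseCPUQuota` is a hypothetical helper, not the runsc function; it assumes the cgroup v2 `cpu.max` format of two whitespace-separated fields, where the first is either a quota in microseconds or the literal `max`. Only the `-1, nil` result for `max` comes from the commit message.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUQuota parses the contents of a cgroup v2 cpu.max file
// ("QUOTA PERIOD") and returns the quota as a number of CPUs.
// It returns -1, nil when the quota is "max" (no limit).
func parseCPUQuota(data string) (float64, error) {
	fields := strings.Fields(data)
	if len(fields) != 2 {
		return -1, fmt.Errorf("unexpected cpu.max format: %q", data)
	}
	if fields[0] == "max" {
		// No limit configured; this is not an error.
		return -1, nil
	}
	quota, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		return -1, fmt.Errorf("invalid quota %q: %w", fields[0], err)
	}
	period, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return -1, fmt.Errorf("invalid period %q: %w", fields[1], err)
	}
	if period <= 0 {
		return -1, fmt.Errorf("non-positive period: %d", period)
	}
	return float64(quota) / float64(period), nil
}

func main() {
	for _, in := range []string{"max 100000", "200000 100000"} {
		cpus, err := parseCPUQuota(in)
		fmt.Printf("%q -> %v, %v\n", in, cpus, err)
	}
}
```

A caller that derives a CPU count from the quota can treat the `-1` result as "fall back to the host CPU count".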
Daniel Dao 881a271ff7 runsc: Add cgroup v2 implementation
Adds support for cgroupv2 based on the common cgroup interface.

The cgroupv2 implementation mostly mirrors the structure of cgroupv1,
with many helper functions derived from containerd/cgroups and opencontainers/runc
implementations.  We implemented the following controllers: cpu, cpuset, memory,
io, pids, hugetlb.

In order to avoid upgrading containerd dependency (to get oom poller
implementation), we copied the oom poller implementation for cgroupv2
into shim/oom_v2.go. This requires containerd/cgroups dependency to have
cgroupv2 support which we already have.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-11-26 10:14:08 +00:00
Daniel Dao 51f39a1204 cgroupv2: skip tests on containerd < 1.4
containerd < 1.4 does not support cgroupv2, so we adjust the Make
targets and installer scripts to skip test run on those versions.

Signed-off-by: Daniel Dao <dqminh89@gmail.com>
2021-11-25 11:55:19 +00:00
Kevin Krakauer 5e984d5aa2 conntrack: account for window scaling
Conntrack was not reading the window scale TCP option and thus could reject
valid packets for being beyond the receive window.

Addresses #6734.

PiperOrigin-RevId: 411932393
2021-11-23 17:43:24 -08:00
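The effect described in 5e984d5aa2 can be illustrated with a short sketch. The functions below are hypothetical and not the netstack conntrack code; they assume the standard TCP window scale semantics, where the 16-bit window field is shifted left by the scale negotiated in the SYN, with the shift capped at 14.

```go
package main

import "fmt"

// maxWindowShift is the largest window scale shift TCP permits.
const maxWindowShift = 14

// effectiveWindow returns the receive window in bytes for a raw 16-bit
// window field and a negotiated window scale shift.
func effectiveWindow(rawWindow uint16, shift uint8) uint32 {
	if shift > maxWindowShift {
		shift = maxWindowShift
	}
	return uint32(rawWindow) << shift
}

// withinWindow reports whether a segment ending at sequence number
// segEnd fits in the window that starts at rcvNxt. Unsigned subtraction
// handles sequence number wraparound.
func withinWindow(rcvNxt, segEnd, window uint32) bool {
	return segEnd-rcvNxt <= window
}

func main() {
	const rcvNxt, segEnd = 1000, 101000 // segment ends 100000 bytes ahead

	unscaled := effectiveWindow(65535, 0)
	scaled := effectiveWindow(65535, 7)

	// Ignoring the scale option rejects a segment that is valid once
	// the scale is applied.
	fmt.Println(withinWindow(rcvNxt, segEnd, unscaled))
	fmt.Println(withinWindow(rcvNxt, segEnd, scaled))
}
```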
Kevin Krakauer 654af2af2e Explicitly pass TCP payload size in conntrack
PiperOrigin-RevId: 411925572
2021-11-23 17:00:15 -08:00
Fabricio Voznika 82fd0523dc Skip readonly controllers
Some systems have controller directories created, but they are
read-only. Handle that case and skip optional controllers.

Closes #5887

PiperOrigin-RevId: 411907208
2021-11-23 15:25:50 -08:00
Lucas Manning 2758e11230 Modify udpPacket to hold a PacketBuffer reference instead of a VectorizedView.
PiperOrigin-RevId: 411896048
2021-11-23 14:29:56 -08:00
Kevin Krakauer 2bedb2dc39 mark platforms as Linux-only
Related to #1270.

PiperOrigin-RevId: 411648212
2021-11-22 14:26:07 -08:00
Ayush Ranjan 0fd9b69d5c Add Checked methods to go_marshal.
This is as per the proposal in #6450. I have gated this behind a tag because
this is a very sparsely used feature and would otherwise lead to a lot of
unused generated code.

Secondly, we cannot generate the CheckUnmarshal method for dynamic types. So
the dynamic tag would now require its users to additionally implement a
CheckUnmarshal method, which is more cumbersome.

Fixes #6450

PiperOrigin-RevId: 411197734
2021-11-19 20:16:54 -08:00
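The idea behind the checked methods added in 0fd9b69d5c can be sketched with a hand-written example. The `header` type, its layout, and the `CheckedUnmarshal` name and signature below are assumptions for illustration only; they are not the code that go_marshal generates. The sketch shows a length check performed before decoding, so that a short buffer is reported to the caller instead of causing a panic.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// header is a hypothetical fixed-size wire structure.
type header struct {
	ID  uint32
	Len uint16
}

// headerSize is the encoded size of header in bytes.
const headerSize = 6

// CheckedUnmarshal decodes h from src if src is large enough. It
// returns the remaining bytes and true on success, or src unchanged
// and false when src is too short.
func (h *header) CheckedUnmarshal(src []byte) ([]byte, bool) {
	if len(src) < headerSize {
		return src, false
	}
	h.ID = binary.LittleEndian.Uint32(src[0:4])
	h.Len = binary.LittleEndian.Uint16(src[4:6])
	return src[headerSize:], true
}

func main() {
	var h header
	if _, ok := h.CheckedUnmarshal([]byte{1, 2, 3}); !ok {
		fmt.Println("buffer too short, nothing decoded")
	}
	rest, ok := h.CheckedUnmarshal([]byte{1, 0, 0, 0, 2, 0, 0xff})
	fmt.Println(ok, h.ID, h.Len, len(rest))
}
```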
Adin Scannell 889190828c Skip header install if not available.
PiperOrigin-RevId: 411164318
2021-11-19 16:04:23 -08:00
Kevin Krakauer 91313ede80 buildkite: build Netstack on several platforms
Addresses #6839.

RELNOTES: n/a
PiperOrigin-RevId: 410869145
2021-11-18 12:19:14 -08:00
Adin Scannell f1a46c928f Support STAGED_BINARIES to run prebuilt binaries with the test pipeline.
In some cases, it may be desirable to prebuild binaries and run all tests,
for example to run benchmarks with various experiments. Allow the top-level
Makefile to support this by checking for a STAGED_BINARIES variable.

PiperOrigin-RevId: 410673120
2021-11-17 17:49:35 -08:00
Kevin Krakauer ce194f2c1c Automated rollback of changelist 407638912
PiperOrigin-RevId: 410665707
2021-11-17 17:07:05 -08:00
Ghanan Gowripalan 4ab52f3cfd Drop connection on reply tuple conflict
Updates #6850.

PiperOrigin-RevId: 410368440
2021-11-16 15:44:24 -08:00
Lucas Manning 5117717034 Fix PacketBuffer memory leak.
PiperOrigin-RevId: 410339192
2021-11-16 13:48:30 -08:00
Ghanan Gowripalan 115474bcf3 Always perform NAT in NAT-supported hooks
This avoids a race condition when a packet is being written and the NAT
table is being updated at the same time.

Previously, NAT would only be skipped if either the connection had been
finalized or the hook's relevant NAT (DNAT for Prerouting/Output; SNAT
for Input/Postrouting) had been performed. However, it is possible for
the following sequence of events to occur:

  1) A packet runs the DNAT-related hooks in Prerouting or Output
     but does not perform DNAT as no rule matched the packet.
  2) The NAT table updates such that a DNAT rule will now be performed
     on packets matching the packet's tuple from (1).
  3) A second packet matching the original packet's tuple runs
     the Prerouting or Output hook, now performing DNAT and
     updating the connection.
  4) Either packet goes through the other hooks and finalizes the
     connection.

Here we would have 2 packets that have the same original tuple but have
different destination address/ports after performing all the hooks.
Later packets will look like the second packet in the example but the
first packet may trigger a response that the connection table will not
recognize, potentially leading to an ICMP error or TCP RST.

A similar race exists for SNAT.

To avoid the race, this change guarantees that {D,S}NAT is always
performed on a connection before leaving the relevant hook. This
way we make sure that all packets that are associated with a
connection will have the same tuple, per direction.

PiperOrigin-RevId: 410338441
2021-11-16 13:43:00 -08:00
Rahat Mahmood 0417b99c16 Replace StubMarshallable with go_marshal's marshal dynamic generator.
FUSE introduced StubMarshallable to avoid boilerplate around types
that were either marshalled in one direction, or were dynamically
sized. The marshal dynamic generator can deal with these cases with a
small bit of stubbing per type.

This also allows FUSE types to be treated as full marshal.Marshallable
types, addressing gVisor.dev/issue/3698.

Closes #3698.

PiperOrigin-RevId: 410335216
2021-11-16 13:31:29 -08:00
Zeling Feng b318556f83 Dockerless PacketImpact
- More hermetic test network setup
- Easier to use/setup without docker

Updates #6018, #6835.

PiperOrigin-RevId: 410039248
2021-11-15 11:44:51 -08:00
Adin Scannell 2857afc5e4 Drop final amutex uses.
These amutex lock uses are limited to vfs1 and provide questionable utility.
They protect only offset access, and not blocking operations. In order to
completely remove amutex, drop these uses. The amutex package will be removed
in a subsequent commit, which migrates other (less questionable) uses to a new
Context API.

PiperOrigin-RevId: 409716979
2021-11-13 15:07:19 -08:00
Adin Scannell 91f58d2cc8 Update Waitable API.
Instead of passing the event mask at registration time, pass the mask as part
of the waiter. This makes the mask immutable and simplifies the architecture of
waiters. This is also necessary for a future fix that will allow the fdnotifier
to keep persistent entries, as opposed to requiring constant updates.

This change is intended to be a no-op in terms of function. The only exception
is signalfd, where this mask was abused. To handle this case, the operation of
signalfd changed to allow one layer of indirection.

PiperOrigin-RevId: 409702998
2021-11-13 12:54:39 -08:00
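The shape of the API change in 91f58d2cc8 can be sketched with a small, self-contained model. The `EventMask`, `Entry`, and `Queue` types below are hypothetical simplifications, not the gVisor waiter package; they only illustrate the idea of fixing the mask when the waiter entry is created rather than passing it at registration time.

```go
package main

import (
	"fmt"
	"sync"
)

// EventMask is a hypothetical bitmask of events.
type EventMask uint64

const (
	ReadableEvents EventMask = 1 << iota
	WritableEvents
)

// Entry carries its mask from construction onward; the mask has no
// setter, so it is immutable after NewEntry.
type Entry struct {
	mask     EventMask
	callback func(EventMask)
}

// NewEntry creates an entry interested in the given events.
func NewEntry(mask EventMask, cb func(EventMask)) *Entry {
	return &Entry{mask: mask, callback: cb}
}

// Queue is a minimal waiter queue.
type Queue struct {
	mu      sync.Mutex
	entries []*Entry
}

// EventRegister takes only the entry; no separate mask argument.
func (q *Queue) EventRegister(e *Entry) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.entries = append(q.entries, e)
}

// Notify invokes the callbacks of entries whose mask overlaps events.
// Callbacks run with the lock held, which is acceptable for this
// sketch but means they must not call back into the queue.
func (q *Queue) Notify(events EventMask) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for _, e := range q.entries {
		if m := e.mask & events; m != 0 {
			e.callback(m)
		}
	}
}

func main() {
	var q Queue
	q.EventRegister(NewEntry(ReadableEvents, func(m EventMask) {
		fmt.Println("notified with mask", m)
	}))
	q.Notify(WritableEvents)                  // no callback
	q.Notify(ReadableEvents | WritableEvents) // callback sees only ReadableEvents
}
```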