This CL also cleans up the error returned for setting congestion
control which was incorrectly returning EINVAL instead of ENOENT.
PiperOrigin-RevId: 252889093
For sendfile(2), we propagate a TCP error through the system call layer.
This should be eaten if there is a partial result. This change also adds
a test to ensure that there is no panic in this case, for both TCP sockets
and unix domain sockets.
PiperOrigin-RevId: 252746192
Changes netstack to confirm to current linux behaviour where if the backlog is
full then we drop the SYN and do not send a SYN-ACK. Similarly we allow upto
backlog connections to be in SYN-RCVD state as long as the backlog is not full.
We also now drop a SYN if syn cookies are in use and the backlog for the
listening endpoint is full.
Added new tests to confirm the behaviour.
Also reverted the change to increase the backlog in TcpPortReuseMultiThread
syscall test.
Fixes#236
PiperOrigin-RevId: 252500462
We still only advertise a single NUMA node, and ignore mempolicy
accordingly, but mbind() at least now succeeds and has effects reflected
by get_mempolicy().
Also fix handling of nodemasks: round sizes to unsigned long (as
documented and done by Linux), and zero trailing bits when copying them
out.
PiperOrigin-RevId: 251950859
This is necessary for implementing network diagnostic interfaces like
/proc/net/{tcp,udp,unix} and sock_diag(7).
For pass-through endpoints such as hostinet, we obtain the socket
state from the backend. For netstack, we add explicit tracking of TCP
states.
PiperOrigin-RevId: 251934850
This is required to make the shutdown visible to peers outside the
sandbox.
The readClosed / writeClosed fields were dropped, as they were
preventing a shutdown socket from reading the remainder of queued bytes.
The host syscalls will return the appropriate errors for shutdown.
The control message tests have been split out of socket_unix.cc to make
the (few) remaining tests accessible to testing inherited host UDS,
which don't support sending control messages.
Updates #273
PiperOrigin-RevId: 251763060
Multicast packets are special in that their destination address does not
identify a specific interface. When sending out such a packet the multicast
address is the remote address, but for incoming packets it is the local
address. Hence, when looping a multicast packet, the route needs to be
tweaked to reflect this.
PiperOrigin-RevId: 251739298
We don't actually support core dumps, but some applications want to
get/set dumpability, which still has an effect in procfs.
Lack of support for set-uid binaries or fs creds simplifies things a
bit.
As-is, processes started via CreateProcess (i.e., init and sentryctl
exec) have normal dumpability. I'm a bit torn on whether sentryctl exec
tasks should be dumpable, but at least since they have no parent normal
UID/GID checks should protect them.
PiperOrigin-RevId: 251712714
VmData is the size of private data segments.
It has the same meaning as in Linux.
Change-Id: Iebf1ae85940a810524a6cde9c2e767d4233ddb2a
PiperOrigin-RevId: 250593739
After bf959931ddb88c4e4366e96dd22e68fa0db9527c ("wait/ptrace: assume
__WALL if the child is traced") (Linux 4.7), tracees are always eligible
for waiting, regardless of type.
PiperOrigin-RevId: 250399527
The previous commit adds WNOTHREAD support to waitid, so we may as well
complete the upstream change.
Linux added WCLONE, WALL, WNOTHREAD support to waitid(2) in
91c4e8ea8f05916df0c8a6f383508ac7c9e10dba ("wait: allow sys_waitid() to
accept __WNOTHREAD/__WCLONE/__WALL"). i.e., Linux 4.7.
PiperOrigin-RevId: 249560587
Change-Id: Iff177b0848a3f7bae6cb5592e44500c5a942fbeb
Pipe internals are made more efficient by avoiding garbage collection.
A pool is now used that can be shared by all pipes, and buffers are
chained via an intrusive list. The documentation for pipe structures
and methods is also simplified and clarified.
The pipe tests are now parameterized, so that they are run on all
different variants (named pipes, small buffers, default buffers).
The pipe buffer sizes are exposed by fcntl, which is now supported
by this change. A size change test has been added to the suite.
These new tests uncovered a bug regarding the semantics of open
named pipes with O_NONBLOCK, which is also fixed by this CL. This
fix also addresses the lack of the O_LARGEFILE flag for named pipes.
PiperOrigin-RevId: 249375888
Change-Id: I48e61e9c868aedb0cadda2dff33f09a560dee773
* A segment with filesz == 0, memsz > 0 should be an anonymous only
mapping. We were failing to load such an ELF.
* Anonymous pages are always mapped RW, regardless of the segment
protections.
PiperOrigin-RevId: 249355239
Change-Id: I251e5c0ce8848cf8420c3aadf337b0d77b1ad991
This does not actually implement an efficient splice or sendfile. Rather, it
adds a generic plumbing to the file internals so that this can be added. All
file implementations use the stub fileutil.NoSplice implementation, which
causes sendfile and splice to fall back to an internal copy.
A basic splice system call interface is added, along with a test.
PiperOrigin-RevId: 249335960
Change-Id: Ic5568be2af0a505c19e7aec66d5af2480ab0939b
* Creation of files, directories (and other fs objects) in a directory
should always update ctime.
* Same for removal.
* atime should not be updated on lookup, only readdir.
I've also renamed some misleading functions that update mtime and ctime.
PiperOrigin-RevId: 249115063
Change-Id: I30fa275fa7db96d01aa759ed64628c18bb3a7dc7
There is a lot of redundancy that we can simplify in the stat_times
test. This will make it easier to add new tests. However, the
simplification reveals that cached uattrs on goferfs don't properly
update ctime on rename.
PiperOrigin-RevId: 248773425
Change-Id: I52662728e1e9920981555881f9a85f9ce04041cf
The issue with duplicate /proc/sys entries seems to have been fixed in:
PiperOrigin-RevId 229305982
Git hash dc8450b567Fixesgoogle/gvisor#125
PiperOrigin-RevId: 248571903
Change-Id: I76ff3b525c93dafb92da6e5cf56e440187f14579
Some behavior was broken due to the difficulty of running automated raw
socket tests.
Change-Id: I152ca53916bb24a0208f2dc1c4f5bc87f4724ff6
PiperOrigin-RevId: 246747067
bazel has a lot of dependencies and users don't want to install them
just to build gvisor.
These changes allows to run bazel in a docker container.
A bazel cache is on the local file system (~/.cache/bazel), so
incremental builds should be fast event after recreating a bazel
container.
Here is an example how to build runsc:
make BAZEL_OPTIONS="build runsc:runsc" bazel
Change-Id: I8c0a6d0c30e835892377fb6dd5f4af7a0052d12a
PiperOrigin-RevId: 246570877
The test also times out when GCE machine has 2 CPUs. I cannot
repro it locally with a 2 CPU cgroup though. Let's skip the
test when there are 2 CPUs to stop the flakiness and retest it
once the fix is available.
PiperOrigin-RevId: 246523363
Change-Id: I9d9d922a5be3aa7bc91dff5a1807ca99f3f4a4f9
Fixed a small logic error that broke proper accounting of MultiPortEndpoints.
PiperOrigin-RevId: 246502126
Change-Id: I1a7d6ea134f811612e545676212899a3707bc2c2
This requires two changes:
1) Support for more than one socket to join a given multicast group.
2) Duplicate delivery of incoming multicast packets to all sockets listening
for it.
In addition, I tweaked the code (and added a test) to disallow duplicates
IP_ADD_MEMBERSHIP calls for the same group and NIC. This is how Linux does
it.
PiperOrigin-RevId: 246437315
Change-Id: Icad8300b4a8c3f501d9b4cd283bd3beabef88b72
Test times out when it runs on a single core. Skip until the
bug in the Go runtime is fixed.
PiperOrigin-RevId: 245866466
Change-Id: Ic3e72131c27136d58b71f6b11acc78abf55895d4
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.
1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.
Fixes#209
PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
Previously, createAt was eating all errors from FindInode except for EACCES and
proceeding with the creation. This is incorrect, as FindInode can return many
other errors (like ENAMETOOLONG) that should stop creation.
This CL changes createAt to return all errors encountered except for ENOENT,
which we can ignore because we are about to create the thing.
PiperOrigin-RevId: 245773222
Change-Id: I1b317021de70f0550fb865506f6d8147d4aebc56