It is already listed earlier in the command line.
Ordinarily I wouldn't make a change for something this small, but I'm
going to use this change as a way to dip my toes into the BuildKite
build pipeline.
PiperOrigin-RevId: 447618014
When the init task is specifically placed into some initial cgroup,
sandbox users expect to be able to create cgroupfs dirs as the app
uid/gid.
Previously we defaulted the synthetic directories for the initial cgroup
to 0555, which disallows arbitrary users from creating children.
Add a way to specify the ownership and permissions for the initial
cgroup, so that sandbox users can make the initial cgroup dir
writable by the init task's user.
PiperOrigin-RevId: 447614804
When enabled with `AllowUDS`, unix domain sockets can be created in the sandbox
and bound on the host filesystem. The application can listen() and accept() on
these sockets as usual. Accept'ed sockets will be donated to the sandbox,
similar to how connect'ed sockets work.
In order to make notifications like poll work, the gofer donates the host-bound
socket FD to the sandbox, but the seccomp filters will (correctly) prevent the
sandbox from calling listen and accept directly on that FD. Instead, listen and
accept calls must go through the gofer. The donated host FD should only be
used to poll for new incoming connections.
Note that I changed the order of some of the Lisa RPCs in order to group Bind
with the existing similar Connect method. This changes the RPC numbers in a
backwards-incompatible way, but since nobody is using Lisa yet we are OK. It's
better to make these cleanup changes now before we have users and are locked
in.
PiperOrigin-RevId: 447236441
The use of protobuf.Any is convenient, but adds to proto serialization
time and to the number of memory allocations required to send a message.
Instead, we now use an enum to identify the message and use it to
determine how to unmarshal the message on the receiving end. It
also speeds up event consumption by not requiring a map from string
(proto names) to callbacks.
BenchmarkProtoAny-6 115.9 ns/op 210 B/op 4 allocs/op
BenchmarkProtoEnum-6 58.3 ns/op 2 B/op 1 allocs/op
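To illustrate the dispatch idea, here is a minimal sketch in which a numeric message type indexes a fixed table of callbacks, so the receiver needs no string-keyed map. The names MessageType, MsgSyscallRaw, MsgContainerStart, handlers and dispatch are hypothetical and are not the identifiers used by this change.

```go
package main

import "fmt"

// MessageType is a hypothetical enum identifying each message kind on the wire.
type MessageType uint16

const (
	MsgUnknown MessageType = iota
	MsgSyscallRaw
	MsgContainerStart
	msgCount // number of known message types
)

// handler consumes one message payload and reports what it saw.
type handler func(payload []byte) string

// handlers is indexed directly by MessageType, so dispatch needs no string
// hashing and no map lookup. MsgUnknown deliberately has no entry.
var handlers = [msgCount]handler{
	MsgSyscallRaw: func(p []byte) string {
		return fmt.Sprintf("raw syscall, %d bytes", len(p))
	},
	MsgContainerStart: func(p []byte) string {
		return fmt.Sprintf("container start, %d bytes", len(p))
	},
}

// dispatch picks the handler for t and rejects unknown or unhandled types.
func dispatch(t MessageType, payload []byte) (string, error) {
	if t >= msgCount || handlers[t] == nil {
		return "", fmt.Errorf("unknown message type %d", t)
	}
	return handlers[t](payload), nil
}

func main() {
	out, err := dispatch(MsgSyscallRaw, []byte{1, 2, 3})
	fmt.Println(out, err)
}
```

A real receiver would unmarshal the payload into the proto type selected by the enum; the string result here only stands in for that step.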
Updates #4805
PiperOrigin-RevId: 446879057
This is the first set of changes to allow multiple containers in a sandbox.
- Changes to allow kernel.Start() without any tasks.
- New control message to StartContainer() in root namespaces.
- Added new function StartSandbox() to keep the existing behavior separate from
the behavior when multi-container support is enabled.
- Test to verify the new control message with one container.
PiperOrigin-RevId: 446792577
Previously, SO_SNDBUF was effectively a no-op. The stack should make
sure that only SO_SNDBUF bytes are ever in-flight for any given
socket/endpoint.
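As a rough illustration of the accounting involved, the sketch below admits writes only while the in-flight byte count stays within the configured limit. The sendQueue type and its methods are hypothetical, assume a single goroutine (no locking), and are not the netstack implementation.

```go
package main

import "fmt"

// sendQueue is a hypothetical per-endpoint accounting structure.
type sendQueue struct {
	limit    int // SO_SNDBUF value, in bytes
	inFlight int // bytes accepted from the application and not yet released
}

// tryReserve admits as many of the n requested bytes as still fit under
// the limit and returns the number admitted.
func (q *sendQueue) tryReserve(n int) int {
	avail := q.limit - q.inFlight
	if avail <= 0 {
		return 0
	}
	if n > avail {
		n = avail
	}
	q.inFlight += n
	return n
}

// release returns n bytes of budget once the data is no longer in flight.
func (q *sendQueue) release(n int) {
	if n > q.inFlight {
		n = q.inFlight
	}
	q.inFlight -= n
}

func main() {
	q := &sendQueue{limit: 100}
	fmt.Println(q.tryReserve(60)) // fits entirely
	fmt.Println(q.tryReserve(60)) // only the remaining budget is admitted
	q.release(50)
	fmt.Println(q.tryReserve(80)) // budget freed by release is reusable
}
```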
Fuchsia Bug: https://fxbug.dev/99070
PiperOrigin-RevId: 446792223
The purpose of this change is twofold:
1. Simplify AddInstalledRoute by returning a PacketBuffer slice instead of a
PendingRoute. This obviates PendingRoute.Dequeue and PendingRoute.IsEmpty.
2. Address PacketBuffer lifetime issues. With this change, the routing table
will call PacketBuffer.Clone() if it decides to enqueue the packet. When
AddInstalledRoute is called, the caller will then assume ownership of the
relevant packets and is expected to call PacketBuffer.DecRef() after
forwarding.
Updates #7338.
PiperOrigin-RevId: 446740297
The test needs to wait for CreateEndpoint to return before
cleaning up the stack, otherwise if the netstack is slow to
process the final ACK to the handshake the active side of the
connection can race ahead to the end and start tearing down the
stack.
This results in CreateEndpoint failing with a connection
aborted error.
PiperOrigin-RevId: 446515200
Keepalive timers are cleaned up when the socket is closed or reset, but the
user can still call setsockopt(SO_KEEPALIVE) on the closed socket. Not
checking whether the timer has been cleaned up causes a panic. This happened
when a server tried to set SO_KEEPALIVE on every incoming connection and a
port scanner on the network reset the connection immediately after the
handshake.
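A minimal sketch of the kind of guard this implies, assuming the timer pointer is set to nil on cleanup. The endpoint type, keepaliveIdle and the method names are hypothetical and do not reflect the actual TCP endpoint code.

```go
package main

import (
	"fmt"
	"time"
)

const keepaliveIdle = 2 * time.Hour // illustrative idle interval

// endpoint is a hypothetical stand-in for a TCP endpoint.
type endpoint struct {
	keepaliveEnabled bool
	keepaliveTimer   *time.Timer // nil once the endpoint is closed or reset
}

func newEndpoint() *endpoint {
	t := time.NewTimer(keepaliveIdle)
	t.Stop() // the timer is armed only when keepalive is enabled
	return &endpoint{keepaliveTimer: t}
}

// close cleans up the keepalive timer.
func (e *endpoint) close() {
	if e.keepaliveTimer != nil {
		e.keepaliveTimer.Stop()
		e.keepaliveTimer = nil
	}
}

// setKeepalive records the option and touches the timer only if it still
// exists. Without the nil check, the call after close() would panic.
func (e *endpoint) setKeepalive(on bool) {
	e.keepaliveEnabled = on
	if e.keepaliveTimer == nil {
		return
	}
	if on {
		e.keepaliveTimer.Reset(keepaliveIdle)
	} else {
		e.keepaliveTimer.Stop()
	}
}

func main() {
	e := newEndpoint()
	e.close()            // e.g. the peer reset the connection right after the handshake
	e.setKeepalive(true) // the server still sets SO_KEEPALIVE afterwards
	fmt.Println("keepalive option recorded:", e.keepaliveEnabled)
}
```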
PiperOrigin-RevId: 446080773
- Removed old TODO from VFS1. We will not support this option on VFS1.
- Enhanced comments about how tmpfs pages are accounted for by write(2)s.
PiperOrigin-RevId: 446072121
Added raw syscall points to all syscalls. Added schematized syscall
points to the following syscalls:
- read
- close
- socket
- connect
- execve
- creat
- openat
- execveat
Updates #4805
PiperOrigin-RevId: 446008358
Before this change FUSE connections were shared 1:1 with FUSE filesystems, which
is incorrect behavior. A FUSE FD should have a 1:1 relationship with a FUSE
connection, and any number of FUSE filesystems can use the same connection.
PiperOrigin-RevId: 445988328
This change simply mimics what's already done for distribution metrics, by
reusing the same fieldMapper to create field-to-integer mappings that are
go:nosplit safe. In this refactor, since all field slots are pre-allocated and
we use atomicbitops for atomically adding/loading uint64 values, it should be
safe to remove the mutex that backed the original single-field implementation.
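The lock-free shape described above might look roughly like the sketch below. It uses the standard sync/atomic package in place of atomicbitops, a plain map in place of the real fieldMapper, and makes no attempt to be go:nosplit safe; all type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fieldMapper assigns each allowed field value a fixed slot index.
type fieldMapper struct {
	index map[string]int
}

func newFieldMapper(values ...string) fieldMapper {
	m := fieldMapper{index: make(map[string]int, len(values))}
	for i, v := range values {
		m.index[v] = i
	}
	return m
}

// counter holds one pre-allocated uint64 slot per field value. The mapper is
// read-only after construction and slots are only touched atomically, so no
// mutex is needed.
type counter struct {
	mapper fieldMapper
	slots  []uint64
}

func newCounter(values ...string) *counter {
	return &counter{
		mapper: newFieldMapper(values...),
		slots:  make([]uint64, len(values)),
	}
}

// IncrementBy atomically adds n to the slot for field; unknown fields are ignored.
func (c *counter) IncrementBy(field string, n uint64) {
	if i, ok := c.mapper.index[field]; ok {
		atomic.AddUint64(&c.slots[i], n)
	}
}

// Value atomically loads the slot for field; unknown fields read as 0.
func (c *counter) Value(field string) uint64 {
	if i, ok := c.mapper.index[field]; ok {
		return atomic.LoadUint64(&c.slots[i])
	}
	return 0
}

func main() {
	c := newCounter("read", "write")
	var wg sync.WaitGroup
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				c.IncrementBy("read", 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(c.Value("read"), c.Value("write"))
}
```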
PiperOrigin-RevId: 445576069
test/syscalls/linux/futex.cc:
In member function 'virtual void gvisor::testing::
{anonymous}::PrivateAndSharedFutexTest_PIWaiters_Test::TestBody()':
test/syscalls/linux/futex.cc:697:19:
warning: comparison of integer expressions of different signedness:
'std::__atomic_base<int>::__int_type' {aka 'int'} and 'unsigned int'
[-Wsign-compare]
697 | while (a.load() != (FUTEX_WAITERS | gettid())) {
| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test/syscalls/linux/proc_pid_uid_gid_map.cc:207:64:
warning: comparison of integer expressions of different signedness: 'size_t'
{aka 'long unsigned int'} and 'int' [-Wsign-compare]
207 | TEST_PCHECK((n = write(fd, line.c_str(), line.size())) != -1);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
test/syscalls/linux/socket_inet_loopback.cc:964:21:
warning: comparison of integer expressions of different signedness: 'int' and
'std::array<gvisor::testing::FileDescriptor, 1>::size_type' {aka 'long
unsigned int'} [-Wsign-compare]
964 | for (int i = 0; i < std::size(established_clients); i++) {
| ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test/syscalls/linux/socket_inet_loopback.cc:974:21:
warning: comparison of integer expressions of different signedness: 'int' and
'std::array<gvisor::testing::FileDescriptor, 1>::size_type' {aka 'long
unsigned int'} [-Wsign-compare]
974 | for (int i = 0; i < std::size(waiting_clients); i++) {
| ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
test/syscalls/linux/udp_socket.cc:869:6:
warning: suggest explicit braces to avoid ambiguous 'else' [-Wdangling-else]
869 | if (!IsRunningWithHostinet() || GvisorPlatform() == Platform::kPtrace ||
| ^
test/syscalls/linux/socket_unix_unbound_abstract.cc:93:9:
warning: variable 'orig_opts' set but not used [-Wunused-but-set-variable]
93 | int orig_opts;
| ^~~~~~~~~
test/syscalls/linux/socket_unix_unbound_abstract.cc:107:9:
warning: variable 'orig_opts' set but not used [-Wunused-but-set-variable]
107 | int orig_opts;
| ^~~~~~~~~
PiperOrigin-RevId: 445545240
Each syscall provides 4 different points. There is a raw syscall point that
contains the syscall number and all 6 arguments, nothing else. Some syscalls
can provide a schematized version of the syscall by defining a function that
converts the syscall into a proto representing the syscall. Each of these
flavors has a point for enter and another for exit. In both cases, the exit
event adds return value and errno (if any).
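To make the four combinations concrete, here is a small sketch in which every syscall yields a raw point and only syscalls with a registered converter also yield a schematized one. The rawPoint, exitInfo, schematizer and emit names are hypothetical, and a formatted string stands in for the proto message.

```go
package main

import "fmt"

// exitInfo carries the extra data attached to exit points.
type exitInfo struct {
	Result int64
	Errno  int64
}

// rawPoint holds the syscall number and all six arguments, nothing else.
// Exit is nil for an enter point.
type rawPoint struct {
	Sysno uint64
	Args  [6]uint64
	Exit  *exitInfo
}

// schematizer converts a raw point into a syscall-specific representation.
type schematizer func(p rawPoint) string

// schematizers lists the syscalls that opt in to a schematized point.
var schematizers = map[uint64]schematizer{
	0: func(p rawPoint) string { // read(2) on x86-64
		return fmt.Sprintf("read{fd: %d, count: %d}", p.Args[0], p.Args[2])
	},
}

// emit returns the points produced for p: always the raw one, plus the
// schematized one when a converter is registered for the syscall.
func emit(p rawPoint) []string {
	phase, suffix := "enter", ""
	if p.Exit != nil {
		phase = "exit"
		suffix = fmt.Sprintf(" ret=%d errno=%d", p.Exit.Result, p.Exit.Errno)
	}
	out := []string{fmt.Sprintf("raw/%s sysno=%d%s", phase, p.Sysno, suffix)}
	if conv, ok := schematizers[p.Sysno]; ok {
		out = append(out, fmt.Sprintf("schematized/%s %s%s", phase, conv(p), suffix))
	}
	return out
}

func main() {
	enter := rawPoint{Sysno: 0, Args: [6]uint64{3, 0, 128}}
	exit := enter
	exit.Exit = &exitInfo{Result: 128}
	for _, line := range append(emit(enter), emit(exit)...) {
		fmt.Println(line)
	}
}
```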
Updates #4805
PiperOrigin-RevId: 445510907
The sentry may get SIGPIPE from writing to readerless pipes or shutdown
sockets; in these cases, it's incorrect to forward the SIGPIPE to app PID 1.
(It would be correct to send SIGPIPE to the app thread performing the write
instead, but we can't do so with signal forwarding since (1) we can't identify
the correct app thread to signal and (2) Go signal handling with os/signal is
async - the sentry thread that received SIGPIPE continued execution after doing
so, so the app thread may already be running. Having the sentry send SIGPIPE
independently of the host is gvisor.dev/issue/161.)
Updates #7293
PiperOrigin-RevId: 445469810
Two security message points are PointCloneProcess and PointExitNotifyParent
(see pkg/sentry/seccheck/seccheck.go).
Both of these should only trigger when we have a process starting or exiting
respectively. Because of this, only send a start message if "clone()" is
called without CLONE_THREAD set, and only send an exit message if the thread
group (read: process) is exiting.
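The two conditions can be sketched as below. The flag values are the Linux CLONE_VM and CLONE_THREAD constants; startsProcess, threadGroup and exitThread are hypothetical names, and treating "the thread group is exiting" as "its last thread exits" is an assumption of this sketch.

```go
package main

import "fmt"

// Linux clone(2) flag values.
const (
	cloneVM     = 0x00000100
	cloneThread = 0x00010000
)

// startsProcess reports whether a clone() call creates a new process
// (thread group) rather than a thread in the caller's group.
func startsProcess(flags uint64) bool {
	return flags&cloneThread == 0
}

// threadGroup is a hypothetical stand-in that only tracks live threads.
type threadGroup struct {
	liveThreads int
}

// exitThread records one thread exiting and reports whether the whole
// group is now gone, which is when an exit message would be sent.
func (tg *threadGroup) exitThread() bool {
	tg.liveThreads--
	return tg.liveThreads == 0
}

func main() {
	fmt.Println("fork-like clone sends start:", startsProcess(0))
	fmt.Println("thread clone sends start:", startsProcess(cloneVM|cloneThread))

	tg := &threadGroup{liveThreads: 2}
	fmt.Println("first thread exit sends exit:", tg.exitThread())
	fmt.Println("last thread exit sends exit:", tg.exitThread())
}
```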
PiperOrigin-RevId: 445334328
This led to circular locking since procfs acquires Task.mu while
holding kernfs.Filesystem.mu. The procfs case is harder to break, as
procfs needs to acquire an mm reference during a filesystem operation.
PiperOrigin-RevId: 445237505
In particular, this change adds support for AddInstalledRoute,
RemoveInstalledRoute, and GetLastUsedTimestamp.
Updates #7338.
PiperOrigin-RevId: 445205505