6677 Commits

Author SHA1 Message Date
Etienne Perot 3c0fe6d08d BuildKite: Remove duplicate `jq` from list of packages to install.
It is already listed earlier in the command line.

Ordinarily I wouldn't make a change for something this small, but I'm
going to use this change as a means to dip my toes with the BuildKite
build pipeline.

PiperOrigin-RevId: 447618014
2022-05-09 19:12:44 -07:00
Rahat Mahmood 24f0686ac6 cgroupfs: Set initial cgroup ownership based on initial app uid/gid.
When the init task is specifically placed into some initial cgroup,
sandbox users expect to be able to create cgroupfs dirs as the app
uid/gid.

Previously we defaulted the synthetic directories for the initial cgroup
to 0555, which disallows arbitrary users from creating children.

Add a way to specify the ownership and permissions for the initial
cgroup, so that sandbox users can make the initial cgroup dir
writable by the init task's user.

PiperOrigin-RevId: 447614804
2022-05-09 18:49:20 -07:00
Lucas Manning 04edcf5e6c Replace VectorisedView in link endpoints with pkg/buffer.Buffer.
PiperOrigin-RevId: 447562596
2022-05-09 14:22:08 -07:00
Fabricio Voznika b3609b7167 Fix race in remote_test.go
PiperOrigin-RevId: 447505504
2022-05-09 10:32:40 -07:00
Nicolas Lacasse d5002c6adc Allow creating unix domain sockets on the host, behind a flag.
When enabled with `AllowUDS`, unix domain sockets can be created in the sandbox
and bound on the host filesystem. The application can listen() and accept() on
these sockets as usual. Accept'ed sockets will be donated to the sandbox,
similar to how connect'ed sockets work.

In order to make notifications like poll work, the gofer donates the host-bound
socket FD to the sandbox, but the seccomp filters will (correctly) prevent the
sandbox from calling listen and accept directly on that FD. Instead, listen and
accept calls must go through the gofer. The donated host FD should only be
used to poll for new incoming connections.

Note that I changed the order of some of the Lisa RPCs in order to group Bind
with the existing similar Connect method. This changes the RPC numbers in a
backwards-incompatible way, but since nobody is using Lisa yet we are OK. It's
better to make these cleanup changes now before we have users and are locked
in.

PiperOrigin-RevId: 447236441
2022-05-07 18:27:18 -07:00
Rahat Mahmood 409f32743b cgroupfs: Per cgroups(7), PID/TID 0 refers to the current task.
PiperOrigin-RevId: 447090229
2022-05-06 16:06:43 -07:00
Fabricio Voznika 71add37390 Move test server into a separate package
It will be used for tests in other packages.

Updates #4805

PiperOrigin-RevId: 447076300
2022-05-06 14:54:27 -07:00
Lucas Manning 592fc1bb50 Convert fdbased link endpoints to use pkg/buffer instead of VectorizedViews.
PiperOrigin-RevId: 447073885
2022-05-06 14:42:53 -07:00
Ghanan Gowripalan a96653a412 Only return ErrNoBufferSpace if RECVERR is set
...to match Linux behaviour.

PiperOrigin-RevId: 447044373
2022-05-06 12:28:44 -07:00
Konstantin Bogomolov 7574e4f642 Add KVM specific metrics.
This change adds counter and timer metrics useful for analyzing the KVM
platform.

PiperOrigin-RevId: 447043888
2022-05-06 12:21:49 -07:00
Fabricio Voznika 368a4fe8b3 Refactor subcommands error handling into a separate package
This is going to be used by the trace subcommand which lives in
another package.

Updates #4805

PiperOrigin-RevId: 447006075
2022-05-06 09:40:39 -07:00
Fabricio Voznika a23e60af39 Fire clone point for thread creation
Thread creation tracking is required by Falco.

Updates #4805

PiperOrigin-RevId: 447003670
2022-05-06 09:28:44 -07:00
Fabricio Voznika 2d6e64019b Faster proto serialization
The use of protobuf.Any is convenient, but adds to proto serialization
time and number of memory allocations required to send a message.
Instead, we now use an enum to identify the message and use it to
determine how to unmarshal the message on the receiving end. It
also speeds up event consumption by not requiring a map from string
(proto names) to callbacks.

BenchmarkProtoAny-6   115.9 ns/op        210 B/op       4 allocs/op
BenchmarkProtoEnum-6   58.3 ns/op          2 B/op       1 allocs/op

Updates #4805

PiperOrigin-RevId: 446879057
2022-05-05 19:29:49 -07:00
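The enum-based dispatch described in commit 2d6e64019b can be illustrated with a minimal sketch. All names here (`MessageType`, `Handler`, `Dispatch`, the message kinds) are hypothetical and are not taken from the gVisor source; the point is only that an enum value can index a handler table directly, with no string-keyed map lookup.

```go
package main

import "fmt"

// MessageType is a hypothetical enum identifying the payload kind.
type MessageType uint16

const (
	MsgUnknown MessageType = iota
	MsgTaskExit
	MsgSyscallRead
	msgCount // number of message types; keep last
)

// Handler consumes the raw bytes of one message.
type Handler func(payload []byte) string

// handlers is indexed directly by MessageType, so dispatch is a bounds
// check plus an array load rather than a map lookup keyed by proto name.
var handlers = [msgCount]Handler{
	MsgTaskExit:    func(p []byte) string { return fmt.Sprintf("task_exit(%d bytes)", len(p)) },
	MsgSyscallRead: func(p []byte) string { return fmt.Sprintf("read(%d bytes)", len(p)) },
}

// Dispatch routes a payload to its handler using the enum value.
func Dispatch(t MessageType, payload []byte) (string, error) {
	if t >= msgCount || handlers[t] == nil {
		return "", fmt.Errorf("unknown message type %d", t)
	}
	return handlers[t](payload), nil
}

func main() {
	out, err := Dispatch(MsgTaskExit, []byte{1, 2, 3})
	fmt.Println(out, err)
}
```

In a real consumer the handler would unmarshal the payload into the concrete proto type selected by the enum; that step is left out of this sketch.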
gVisor bot 86b29f5074 Use syscall() for XattrWithOPath
Avoids Bionic when running on Android.
See https://android-review.googlesource.com/c/platform/bionic/+/152663.

PiperOrigin-RevId: 446795777
2022-05-05 13:09:46 -07:00
gVisor bot 3b11bfe105 Merge pull request #7391 from zhlhahaha:2477
PiperOrigin-RevId: 446793644
2022-05-05 13:07:01 -07:00
Nayana Bidari 0c54ff1ffe Allow sandbox to start without any tasks.
This is the first set of changes to allow multiple containers in a sandbox.
- Changes to allow kernel.Start() without any tasks.
- New control message to StartContainer() in root namespaces.
- Added new function StartSandbox() to keep the existing behavior separate from
the behavior when multi-container mode is enabled.
- Test to verify the new control message with one container.

PiperOrigin-RevId: 446792577
2022-05-05 12:59:36 -07:00
Ghanan Gowripalan bb36c43e97 Respect SO_SNDBUF for network datagram endpoints
Previously, SO_SNDBUF was effectively a no-op. The stack should make
sure that only SO_SNDBUF bytes are ever in-flight for any given
socket/endpoint.

Fuchsia Bug: https://fxbug.dev/99070

PiperOrigin-RevId: 446792223
2022-05-05 12:52:34 -07:00
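The in-flight accounting that commit bb36c43e97 describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the netstack implementation: `sendBudget`, `Reserve`, `Release`, and `errBudgetExceeded` are hypothetical names, and whether the real endpoint blocks or fails when the budget is exhausted is not stated in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBudgetExceeded is a hypothetical error for a full send budget.
var errBudgetExceeded = errors.New("send budget exceeded")

// sendBudget tracks how many bytes a hypothetical endpoint has in flight.
type sendBudget struct {
	mu       sync.Mutex
	limit    int // stands in for the SO_SNDBUF value
	inFlight int
}

// Reserve admits a packet only if it fits within the remaining budget.
func (b *sendBudget) Reserve(n int) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.inFlight+n > b.limit {
		return errBudgetExceeded
	}
	b.inFlight += n
	return nil
}

// Release returns bytes to the budget once the packet has left the stack.
func (b *sendBudget) Release(n int) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.inFlight -= n
}

func main() {
	budget := &sendBudget{limit: 1500}
	fmt.Println(budget.Reserve(1000)) // fits within the limit
	fmt.Println(budget.Reserve(1000)) // would exceed the limit
	budget.Release(1000)
	fmt.Println(budget.Reserve(1000)) // fits again after release
}
```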
Nate Hurley 09b7a17066 Cleanup multicast routing table.
The purpose of this change is twofold:

1. Simplify AddInstalledRoute by returning a PacketBuffer slice instead of a
PendingRoute. This obviates PendingRoute.Dequeue and PendingRoute.IsEmpty.
2. Address PacketBuffer lifetime issues. With this change, the routing table
will call PacketBuffer.Clone() if it decides to enqueue the packet. When
AddInstalledRoute is called, the caller will then assume ownership of the
relevant packets and is expected to call PacketBuffer.DecRef() after
forwarding.

Updates #7338.

PiperOrigin-RevId: 446740297
2022-05-05 09:46:49 -07:00
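The ownership rule from commit 09b7a17066 (the table clones on enqueue, and the caller of AddInstalledRoute releases after forwarding) can be sketched as below. The `packet` and `table` types are simplified, hypothetical stand-ins; the real PacketBuffer and routing table have different fields and signatures.

```go
package main

import "fmt"

// packet is a hypothetical stand-in for a reference-counted packet buffer.
type packet struct {
	data     []byte
	refs     int
	released bool
}

func newPacket(data []byte) *packet { return &packet{data: data, refs: 1} }

// Clone returns an independent packet holding its own single reference.
func (p *packet) Clone() *packet {
	return newPacket(append([]byte(nil), p.data...))
}

// DecRef drops one reference and marks the packet released at zero.
func (p *packet) DecRef() {
	p.refs--
	if p.refs == 0 {
		p.released = true
	}
}

// table queues packets for routes that are not yet installed.
type table struct {
	pending map[string][]*packet
}

// Enqueue clones the packet so the table owns its copy; the caller keeps
// ownership of the original.
func (t *table) Enqueue(route string, p *packet) {
	t.pending[route] = append(t.pending[route], p.Clone())
}

// AddInstalledRoute hands the queued packets to the caller, who now owns
// them and is expected to call DecRef on each after forwarding.
func (t *table) AddInstalledRoute(route string) []*packet {
	pkts := t.pending[route]
	delete(t.pending, route)
	return pkts
}

func main() {
	tbl := &table{pending: map[string][]*packet{}}
	orig := newPacket([]byte("payload"))
	tbl.Enqueue("224.0.0.1", orig)
	orig.DecRef() // the sender is done with its own reference

	for _, pkt := range tbl.AddInstalledRoute("224.0.0.1") {
		fmt.Printf("forwarding %d bytes\n", len(pkt.data))
		pkt.DecRef() // the caller releases after forwarding
	}
}
```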
Jamie Liu b86c98c82b Check file permissions before VFS2 overlayfs open.
PiperOrigin-RevId: 446565359
2022-05-04 15:23:31 -07:00
Bhasker Hariharan e89e736f16 Deflake TestCloseRead.
The test needs to wait for CreateEndpoint to return before
cleaning up the stack, otherwise if the netstack is slow to
process the final ACK to the handshake the active side of the
connection can race ahead to the end and start tearing down the
stack.

This results in CreateEndpoint failing with a connection
aborted error.

PiperOrigin-RevId: 446515200
2022-05-04 11:55:10 -07:00
Ghanan Gowripalan da0c67b92a Use different flags for IPV6_RECVERR and IP_RECVERR
PiperOrigin-RevId: 446361103
2022-05-03 21:17:40 -07:00
gVisor bot 13cc10bc0b Implement a simple prefix-Trie structure for storing arbitrary payloads
PiperOrigin-RevId: 446284709
2022-05-03 14:09:32 -07:00
gVisor bot 6077c1cbf1 Internal change.
PiperOrigin-RevId: 446281996
2022-05-03 13:57:56 -07:00
gVisor bot 0025af532e Merge pull request #6608 from kevinGC:hostinet-timeout
PiperOrigin-RevId: 446211347
2022-05-03 09:24:53 -07:00
Lucas Manning ca016724dc Automated rollback of changelist 444723128
PiperOrigin-RevId: 446102766
2022-05-02 21:14:55 -07:00
Zeling Feng f728041258 Do not enable keepalive timer if it was cleaned up
Keepalive timers are cleaned up when the socket is closed or reset. However,
the user can still call setsockopt(SO_KEEPALIVE) on the closed socket, and
failing to check whether the timer has been cleaned up causes a panic. This
happened when a server tried to set SO_KEEPALIVE on every incoming connection
and a port scanner on the network reset each connection immediately after the
handshake.

PiperOrigin-RevId: 446080773
2022-05-02 18:30:44 -07:00
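The guard described in commit f728041258 amounts to checking that the timer still exists before arming it. A minimal sketch, assuming a hypothetical `keepalive` type built on the standard `time.Timer` (the real endpoint uses its own timer type and locking):

```go
package main

import (
	"fmt"
	"time"
)

// defaultIdle is a hypothetical idle period used only for this sketch.
const defaultIdle = 2 * time.Hour

// keepalive is a hypothetical holder for an endpoint's keepalive timer.
type keepalive struct {
	timer   *time.Timer // nil once cleaned up
	enabled bool
}

func newKeepalive() *keepalive {
	timer := time.NewTimer(time.Hour)
	timer.Stop() // created disarmed
	return &keepalive{timer: timer}
}

// cleanup releases the timer, as would happen on close or reset.
func (k *keepalive) cleanup() {
	if k.timer != nil {
		k.timer.Stop()
		k.timer = nil
	}
}

// SetEnabled records the option but only arms the timer if it still
// exists. It reports whether the timer was armed.
func (k *keepalive) SetEnabled(v bool, idle time.Duration) bool {
	k.enabled = v
	if k.timer == nil {
		return false // endpoint already cleaned up; nothing to arm
	}
	if !v {
		k.timer.Stop()
		return false
	}
	k.timer.Reset(idle)
	return true
}

func main() {
	ka := newKeepalive()
	fmt.Println(ka.SetEnabled(true, defaultIdle)) // armed on a live endpoint
	ka.cleanup()
	fmt.Println(ka.SetEnabled(true, defaultIdle)) // no panic after cleanup
}
```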
Ayush Ranjan ea45573148 Update comments about tmpfs size.
- Removed old TODO from VFS1. We will not support this option on VFS1.
- Enhanced comments about how tmpfs pages are accounted for by write(2)s.

PiperOrigin-RevId: 446072121
2022-05-02 17:39:26 -07:00
Kevin Krakauer 88b2fc0942 hostinet: allow getsockopt(SO_RCVTIMEO) and getsockopt(SO_SNDTIMEO)
Fixes #6603
2022-05-02 15:41:14 -07:00
Zeling Feng 4ee0a226fe Remove unnecessary cleanupLocked
handshakeFailed already calls cleanupLocked at the end, so there is no need to
clean up the same endpoint twice.

PiperOrigin-RevId: 446043917
2022-05-02 15:29:45 -07:00
Fabricio Voznika f2b6fbb47e Add Points to some syscalls
Added raw syscall points to all syscalls. Added schematized syscall
points to the following syscalls:

  - read
  - close
  - socket
  - connect
  - execve
  - creat
  - openat
  - execveat

Updates #4805

PiperOrigin-RevId: 446008358
2022-05-02 13:03:04 -07:00
Adin Scannell 87180c225b Stop clobbering the signed packages.
Fixes #5635

PiperOrigin-RevId: 446002436
2022-05-02 12:37:18 -07:00
Lucas Manning 32c474d82f Allow multiple FUSE filesystems to share a connection.
Before this change FUSE connections were shared 1:1 with FUSE filesystems, which
is incorrect behavior. A FUSE FD should have a 1:1 relationship with a FUSE
connection, and any number of FUSE filesystems can use the same connection.

PiperOrigin-RevId: 445988328
2022-05-02 11:41:42 -07:00
Fabricio Voznika 3b269001eb Add container/start context fields
Updates #4805

PiperOrigin-RevId: 445976770
2022-05-02 11:01:08 -07:00
Konstantin Bogomolov bd1dbb001c Add support for more than one field for uint64 metrics.
This change simply mimics what's already done for distribution metrics, by
reusing the same fieldMapper to create field-to-integer mappings that are
go:nosplit safe. In this refactor, since all field slots are pre-allocated and
we use atomicbitops for atomically adding/loading uint64 values, it should be
safe to remove the mutex that backed the original single-field implementation.

PiperOrigin-RevId: 445576069
2022-04-29 21:48:13 -07:00
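The idea in commit bd1dbb001c, mapping a combination of field values to a pre-allocated integer slot and updating it atomically, can be sketched as follows. The `fieldMapper` here is a hypothetical reimplementation written for illustration, not gVisor's type of the same name, and it makes no attempt to be go:nosplit safe.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// fieldMapper maps one value per field to a flat slot index.
type fieldMapper struct {
	fields [][]string // allowed values for each field
}

// numSlots is the product of the number of allowed values per field.
func (m fieldMapper) numSlots() int {
	n := 1
	for _, vals := range m.fields {
		n *= len(vals)
	}
	return n
}

// index returns the slot for the given field values, or -1 if invalid.
func (m fieldMapper) index(values ...string) int {
	if len(values) != len(m.fields) {
		return -1
	}
	idx := 0
	for i, v := range values {
		pos := -1
		for j, allowed := range m.fields[i] {
			if allowed == v {
				pos = j
				break
			}
		}
		if pos < 0 {
			return -1
		}
		idx = idx*len(m.fields[i]) + pos
	}
	return idx
}

// counter holds one pre-allocated slot per field combination, so
// increments need only an atomic add and no mutex.
type counter struct {
	mapper fieldMapper
	slots  []uint64
}

func newCounter(fields ...[]string) *counter {
	m := fieldMapper{fields: fields}
	return &counter{mapper: m, slots: make([]uint64, m.numSlots())}
}

// Add increments the slot for the given field values.
func (c *counter) Add(n uint64, values ...string) bool {
	i := c.mapper.index(values...)
	if i < 0 {
		return false
	}
	atomic.AddUint64(&c.slots[i], n)
	return true
}

// Value loads the current count for the given field values.
func (c *counter) Value(values ...string) uint64 {
	i := c.mapper.index(values...)
	if i < 0 {
		return 0
	}
	return atomic.LoadUint64(&c.slots[i])
}

func main() {
	ctr := newCounter([]string{"read", "write"}, []string{"ok", "err"})
	ctr.Add(2, "write", "err")
	fmt.Println(ctr.Value("write", "err"))
}
```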
Andrei Vagin b286d1c1ba tests: fix compile time warnings
test/syscalls/linux/futex.cc:
In member function 'virtual void gvisor::testing::
{anonymous}::PrivateAndSharedFutexTest_PIWaiters_Test::TestBody()':

test/syscalls/linux/futex.cc:697:19:
warning: comparison of integer expressions of different signedness:
'std::__atomic_base<int>::__int_type' {aka 'int'} and 'unsigned int'
[-Wsign-compare]

  697 |   while (a.load() != (FUTEX_WAITERS | gettid())) {
      |          ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

test/syscalls/linux/proc_pid_uid_gid_map.cc:207:64:
warning: comparison of integer expressions of different signedness: 'size_t'
{aka 'long unsigned int'} and 'int' [-Wsign-compare]

  207 |         TEST_PCHECK((n = write(fd, line.c_str(), line.size())) != -1);
      |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~

test/syscalls/linux/socket_inet_loopback.cc:964:21:
warning: comparison of integer expressions of different signedness: 'int' and
'std::array<gvisor::testing::FileDescriptor, 1>::size_type' {aka 'long
unsigned int'} [-Wsign-compare]

  964 |   for (int i = 0; i < std::size(established_clients); i++) {
      |                   ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

test/syscalls/linux/socket_inet_loopback.cc:974:21:
warning: comparison of integer expressions of different signedness: 'int' and
'std::array<gvisor::testing::FileDescriptor, 1>::size_type' {aka 'long
unsigned int'} [-Wsign-compare]

  974 |   for (int i = 0; i < std::size(waiting_clients); i++) {
      |                   ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~

test/syscalls/linux/udp_socket.cc:869:6:
warning: suggest explicit braces to avoid ambiguous 'else' [-Wdangling-else]

  869 |   if (!IsRunningWithHostinet() || GvisorPlatform() == Platform::kPtrace ||
      |   ^

test/syscalls/linux/socket_unix_unbound_abstract.cc:93:9:
warning: variable 'orig_opts' set but not used [-Wunused-but-set-variable]

   93 |     int orig_opts;
      |         ^~~~~~~~~

test/syscalls/linux/socket_unix_unbound_abstract.cc:107:9:
warning: variable 'orig_opts' set but not used [-Wunused-but-set-variable]

  107 |     int orig_opts;
      |         ^~~~~~~~~

PiperOrigin-RevId: 445545240
2022-04-29 17:59:14 -07:00
Bhasker Hariharan 2b0646de86 Move datagram_test to transport_test package.
Fixes #7392

PiperOrigin-RevId: 445542709
2022-04-29 17:42:40 -07:00
Bhasker Hariharan 19e9ee104a Bump default gonet.ListenTCP backlog to 4096.
Fixes #7379

PiperOrigin-RevId: 445541836
2022-04-29 17:40:17 -07:00
Bhasker Hariharan 2d9f1c3329 Add test to verify zero linger behavior.
PiperOrigin-RevId: 445541797
2022-04-29 17:33:21 -07:00
Adin Scannell efdf8dd715 Don't push images or the website on tag builds.
Updates #7327

PiperOrigin-RevId: 445535035
2022-04-29 16:51:41 -07:00
Fabricio Voznika 575d76def2 Add support for syscall points
Each syscall provides 4 different points. There is a raw syscall point that
contains the syscall number and all 6 arguments, nothing else. Some syscalls
can provide a schematized version of the syscall by defining a function that
converts the syscall into a proto representing the syscall. Each of these
flavors has a point for enter and another for exit. In both cases, the exit
event adds return value and errno (if any).

Updates #4805

PiperOrigin-RevId: 445510907
2022-04-29 14:49:40 -07:00
Jamie Liu f9afde9b88 Exempt SIGPIPE from sentry signal forwarding.
The sentry may get SIGPIPE from writing to readerless pipes or shutdown
sockets; in these cases, it's incorrect to forward the SIGPIPE to app PID 1.

(It would be correct to send SIGPIPE to the app thread performing the write
instead, but we can't do so with signal forwarding since (1) we can't identify
the correct app thread to signal and (2) Go signal handling with os/signal is
async - the sentry thread that received SIGPIPE continued execution after doing
so, so the app thread may already be running. Having the sentry send SIGPIPE
independently of the host is gvisor.dev/issue/161.)

Updates #7293

PiperOrigin-RevId: 445469810
2022-04-29 11:46:33 -07:00
Fabricio Voznika a1aa00f922 Remove TID from task start
This was used to map kernel.Task to goroutines before
kernel.Task.GoroutineID was available. It's not needed anymore.

PiperOrigin-RevId: 445452190
2022-04-29 10:32:08 -07:00
Zach Koopmans 76023bd2ad Don't send security events for clone/exit tasks that are threads.
Two security message points are PointCloneProcess and PointExitNotifyParent
(see pkg/sentry/seccheck/seccheck.go).
Both of these should only trigger when we have a process starting or exiting
respectively. Because of this, only send a start message if "clone()" is
called without CLONE_THREAD set, and only send an exit message if the thread
group (read: process) is exiting.

PiperOrigin-RevId: 445334328
2022-04-28 23:01:19 -07:00
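The filtering rule in commit 76023bd2ad can be sketched as two predicates. The function names and the thread-count parameter are hypothetical; only the `CLONE_THREAD` flag value (0x00010000 on Linux) is a known constant.

```go
package main

import "fmt"

// cloneThread mirrors the Linux CLONE_THREAD flag value.
const cloneThread = 0x00010000

// shouldSendStart reports whether a clone() call starts a new process,
// that is, whether CLONE_THREAD is not set in its flags.
func shouldSendStart(cloneFlags uint64) bool {
	return cloneFlags&cloneThread == 0
}

// shouldSendExit reports whether a task exit ends the whole thread group.
// remainingThreads is a hypothetical count of threads still alive in the
// group after this task exits.
func shouldSendExit(remainingThreads int) bool {
	return remainingThreads == 0
}

func main() {
	fmt.Println(shouldSendStart(0))           // fork-like clone: new process
	fmt.Println(shouldSendStart(cloneThread)) // new thread: no start message
	fmt.Println(shouldSendExit(0))            // last thread exits: process exit
}
```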
Ayush Ranjan 71764f1b8c Generalize lisafs testsuite for external users.
PiperOrigin-RevId: 445300994
2022-04-28 18:47:54 -07:00
Rahat Mahmood 47b5915a7b Break Task.mu -> kernfs.Filesystem.mu lock chain when managing cgroups.
This led to circular locking since procfs acquires Task.mu while
holding kernfs.Filesystem.mu. The procfs case is harder to break, as
procfs needs to acquire an mm reference during a filesystem operation.

PiperOrigin-RevId: 445237505
2022-04-28 13:45:10 -07:00
Nate Hurley ef4a490693 Expire pending multicast packets/routes.
Updates #7338.

PiperOrigin-RevId: 445231315
2022-04-28 13:21:12 -07:00
Fabricio Voznika e1c4bbccf9 Add sentry/task_exit point
Updates #4805

PiperOrigin-RevId: 445222912
2022-04-28 12:45:48 -07:00
Nate Hurley 9f41bb6d62 Implement remaining multicast routing table operations.
In particular, this change adds support for AddInstalledRoute,
RemoveInstalledRoute, and GetLastUsedTimestamp.

Updates #7338.

PiperOrigin-RevId: 445205505
2022-04-28 11:34:51 -07:00
Fabricio Voznika 548d127739 Refactor code to use seccheck.SendToCheckers
Updates #4805

PiperOrigin-RevId: 445017536
2022-04-27 18:04:14 -07:00
Kevin Krakauer 21e95c8a1c make checkaligned error more helpful by printing type name to use
PiperOrigin-RevId: 445007578
2022-04-27 17:11:03 -07:00