Commit Graph

6707 Commits

Author SHA1 Message Date
Andrei Vagin 2ed45d0964 buildkite: run "Release tests" on amd64
This test fails on arm64:
ERROR: vdso/BUILD:12:8: Executing genrule //vdso:vdso failed: (Exit 1):
bash failed: error executing command /bin/bash -c ...

Use --sandbox_debug to see verbose messages from the sandbox
gcc: error: unrecognized command line option '-m64'
Target //runsc:runsc failed to build
PiperOrigin-RevId: 449398920
2022-05-17 22:50:39 -07:00
Ayush Ranjan f6ed4523dc Reformat codebase.
PiperOrigin-RevId: 449358041
2022-05-17 17:48:35 -07:00
Andrei Vagin 251f2c0561 mm: use lockdep mutexes
PiperOrigin-RevId: 449117356
2022-05-16 20:15:13 -07:00
Andrei Vagin e7508b5ecc buildkite: require kvm for Benchmarks smoke test
PiperOrigin-RevId: 449079133
2022-05-16 16:22:24 -07:00
Andrei Vagin 58e9ba2724 buildkite: don't run docker tests on COS
On COS, buildkite agents are running inside docker containers and so
we can't modify a docker config and restart a docker daemon.

PiperOrigin-RevId: 449063929
2022-05-16 15:11:16 -07:00
Konstantin Bogomolov 3b83c30622 Fix regression in KVM due to metric in machine.Get().
Replaced timer metric with counter, which seems to give similar performance to
removing the metric outright.

PiperOrigin-RevId: 448576866
2022-05-13 14:24:31 -07:00
Ghanan Gowripalan ae508f4064 Track packets dropped by full device TX queue
QDisc/LinkEndpoint may drop packets if the device's send/transmit queue
is full.

BUG: https://fxbug.dev/98974
PiperOrigin-RevId: 448570489
2022-05-13 13:54:14 -07:00
Fabricio Voznika e189fb6886 Add version handshake before communication is established
Details on how it works are in wire.Handshake.

Updates #4805

PiperOrigin-RevId: 448552448
2022-05-13 12:33:43 -07:00
Fabricio Voznika fa2a88887d Deprecate --vfs2 flag and always enable by default
Also remove VFS1 dimension from runsc unit tests.

Updates #1624

Startblock:
  after 2022-05-30
PiperOrigin-RevId: 448552271
2022-05-13 12:25:09 -07:00
Nate Hurley add538a44f Run multicast cleanup routine only when needed.
Aside from saving resources, this is also a workaround for the case where
timers are bound after the netstack is initialized.

Updates #7338.

PiperOrigin-RevId: 448509194
2022-05-13 09:14:24 -07:00
Andrei Vagin d44862f672 sentry/socket: use lockdep mutexes
PiperOrigin-RevId: 448382489
2022-05-12 18:18:15 -07:00
Konstantin Bogomolov db16575cb5 KVM machine.Get(): Use up vCPU pool before scanning for available ones.
Previously, we created vCPUs on demand, and so taking an unused vCPU from the
pool was expensive. Now, we pre-create all vCPUs at the start, so it makes
sense to use them all to reduce contention.

PiperOrigin-RevId: 448382485
2022-05-12 18:11:30 -07:00
gVisor bot d4652bd6df Use syscall() for fchmod() calls
Avoids Bionic when running on Android.
See https://android-review.googlesource.com/c/platform/bionic/+/127908

PiperOrigin-RevId: 448378850
2022-05-12 17:48:38 -07:00
Rahat Mahmood 700014c960 cgroupfs: Allow explicit "/" as initial cgroup.
PiperOrigin-RevId: 448361452
2022-05-12 16:22:57 -07:00
Nate Hurley e916e4fde3 Remove the MonotonicTime.Nanoseconds() accessor.
Updates #7338.

PiperOrigin-RevId: 448360987
2022-05-12 16:15:15 -07:00
Rahat Mahmood 7eec8dcf9a mm: Protect mm.dumpability with atomic access instead of mm.metadataMu.
Previously, this was a lock order violation, as mm.metadataMutex ->
mm.mappingRWMutex -> kernel.taskSetRWMutex is required when forking
the task image, and ptrace acquired kernel.taskSetRWMutex ->
mm.metadataMutex to check dumpability.

Ptrace doesn't require a critical section around the use of the
dumpability value.
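
A minimal sketch of the idea, assuming a hypothetical `memoryManager` type and integer dumpability values (the real mm types and constants are not shown in this log): a single word that needs no surrounding critical section can be read and written atomically, so readers never take a mutex and the lock ordering problem disappears.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// memoryManager is a hypothetical stand-in for the real mm type.
type memoryManager struct {
	// dumpability is only ever accessed through sync/atomic.
	dumpability int32
}

// Dumpability reads the value without taking any mutex.
func (mm *memoryManager) Dumpability() int32 {
	return atomic.LoadInt32(&mm.dumpability)
}

// SetDumpability stores a new value without taking any mutex.
func (mm *memoryManager) SetDumpability(d int32) {
	atomic.StoreInt32(&mm.dumpability, d)
}

func main() {
	mm := &memoryManager{}
	mm.SetDumpability(1)
	fmt.Println(mm.Dumpability())
}
```

This only works because the caller needs one consistent read of the value, not a guarantee that it stays unchanged afterwards.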

PiperOrigin-RevId: 448360804
2022-05-12 16:08:33 -07:00
Ayush Ranjan 14c5686d50 Add instructions on how to clean up exclude file for runtime tests.
PiperOrigin-RevId: 448356710
2022-05-12 15:48:49 -07:00
Adin Scannell 47b001caef Set the default queue for the pipeline.
PiperOrigin-RevId: 448341915
2022-05-12 14:41:22 -07:00
Shambhavi Srivastava a7cad2b092 Tmpfs with size option enabled bug fix.
Adding test case to check empty size parsing in Linux.
If no value is passed with size, mount(2) should return EINVAL.
Handling empty size parsing in gVisor.

PiperOrigin-RevId: 448132149
2022-05-11 18:36:17 -07:00
Adin Scannell 74f3befece Convert pipeline to use common metrics publisher.
PiperOrigin-RevId: 448075984
2022-05-11 13:52:29 -07:00
Ayush Ranjan 1557eb9611 Clean up runtime test exclude file.
I followed the following procedure to generate this change:
- Query the runtime test image to see if the excluded test even exists. Many
  tests were removed from upstream because they were broken or superseded by
  another test. No point in excluding them now, so removed all these entries
  from the exclude file.
- Run the remaining tests with runc. If the test fails with runc (natively)
  then we can consider the test "broken". Such tests do not signal a
  compatibility gap in gVisor. Marked such tests with `Broken test`.
- Run the non-broken tests with runsc (gVisor). If they are now passing, remove
  them from the exclude list. This effectively increases our testing surface.
  More importantly, these tests were once failing with gVisor. Now they pass.
  Something was fixed in between. These tests now act as regression tests for
  the fix.
- Some tests were marked flaky. Before removing them from the exclude list, we
  need to ensure that they really don't flake. I ran some of these flaky tests
  100 times to test for flakiness and only removed them if they passed 100
  times. But maybe some test has a flake rate <1%. Or some flaky tests were not
  marked flaky. So we might end up re-enabling flaky tests. This may cause
  pain. We will need to investigate such tests when this happens.
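
To put a number on the flakiness caveat in the last step: assuming independent runs with a fixed flake rate (an idealization, not something measured here), the chance that a flaky test still passes every one of N runs is (1 - rate)^N. The helper name below is illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// passAllProbability returns the probability that a test with the given
// per-run flake rate passes `runs` independent runs in a row.
func passAllProbability(flakeRate float64, runs int) float64 {
	return math.Pow(1-flakeRate, float64(runs))
}

func main() {
	for _, rate := range []float64{0.05, 0.01, 0.001} {
		fmt.Printf("flake rate %.3f: passes 100 runs with probability %.3f\n",
			rate, passAllProbability(rate, 100))
	}
}
```

Under this model a test that flakes 1% of the time still passes 100 consecutive runs more than a third of the time, so 100 clean runs cannot rule out low flake rates.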

PiperOrigin-RevId: 448073470
2022-05-11 13:40:41 -07:00
Kevin Krakauer 9bc06340b2 Describe what iptables checkescape covers specifically
PiperOrigin-RevId: 448037248
2022-05-11 11:07:54 -07:00
Andrei Vagin 3f44cd556b sentry/socket: don't release a connected endpoint under the endpoint mutex
It isn't required and can have side effects. For example, the current endpoint
can be in an SCM message that is queued to the connected endpoint.

PiperOrigin-RevId: 447906621
2022-05-10 22:16:49 -07:00
Etienne Perot 6af4eedc21 runsc: Support `%ID%` substitution in more path flags than just `--debug-log`.
This allows things like CPU profiles to be written out to sandbox-specific
file paths.
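
A rough sketch of what such a substitution could look like. The helper name and the example path are hypothetical and this is not runsc's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// substituteID is a hypothetical helper: it replaces every "%ID%"
// placeholder in a path with the sandbox ID.
func substituteID(path, sandboxID string) string {
	return strings.ReplaceAll(path, "%ID%", sandboxID)
}

func main() {
	fmt.Println(substituteID("/tmp/profiles/%ID%.cpu.prof", "sandbox-123"))
}
```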

PiperOrigin-RevId: 447867091
2022-05-10 17:30:21 -07:00
Fabricio Voznika f62143f31f Ensure that errors from the shim are properly translated
Golang wrapped errors are lost when they go through gRPC. Every
error returned from `task.TaskService` interface should be
translated using `errdefs.ToGRPC`.

Fixes #7504

PiperOrigin-RevId: 447863123
2022-05-10 17:09:00 -07:00
Fabricio Voznika f34e34b3c3 Add runsc trace commands
The trace commands allow a user to manipulate trace sessions.

`runsc trace create <name> --config <file>` => creates a new trace session
`runsc trace delete <name>` => deletes an existing trace session
`runsc trace list` => lists all running trace sessions
`runsc trace metadata` => lists all points with their respective optional and
context fields

This allows trace sessions to be created/deleted on a running sandbox. Note
that the system currently only allows a single trace session to exist, named
'Default'. Attempts to manipulate other sessions will error out.

Updates #4805

PiperOrigin-RevId: 447815153
2022-05-10 13:40:39 -07:00
Arthur Sfez 944b941f9d Allow martian packets on packet socket dgram tests
Before this change, this test fixture expected martian packets to be allowed
already instead of updating the config.

The Teardown method also restores the initial configuration, so that
these tests do not permanently change the config of the machines that
run the tests.

PiperOrigin-RevId: 447814273
2022-05-10 13:38:26 -07:00
Kevin Krakauer f0737cc307 netstack: use checkescape in iptables hot paths
These functions used to allocate even when iptables were disabled. Prevent that
from happening again.

Update to also use gVisor's sync package, as we were using the standard one.

PiperOrigin-RevId: 447812920
2022-05-10 13:31:40 -07:00
Andrei Vagin 3aab92297a Add new locks with the correctness validator
All locks are separated into classes. The validator builds a dependency
graph and checks that it doesn't have cycles.
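
A simplified sketch of this kind of validator, not the actual implementation: lock classes are graph nodes, "B acquired while A is held" is an edge A -> B, and an acquisition is rejected if it would close a cycle. The `graph` type and its methods are illustrative names.

```go
package main

import "fmt"

// graph maps a lock class to the classes acquired while it was held.
type graph map[string]map[string]bool

// reachable reports whether dst can be reached from src by following edges.
func (g graph) reachable(src, dst string, seen map[string]bool) bool {
	if src == dst {
		return true
	}
	if seen[src] {
		return false
	}
	seen[src] = true
	for next := range g[src] {
		if g.reachable(next, dst, seen) {
			return true
		}
	}
	return false
}

// record notes that `acquired` was taken while `held` was held. It returns
// an error instead of adding the edge if the edge would create a cycle.
func (g graph) record(held, acquired string) error {
	if g.reachable(acquired, held, map[string]bool{}) {
		return fmt.Errorf("lock order violation: %s -> %s closes a cycle", held, acquired)
	}
	if g[held] == nil {
		g[held] = map[string]bool{}
	}
	g[held][acquired] = true
	return nil
}

func main() {
	g := graph{}
	fmt.Println(g.record("A", "B")) // first ordering is accepted
	fmt.Println(g.record("B", "C")) // consistent with the first
	fmt.Println(g.record("C", "A")) // rejected: A -> B -> C -> A
}
```

Working on classes rather than individual lock instances means an ordering bug can be reported the first time both orders are observed, even if the run never deadlocks.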

PiperOrigin-RevId: 447812244
2022-05-10 13:24:51 -07:00
Kevin Krakauer a514a12c09 netstack: reduce allocations
Calls to IPTables.Check* functions were allocating on every single call, even
when IPTables were disabled. Changing from a method pointer to a function
pointer is enough to let the compiler know that nothing escapes and no
allocations are necessary.

PiperOrigin-RevId: 447792280
2022-05-10 12:00:56 -07:00
Etienne Perot 3c0fe6d08d BuildKite: Remove duplicate `jq` from list of packages to install.
It is already listed earlier in the command line.

Ordinarily I wouldn't make a change for something this small, but I'm
going to use this change as a means to dip my toes with the BuildKite
build pipeline.

PiperOrigin-RevId: 447618014
2022-05-09 19:12:44 -07:00
Rahat Mahmood 24f0686ac6 cgroupfs: Set initial cgroup ownership based on initial app uid/gid.
When the init task is specifically placed into some initial cgroup,
sandbox users expect to be able to create cgroupfs dirs as the app
uid/gid.

Previously we defaulted the synthetic directories for the initial cgroup
to 0555, which disallows arbitrary users from creating children.

Add a way to specify the ownership and permissions for the initial
cgroup, and sandbox users can use these to make the initial cgroup dir
writable by the init task's user.

PiperOrigin-RevId: 447614804
2022-05-09 18:49:20 -07:00
Lucas Manning 04edcf5e6c Replace VectorisedView in link endpoints with pkg/buffer.Buffer.
PiperOrigin-RevId: 447562596
2022-05-09 14:22:08 -07:00
Fabricio Voznika b3609b7167 Fix race in remote_test.go
PiperOrigin-RevId: 447505504
2022-05-09 10:32:40 -07:00
Nicolas Lacasse d5002c6adc Allow creating unix domain sockets on the host, behind a flag.
When enabled with `AllowUDS`, unix domain sockets can be created in the sandbox
and bound on the host filesystem. The application can listen() and accept() on
these sockets as usual. Accept'ed sockets will be donated to the sandbox,
similar to how connect'ed sockets work.

In order to make notifications like poll work, the gofer donates the host-bound
socket FD to the sandbox, but the seccomp filters will (correctly) prevent the
sandbox from calling listen and accept directly on that FD. Instead, listen and
accept calls must go through the gofer. The donated host FD should only be
used to poll for new incoming connections.

Note that I changed the order of some of the Lisa RPCs in order to group Bind
with the existing similar Connect method. This changes the RPC numbers in a
backwards-incompatible way, but since nobody is using Lisa yet we are OK. It's
better to make these cleanup changes now before we have users and are locked
in.

PiperOrigin-RevId: 447236441
2022-05-07 18:27:18 -07:00
Rahat Mahmood 409f32743b cgroupfs: Per cgroups(7), PID/TID 0 refers to the current task.
PiperOrigin-RevId: 447090229
2022-05-06 16:06:43 -07:00
Fabricio Voznika 71add37390 Move test server into a separate package
It will be used for tests in other packages.

Updates #4805

PiperOrigin-RevId: 447076300
2022-05-06 14:54:27 -07:00
Lucas Manning 592fc1bb50 Convert fdbased link endpoints to use pkg/buffer instead of VectorizedViews.
PiperOrigin-RevId: 447073885
2022-05-06 14:42:53 -07:00
Ghanan Gowripalan a96653a412 Only return ErrNoBufferSpace if RECVERR is set
...to match Linux behaviour.

PiperOrigin-RevId: 447044373
2022-05-06 12:28:44 -07:00
Konstantin Bogomolov 7574e4f642 Add KVM specific metrics.
This change adds counter and timer metrics useful for analyzing the KVM
platform.

PiperOrigin-RevId: 447043888
2022-05-06 12:21:49 -07:00
Fabricio Voznika 368a4fe8b3 Refactor subcommands error handling into a separate package
This is going to be used by the trace subcommand which lives in
another package.

Updates #4805

PiperOrigin-RevId: 447006075
2022-05-06 09:40:39 -07:00
Fabricio Voznika a23e60af39 Fire clone point for thread creation
Thread creation tracking is required by Falco.

Updates #4805

PiperOrigin-RevId: 447003670
2022-05-06 09:28:44 -07:00
Fabricio Voznika 2d6e64019b Faster proto serialization
The use of protobuf.Any is convenient, but adds to proto serialization
time and number of memory allocations required to send a message.
Instead, we now use an enum to identify the message and use it to
determine how to unmarshal the message on the receiving end. It
also speeds up event consumption by not requiring a map from string
(proto names) to callbacks.

BenchmarkProtoAny-6   115.9 ns/op        210 B/op       4 allocs/op
BenchmarkProtoEnum-6   58.3 ns/op          2 B/op       1 allocs/op
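
The enum-based dispatch described above can be sketched roughly as follows. This is not the real wire format: the two-byte header, the message types, and the handlers are all made up for illustration, and plain byte payloads stand in for serialized protos.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// messageType is a hypothetical enum identifying the payload.
type messageType uint16

const (
	msgUnknown messageType = iota
	msgStart
	msgExit
	msgCount // number of message types; keep last
)

type handler func(payload []byte) string

// handlers is indexed directly by messageType, so dispatch needs no map
// lookup keyed by a proto name string.
var handlers = [msgCount]handler{
	msgStart: func(p []byte) string { return "start:" + string(p) },
	msgExit:  func(p []byte) string { return "exit:" + string(p) },
}

// encode prefixes the payload with its message type.
func encode(t messageType, payload []byte) []byte {
	buf := make([]byte, 2, 2+len(payload))
	binary.LittleEndian.PutUint16(buf, uint16(t))
	return append(buf, payload...)
}

// dispatch reads the type prefix and calls the matching handler.
func dispatch(frame []byte) (string, error) {
	if len(frame) < 2 {
		return "", errors.New("short frame")
	}
	t := messageType(binary.LittleEndian.Uint16(frame))
	if t >= msgCount || handlers[t] == nil {
		return "", fmt.Errorf("unknown message type %d", t)
	}
	return handlers[t](frame[2:]), nil
}

func main() {
	out, err := dispatch(encode(msgExit, []byte("pid=7")))
	fmt.Println(out, err)
}
```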

Updates #4805

PiperOrigin-RevId: 446879057
2022-05-05 19:29:49 -07:00
gVisor bot 86b29f5074 Use syscall() for XattrWithOPath
Avoids Bionic when running on Android.
See https://android-review.googlesource.com/c/platform/bionic/+/152663.

PiperOrigin-RevId: 446795777
2022-05-05 13:09:46 -07:00
gVisor bot 3b11bfe105 Merge pull request #7391 from zhlhahaha:2477
PiperOrigin-RevId: 446793644
2022-05-05 13:07:01 -07:00
Nayana Bidari 0c54ff1ffe Allow sandbox to start without any tasks.
This is the first set of changes to allow multiple containers in a sandbox.
- Changes to allow kernel.Start() without any tasks.
- New control message to StartContainer() in root namespaces.
- Added new function StartSandbox() to keep the existing behavior separate from
the behavior when multi-container support is enabled.
- Test to verify the new control message with one container.

PiperOrigin-RevId: 446792577
2022-05-05 12:59:36 -07:00
Ghanan Gowripalan bb36c43e97 Respect SO_SNDBUF for network datagram endpoints
Previously, SO_SNDBUF was effectively a no-op. The stack should make
sure that only SO_SNDBUF bytes are ever in-flight for any given
socket/endpoint.
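
One way to picture the accounting, as a sketch only: the type, its methods, and the choice to fail rather than block when the budget is exhausted are assumptions, not the netstack implementation. It is also not safe for concurrent use as written.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoBufferSpace = errors.New("no buffer space available")

// sendBudget is a hypothetical per-endpoint counter of in-flight bytes.
type sendBudget struct {
	limit    int // the SO_SNDBUF value
	inFlight int // bytes queued but not yet sent
}

// reserve accounts for n bytes about to be queued, or fails if doing so
// would exceed the limit.
func (b *sendBudget) reserve(n int) error {
	if b.inFlight+n > b.limit {
		return errNoBufferSpace
	}
	b.inFlight += n
	return nil
}

// release returns n bytes to the budget once the packet has left the queue.
func (b *sendBudget) release(n int) {
	b.inFlight -= n
	if b.inFlight < 0 {
		b.inFlight = 0
	}
}

func main() {
	b := &sendBudget{limit: 1500}
	fmt.Println(b.reserve(1000)) // fits within the limit
	fmt.Println(b.reserve(1000)) // would exceed the limit
	b.release(1000)
	fmt.Println(b.reserve(1000)) // fits again after release
}
```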

Fuchsia Bug: https://fxbug.dev/99070

PiperOrigin-RevId: 446792223
2022-05-05 12:52:34 -07:00
Nate Hurley 09b7a17066 Cleanup multicast routing table.
The purpose of this change is twofold:

1. Simplify AddInstalledRoute by returning a PacketBuffer slice instead of a
PendingRoute. This obviates PendingRoute.Dequeue and PendingRoute.IsEmpty.
2. Address PacketBuffer lifetime issues. With this change, the routing table
will call PacketBuffer.Clone() if it decides to enqueue the packet. When
AddInstalledRoute is called, the caller will then assume ownership of the
relevant packets and is expected to call PacketBuffer.DecRef() after
forwarding.

Updates #7338.

PiperOrigin-RevId: 446740297
2022-05-05 09:46:49 -07:00
Jamie Liu b86c98c82b Check file permissions before VFS2 overlayfs open.
PiperOrigin-RevId: 446565359
2022-05-04 15:23:31 -07:00
Bhasker Hariharan e89e736f16 Deflake TestCloseRead.
The test needs to wait for CreateEndpoint to return before
cleaning up the stack, otherwise if the netstack is slow to
process the final ACK to the handshake the active side of the
connection can race ahead to the end and start tearing down the
stack.

This results in CreateEndpoint failing with a connection
aborted error.

PiperOrigin-RevId: 446515200
2022-05-04 11:55:10 -07:00