5855 Commits

Author SHA1 Message Date
Ghanan Gowripalan 3b4bb94751 Add loopback interface as an ethernet-based device
...to match Linux behaviour.

We can see evidence of Linux representing loopback as an ethernet-based
device below:
```
# EUI-48 based MAC addresses.
$ ip link show lo
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

# tcpdump showing ethernet frames when sniffing loopback and logging the
# link-type as EN10MB (Ethernet).
$ sudo tcpdump -i lo -e -c 2 -n
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on lo, link-type EN10MB (Ethernet), snapshot length 262144 bytes
03:09:05.002034 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.9557 > 127.0.0.1.36828: Flags [.], ack 3562800815, win 15342, options [nop,nop,TS val 843174495 ecr 843159493], length 0
03:09:05.002094 00:00:00:00:00:00 > 00:00:00:00:00:00, ethertype IPv4 (0x0800), length 66: 127.0.0.1.36828 > 127.0.0.1.9557: Flags [.], ack 1, win 6160, options [nop,nop,TS val 843174496 ecr 843159493], length 0
2 packets captured
116 packets received by filter
0 packets dropped by kernel
```

Wireshark shows a similar result as the tcpdump example above.

Linux's loopback setup: 5bfc75d92e/drivers/net/loopback.c (L162)

PiperOrigin-RevId: 391836719
2021-08-19 13:54:53 -07:00
Zeling Feng 50ed6b2e09 Use a hash function to generate tcp timestamp offset
Also fix an option parsing error in checker.TCPTimestampChecker while I am here.

PiperOrigin-RevId: 391828329
2021-08-19 13:15:40 -07:00
Zeling Feng a4ae5fed32 Split TCP secrets from Stack to tcp.protocol
Use different secrets for different purposes (port picking,
ISN generation, tsOffset generation) and moved the secrets
from stack.Stack to tcp.protocol.
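
As a rough illustration of the idea, the sketch below keeps one independent random secret per purpose and derives a 32-bit value from a keyed hash over a connection's address pair. The names `protocolSecrets` and `keyedHash32`, the 16-byte key size, and the choice of HMAC-SHA256 are all assumptions made for this example; the commit message does not say which hash function or layout is actually used.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// protocolSecrets is a hypothetical holder with one independent key per
// purpose, so that observing values derived from one key says nothing
// about values derived from another.
type protocolSecrets struct {
	portPick [16]byte
	isn      [16]byte
	tsOffset [16]byte
}

// newProtocolSecrets fills every key from the system CSPRNG.
func newProtocolSecrets() (*protocolSecrets, error) {
	s := &protocolSecrets{}
	for _, key := range [][]byte{s.portPick[:], s.isn[:], s.tsOffset[:]} {
		if _, err := rand.Read(key); err != nil {
			return nil, err
		}
	}
	return s, nil
}

// keyedHash32 returns the first 32 bits of HMAC-SHA256(key, src || 0 || dst).
// HMAC-SHA256 is an assumption of this sketch, not the hash named in the log.
func keyedHash32(key []byte, src, dst string) uint32 {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(src))
	mac.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") differ
	mac.Write([]byte(dst))
	return binary.BigEndian.Uint32(mac.Sum(nil))
}

func main() {
	s, err := newProtocolSecrets()
	if err != nil {
		panic(err)
	}
	src, dst := "10.0.0.1:80", "10.0.0.2:4242"
	fmt.Printf("tsOffset: %#x\n", keyedHash32(s.tsOffset[:], src, dst))
	fmt.Printf("isn:      %#x\n", keyedHash32(s.isn[:], src, dst))
}
```

The same address pair always maps to the same offset under a given secret, while a different secret yields an unrelated value.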

PiperOrigin-RevId: 391641238
2021-08-18 17:00:13 -07:00
Chong Cai 75b5a4f455 Add control configs
Also plumb the controls through runsc.

PiperOrigin-RevId: 391594318
2021-08-18 13:13:49 -07:00
Michael Pratt e0bf522502 Declare default outputs from nogo_stdlib
nogo_stdlib propagates facts and findings to downstream nogo_aspects via
NogoStdlibInfo. This all works fine except one case: directly building a
nogo_stdlib. e.g., bazel build //tools/nogo:stdlib.

In this case, nothing is requesting the NogoStdlibInfo, and thus the target has
nothing to do. This can be rather confusing when trying to debug failures in
:stdlib, as building :stdlib does nothing.

Fix this by declaring the facts and findings as default outputs from
nogo_stdlib. Now direct bazel build will request these outputs and actually
trigger the analysis. Standard aspect builds are unaffected.

PiperOrigin-RevId: 391580126
2021-08-18 12:10:09 -07:00
Ayush Ranjan 216b740663 [op] Deflake SNMP Metric proc_net tests.
Earlier the tests were checking for equality of system-wide metrics before and
after some network related operations. That is inherently racy for native tests
because depending on the testing infrastructure, multiple tests might run
in parallel and trample over each other's metrics.

Tests should only compare metrics that are increasing in nature. The comparison
should not be a hard comparison, instead a less-than/greater-than relation test.

I have changed the checks and also removed tests for tcpCurrEstab metric which
has "SYNTAX  Gauge" and hence cannot be tested reliably.
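
A minimal sketch of the relational check described above, assuming the metric is a monotonically increasing counter read before and after the operation. The helper name `counterAdvanced` is hypothetical and does not come from the test suite.

```go
package main

import "fmt"

// counterAdvanced reports whether a monotonically increasing counter grew by
// at least minDelta between two snapshots. Concurrent tests may add to the
// counter as well, so the check is a lower bound rather than an equality.
func counterAdvanced(before, after, minDelta uint64) bool {
	return after >= before && after-before >= minDelta
}

func main() {
	// This test sent 2 segments; another test sent 3 more in the meantime.
	before, after := uint64(100), uint64(105)
	fmt.Println("equality check passes:", after == before+2)
	fmt.Println("relational check passes:", counterAdvanced(before, after, 2))
}
```

A gauge such as tcpCurrEstab can move in both directions, so no lower bound of this kind holds for it.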

PiperOrigin-RevId: 391460081
2021-08-17 23:37:29 -07:00
gVisor bot b495ae599a Merge pull request #6262 from sudo-sturbia:msgqueue/syscalls3
PiperOrigin-RevId: 391416650
2021-08-17 17:44:26 -07:00
Andrei Vagin 8f6c54c8c0 Deflake test/perf:randread_benchmark
The test expects pread to read the full buffer, so the pread offset
has to be less than or equal to file_size - buffer_size.

PiperOrigin-RevId: 391356863
2021-08-17 12:58:37 -07:00
Zyad A. Ali 2f1c65e7fa Implement stub for msgctl(2).
Add support for msgctl and enable tests.

Fixes #135
2021-08-17 20:34:51 +02:00
Zyad A. Ali 265deee8cb Implement control operations on msgqueue.
For IPCInfo, update value of MSGSEG constant in abi to avoid overflow in
MsgInfo.MsgSeg. MSGSEG was originally simplified in abi, and is unused
(by us and within the kernel), so updating it is okay.

Updates #135
2021-08-17 20:31:38 +02:00
Zyad A. Ali 2cf61eab4a Implement ipc.Object.Set and use it in ipc mechanisms.
Set provides functionality of {sem,shm,msg}ctl(IPC_SET).
2021-08-17 20:31:38 +02:00
Zyad A. Ali 122fd928f9 Add tests for msgctl(2).
Updates #135
2021-08-17 20:31:32 +02:00
gVisor bot ebf76b30cb Internal change.
PiperOrigin-RevId: 391331401
2021-08-17 11:05:59 -07:00
gVisor bot fa32136ac0 Internal change.
PiperOrigin-RevId: 391217339
2021-08-16 23:29:11 -07:00
Andrei Vagin 6294a7a6ec test/syscalls/proc_net: /proc/net/snmp can contain system-wide statistics
This is a new kernel feature that is controlled by the net.core.mibs_allocation
sysctl.

PiperOrigin-RevId: 391215784
2021-08-16 23:14:40 -07:00
Andrei Vagin bb13d015a4 images/syzkaller: add --allow-releaseinfo-change to apt update
Otherwise, it fails with this error:
Get:3 http://security.debian.org/debian-security buster/updates InRelease
Reading package lists...
E: Repository 'http://deb.debian.org/debian buster InRelease' changed its
'Suite' value from 'stable' to 'oldstable'
PiperOrigin-RevId: 391155532
2021-08-16 15:53:28 -07:00
Zach Koopmans ce58d71fd5 [syserror] Remove pkg syserror.
Removes package syserror and moves still relevant code to either linuxerr
or to syserr (to be later removed).

Internal errors are converted from random types to *errors.Error types used
in linuxerr. Internal errors are in linuxerr/internal.go.

PiperOrigin-RevId: 390724202
2021-08-13 17:16:52 -07:00
Zach Koopmans 868ed0e807 [benchmarks] Update BenchmarkStartEmpty benchmark.
Update the start benchmark on empty to only "Start" a container, not wait
for its completion.

TL;DR: only measure the actual start call for the empty container.

Previously, we were measuring the completion of /bin/true in container
alpine AND the cleanup. This was fine until profiling started failing all
the time on ptrace. This is a cost that runc is not paying.

These changes will reduce the overall timing of the benchmark, but they will
give more sane results.

Instead, use "Spawn" which is similar to `docker run --detach alpine
/bin/sleep 100`. Call sleep so containers stick around long enough
for the profiler to read profile data from them.

PiperOrigin-RevId: 390705431
2021-08-13 15:29:11 -07:00
Chong Cai 6eb8596f72 Add Event controls
Add Event controls and implement "stream" commands.

PiperOrigin-RevId: 390691702
2021-08-13 14:20:12 -07:00
Michael Pratt a7b59445db Fix minor typo
PiperOrigin-RevId: 390659097
2021-08-13 11:46:59 -07:00
Michael Pratt ed89602161 Disable SA1019 (deprecation check)
On Go tip (pre-1.18), http://golang.org/issue/44195 is making SA1019 mistake
uses of reflect.Value.Len for reflect.Value.InterfaceData, which is deprecated.
It is thus mistakenly raising deprecation errors on uses of reflect.Value.Len.

Suppress these errors by disabling SA1019 entirely. This is a bit overkill, but
it is unclear to me if we want hard errors on deprecation anyways. That can be
reevaluated when http://golang.org/issue/44195 is fixed.

The other staticcheck analyzers are moved to alphabetical order.

Updates golang/go#44195

PiperOrigin-RevId: 390655918
2021-08-13 11:31:55 -07:00
Michael Pratt 8f2b11a87e Update `core` allowed dependencies
This list has gotten a little out-of-date. Note that `clockwork` used to be
used but was removed in gvisor.dev/pr/5384.

PiperOrigin-RevId: 390644841
2021-08-13 10:43:50 -07:00
Ghanan Gowripalan eb0f24c6c4 Free multicastMemberships on UDP endpoint close
tcpip.Endpoint.Close is documented to free all resources associated
with an endpoint so we don't need to create an empty map to clear
the multicast memberships.

PiperOrigin-RevId: 390609826
2021-08-13 07:42:22 -07:00
Chong Cai ddcf884e9d Add Usage controls
Add Usage controls and implement "usage/usagefd" commands.

PiperOrigin-RevId: 390507423
2021-08-12 18:32:01 -07:00
Zach Koopmans 02370bbd31 [syserror] Convert remaining syserror definitions to linuxerr.
Convert remaining public errors (e.g. EINTR) from syserror to linuxerr.

PiperOrigin-RevId: 390471763
2021-08-12 15:19:12 -07:00
Chong Cai 5f132ae1f8 Clear Merkle files before measuring verity fs
PiperOrigin-RevId: 390467957
2021-08-12 15:02:32 -07:00
Kevin Krakauer 345eb4a666 fix typo
PiperOrigin-RevId: 390463819
2021-08-12 14:43:45 -07:00
gVisor bot 968792961d Automated rollback of changelist 390346783
PiperOrigin-RevId: 390405182
2021-08-12 10:35:15 -07:00
Andrei Vagin f06b1fe862 test/pipe: use futex() for sync with the signal handler
PiperOrigin-RevId: 390399815
2021-08-12 10:14:32 -07:00
gVisor bot 403c4b1a05 Internal change.
PiperOrigin-RevId: 390346783
2021-08-12 05:25:44 -07:00
gVisor bot 3416a3db77 Internal change.
PiperOrigin-RevId: 390318725
2021-08-12 01:40:34 -07:00
Nayana Bidari 96459f5598 Add support for TCP send buffer auto tuning.
Send buffer size in TCP indicates the number of bytes available for the sender
to transmit. This change will allow TCP to update the send buffer size when:
- TCP enters established state.
- ACK is received.

The auto tuning is disabled when the send buffer size is set with the
SO_SNDBUF option.

PiperOrigin-RevId: 390312274
2021-08-12 00:51:35 -07:00
Chong Cai 01cfe59528 Add verity stat benchmark test
PiperOrigin-RevId: 390284683
2021-08-11 21:17:10 -07:00
Ayush Ranjan 6d0b40b1d1 [op] Make PacketBuffer Clone() do a deeper copy.
Earlier PacketBuffer.Clone() would do a shallow top level copy of the packet
buffer - which involved sharing the *buffer.Buffer between packets. Reading
or writing to the buffer in one packet would impact the other.

This caused modifications in one packet to affect the other's pkt.Views() which
is not desired. Change the clone to do a deeper copy of the underlying buffer
list and buffer pointers. The payload buffers (which are immutable) are still
shared. This change makes the Clone() operation more expensive as we now need to
allocate the entire buffer list.

Added unit test to test integrity of packet data after cloning.
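
The difference between the two clone strategies can be sketched with simplified stand-in types. `view` and `packet` below are hypothetical and much smaller than the real PacketBuffer and buffer.Buffer; they only model a list of per-packet view headers that point at shared, immutable payload bytes.

```go
package main

import "fmt"

// view is a hypothetical per-packet window onto an immutable payload.
type view struct {
	data  []byte // immutable payload, safe to share between clones
	start int    // read offset, mutated when the packet is trimmed
}

// packet is a hypothetical stand-in holding a list of views.
type packet struct {
	views []*view
}

// trimFront drops n bytes from the front of the first view.
func (p *packet) trimFront(n int) { p.views[0].start += n }

// bytes concatenates the visible part of every view.
func (p *packet) bytes() []byte {
	var out []byte
	for _, v := range p.views {
		out = append(out, v.data[v.start:]...)
	}
	return out
}

// shallowClone shares the view list, so offsets changed through one packet
// are visible through the other.
func (p *packet) shallowClone() *packet { return &packet{views: p.views} }

// deeperClone allocates a new list and new view headers but still shares the
// immutable payload bytes.
func (p *packet) deeperClone() *packet {
	c := &packet{views: make([]*view, 0, len(p.views))}
	for _, v := range p.views {
		c.views = append(c.views, &view{data: v.data, start: v.start})
	}
	return c
}

func main() {
	a := &packet{views: []*view{{data: []byte("hello")}}}
	b := a.shallowClone()
	b.trimFront(2)
	fmt.Printf("after shallow clone, original reads %q\n", a.bytes()) // "llo"

	c := &packet{views: []*view{{data: []byte("hello")}}}
	d := c.deeperClone()
	d.trimFront(2)
	fmt.Printf("after deeper clone, original reads %q\n", c.bytes()) // "hello"
}
```

The deeper clone pays for one list allocation plus one header per view, which matches the extra cost the commit message mentions.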

Reported-by: syzbot+7ffff9a82a227b8f2e31@syzkaller.appspotmail.com
Reported-by: syzbot+7d241de0d9072b2b6075@syzkaller.appspotmail.com
Reported-by: syzbot+212bc4d75802fa461521@syzkaller.appspotmail.com
PiperOrigin-RevId: 390277713
2021-08-11 20:18:19 -07:00
Chong Cai 4249ba8506 Do not clear merkle files when creating dentry
The dentry for each file/directory can be created/destroyed multiple
times during sandbox lifetime. We should not clear the Merkle file each
time a dentry is created.

PiperOrigin-RevId: 390277107
2021-08-11 20:11:19 -07:00
Chong Cai 5456fa6477 Populate verity directory children names
We were relying on each child adding its name to the parent's dentry to
populate parent's children list. However, this may not work since the
parent dentry could be destroyed if its reference count drops to zero.
In that case, a new dentry will be created when enabling the parent and
it does not contain the children names info. Therefore we need to
populate the child names list again to avoid missing children in the
directory.

PiperOrigin-RevId: 390270227
2021-08-11 19:09:05 -07:00
Ghanan Gowripalan d51bc877f4 Run packet socket tests on Fuchsia
+ Do not check for CAP_NET_RAW on Fuchsia

  Fuchsia does not support capabilities the same way Linux does. Instead
  emulate the check for CAP_NET_RAW by checking if a packet socket may
  be created.

Bug: https://fxbug.dev/79016, https://fxbug.dev/81592
PiperOrigin-RevId: 390263666
2021-08-11 18:21:40 -07:00
Rahat Mahmood a50596874a Initial cgroupfs support for subcontainers
Allow creation and management of subcontainers through cgroupfs
directory syscalls. Also add a mechanism to specify a default root
container to start new jobs in.

This implements the filesystem support for subcontainers, but doesn't
implement hierarchical resource accounting or task migration.

PiperOrigin-RevId: 390254870
2021-08-11 17:21:37 -07:00
Adam Barth 09b453cec0 Fix FSSupportsMap check
Previously, this check always failed because we did not provide MAP_SHARED
or MAP_PRIVATE.

PiperOrigin-RevId: 390251086
2021-08-11 17:00:22 -07:00
Rahat Mahmood 8d84c5a8ee Wrap test queues in Queue object on creation.
PiperOrigin-RevId: 390245901
2021-08-11 16:35:52 -07:00
Adam Barth 23f8e84816 Fix LinkTest.OldnameDoesNotExist
Previously, this test was the same as OldnameIsEmpty. This CL makes the test check
what happens if the old name does not exist.

PiperOrigin-RevId: 390243070
2021-08-11 16:21:33 -07:00
Ayush Ranjan c2353e4055 [op] Fix //debian:debian.
Co-authored-by: Andrei Vagin <avagin@google.com>
PiperOrigin-RevId: 390232925
2021-08-11 15:28:51 -07:00
Andrei Vagin 14d6cb4436 platform/kvm: fix a race condition in vCPU.unlock()
Right now, it contains the code:

  origState := atomic.LoadUint32(&c.state)
  atomicbitops.AndUint32(&c.state, ^vCPUUser)

The problem here is that vCPU.bounce that is called from another thread can add
vCPUWaiter when origState has been read but vCPUUser isn't cleared yet. In this
case, vCPU.unlock doesn't notify other threads about changes and c.bounce will
be stuck in the futex_wait call.
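
One way to close a window like this is to clear the bit and obtain the previous state in a single atomic step, so that a waiter bit set concurrently cannot be missed. The sketch below does that with a compare-and-swap loop. The commit message does not describe the actual fix, and the bit values assigned to `vCPUUser` and `vCPUWaiter` here are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical bit assignments; the real values are not given in the log.
const (
	vCPUUser   uint32 = 1 << 0
	vCPUWaiter uint32 = 1 << 1
)

// clearBitsReturnOld atomically clears mask in *state and returns the value
// that was replaced. Because the load and the store succeed or fail together,
// any bit another thread set before the clear is visible in the result.
func clearBitsReturnOld(state *uint32, mask uint32) uint32 {
	for {
		old := atomic.LoadUint32(state)
		if atomic.CompareAndSwapUint32(state, old, old&^mask) {
			return old
		}
		// Another thread changed the state in between; retry with the new value.
	}
}

func main() {
	state := vCPUUser | vCPUWaiter
	old := clearBitsReturnOld(&state, vCPUUser)
	if old&vCPUWaiter != 0 {
		fmt.Println("a waiter was registered: wake it up")
	}
	fmt.Printf("old=%#b new=%#b\n", old, atomic.LoadUint32(&state))
}
```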

PiperOrigin-RevId: 389697411
2021-08-09 12:32:31 -07:00
Ghanan Gowripalan 34ec00c5e7 Run raw IP socket syscall tests on Fuchsia
+ Do not check for CAP_NET_RAW on Fuchsia

  Fuchsia does not support capabilities the same way Linux does. Instead
  emulate the check for CAP_NET_RAW by checking if a raw IP socket may
  be created.
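
The probing approach can be sketched in Go as follows. This is an illustration only: the actual tests are not written in Go, and the helper name `canCreateRawSocket` and the choice of ICMP as the probe protocol are assumptions.

```go
package main

import (
	"fmt"
	"syscall"
)

// canCreateRawSocket emulates a capability check by attempting the operation
// itself: it reports true if a raw IPv4 socket can be opened, false if the
// attempt is rejected for lack of permission, and an error for anything else.
func canCreateRawSocket() (bool, error) {
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_RAW, syscall.IPPROTO_ICMP)
	if err == nil {
		_ = syscall.Close(fd)
		return true, nil
	}
	if err == syscall.EPERM || err == syscall.EACCES {
		return false, nil
	}
	return false, err
}

func main() {
	ok, err := canCreateRawSocket()
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Println("raw IP sockets allowed:", ok)
}
```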

PiperOrigin-RevId: 389663218
2021-08-09 10:20:21 -07:00
Zach Koopmans c07dc3828a [SMT] Refactor runsc mitigate
Refactor mitigate to use /sys/devices/system/cpu/smt/control instead
of individual CPU control files.

PiperOrigin-RevId: 389215975
2021-08-06 11:10:54 -07:00
Rahat Mahmood 569f605f43 Correctly handle interruptions in blocking msgqueue syscalls.
Reported-by: syzbot+63bde04529f701c76168@syzkaller.appspotmail.com
Reported-by: syzbot+69866b9a16ec29993e6a@syzkaller.appspotmail.com
PiperOrigin-RevId: 389084629
2021-08-05 20:16:54 -07:00
Rahat Mahmood 15853bdc88 Replace unsafe use of fork() in msgqueue tests.
Msgqueue tests were using fork() to create a separate thread of
execution for passing messages back and forth over a queue. However,
the child process after a fork() may only use async-signal-safe
functions, which at a minimum exclude gtest asserts.

Instead, use threads.

PiperOrigin-RevId: 389073744
2021-08-05 18:47:30 -07:00
Rahat Mahmood a72efae969 Skip mmap test cases if underlying FS doesn't support maps.
For file-based mmap tests, the underlying file system may not support
mmaps depending on the sandbox configuration. This is the case when
caching is disabled for goferfs.

PiperOrigin-RevId: 389052722
2021-08-05 16:39:49 -07:00
Michael Pratt 99325baf5d Bump gVisor build tags to go1.19
Go's dev.typeparams branch already claims to be Go 1.18, so our !go1.18 build
tags break testing gVisor with that branch.

Normally I would not want to bump the build tags this early, but I plan to
extend checklinkname to check the assumptions in these files and remove the
build tags ASAP. So we just go ahead and bump the tags until then to unblock
testing.

PiperOrigin-RevId: 389037239
2021-08-05 15:25:00 -07:00
Kevin Krakauer caf9403f62 Automated rollback of changelist 384508720
PiperOrigin-RevId: 389035388
2021-08-05 15:16:24 -07:00