Commit Graph

1989 Commits

Author SHA1 Message Date
Ghanan Gowripalan e63db5e7bb Discover default routers from Router Advertisements
This change allows the netstack to do NDP's Router Discovery as outlined by
RFC 4861 section 6.3.4.

Note, this change will not break existing uses of netstack as the default
configuration for the stack options is set in such a way that Router Discovery
will not be performed. See `stack.Options` and `stack.NDPConfigurations` for
more details.

This change introduces 2 options required to take advantage of Router Discovery,
both available under NDPConfigurations:
- HandleRAs: Whether or not NDP RAs are processed
- DiscoverDefaultRouters: Whether or not Router Discovery is performed

Another note: for a NIC to process Router Advertisements, it must not be a
router itself. Currently the netstack does not have per-interface routing
configuration; the routing/forwarding configuration is controlled stack-wide.
Therefore, if the stack is configured to enable forwarding/routing, no Router
Advertisements will be processed.
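
The gating described above can be sketched in Go as follows. This is only an illustration: `ndpConfigurations`, `shouldProcessRA`, `rememberRouter` and the value of `maxDiscoveredDefaultRouters` are hypothetical stand-ins, not the real `stack.NDPConfigurations` API, and the zero value of the struct is assumed to model the default in which Router Discovery is not performed.

```go
package main

import "fmt"

// ndpConfigurations is a hypothetical stand-in for the two options named in
// the commit message. Its zero value leaves Router Discovery disabled.
type ndpConfigurations struct {
	HandleRAs              bool
	DiscoverDefaultRouters bool
}

// maxDiscoveredDefaultRouters is an illustrative cap; the value is assumed.
const maxDiscoveredDefaultRouters = 10

// shouldProcessRA reports whether a Router Advertisement would be processed:
// RAs must be enabled and the stack must not be forwarding (acting as a router).
func shouldProcessRA(cfg ndpConfigurations, forwarding bool) bool {
	return cfg.HandleRAs && !forwarding
}

// rememberRouter adds addr to the discovered default routers only when both
// options are enabled, forwarding is off, and the cap has not been reached.
func rememberRouter(cfg ndpConfigurations, forwarding bool, routers []string, addr string) []string {
	if !shouldProcessRA(cfg, forwarding) || !cfg.DiscoverDefaultRouters {
		return routers
	}
	for _, r := range routers {
		if r == addr {
			return routers // already known
		}
	}
	if len(routers) >= maxDiscoveredDefaultRouters {
		return routers // cap reached, ignore further routers
	}
	return append(routers, addr)
}

func main() {
	cfg := ndpConfigurations{HandleRAs: true, DiscoverDefaultRouters: true}
	fmt.Println(rememberRouter(cfg, false, nil, "fe80::1"))
	fmt.Println(rememberRouter(cfg, true, nil, "fe80::1")) // forwarding enabled
}
```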

Tests: Unittest to make sure that Router Discovery and updates to the routing
table only occur if explicitly configured to do so. Unittest to make sure at
most stack.MaxDiscoveredDefaultRouters discovered default routers are remembered.
PiperOrigin-RevId: 278965143
2019-11-06 16:29:58 -08:00
Kevin Krakauer e1b21f3c8c Use PacketBuffers, rather than VectorisedViews, in netstack.
PacketBuffers are analogous to Linux's sk_buff. They hold all information about
a packet, headers, and payload. This is important for:

* iptables to access various headers of packets
* Preventing the clutter of passing different net and link headers along with
  VectorisedViews to packet handling functions.

This change only affects the incoming packet path, and a future change will
change the outgoing path.

Benchmark               Regular         PacketBufferPtr  PacketBufferConcrete
--------------------------------------------------------------------------------
BM_Recvmsg             400.715MB/s      373.676MB/s      396.276MB/s
BM_Sendmsg             361.832MB/s      333.003MB/s      335.571MB/s
BM_Recvfrom            453.336MB/s      393.321MB/s      381.650MB/s
BM_Sendto              378.052MB/s      372.134MB/s      341.342MB/s
BM_SendmsgTCP/0/1k     353.711MB/s      316.216MB/s      322.747MB/s
BM_SendmsgTCP/0/2k     600.681MB/s      588.776MB/s      565.050MB/s
BM_SendmsgTCP/0/4k     995.301MB/s      888.808MB/s      941.888MB/s
BM_SendmsgTCP/0/8k     1.517GB/s        1.274GB/s        1.345GB/s
BM_SendmsgTCP/0/16k    1.872GB/s        1.586GB/s        1.698GB/s
BM_SendmsgTCP/0/32k    1.017GB/s        1.020GB/s        1.133GB/s
BM_SendmsgTCP/0/64k    475.626MB/s      584.587MB/s      627.027MB/s
BM_SendmsgTCP/0/128k   416.371MB/s      503.434MB/s      409.850MB/s
BM_SendmsgTCP/0/256k   323.449MB/s      449.599MB/s      388.852MB/s
BM_SendmsgTCP/0/512k   243.992MB/s      267.676MB/s      314.474MB/s
BM_SendmsgTCP/0/1M     95.138MB/s       95.874MB/s       95.417MB/s
BM_SendmsgTCP/0/2M     96.261MB/s       94.977MB/s       96.005MB/s
BM_SendmsgTCP/0/4M     96.512MB/s       95.978MB/s       95.370MB/s
BM_SendmsgTCP/0/8M     95.603MB/s       95.541MB/s       94.935MB/s
BM_SendmsgTCP/0/16M    94.598MB/s       94.696MB/s       94.521MB/s
BM_SendmsgTCP/0/32M    94.006MB/s       94.671MB/s       94.768MB/s
BM_SendmsgTCP/0/64M    94.133MB/s       94.333MB/s       94.746MB/s
BM_SendmsgTCP/0/128M   93.615MB/s       93.497MB/s       93.573MB/s
BM_SendmsgTCP/0/256M   93.241MB/s       95.100MB/s       93.272MB/s
BM_SendmsgTCP/1/1k     303.644MB/s      316.074MB/s      308.430MB/s
BM_SendmsgTCP/1/2k     537.093MB/s      584.962MB/s      529.020MB/s
BM_SendmsgTCP/1/4k     882.362MB/s      939.087MB/s      892.285MB/s
BM_SendmsgTCP/1/8k     1.272GB/s        1.394GB/s        1.296GB/s
BM_SendmsgTCP/1/16k    1.802GB/s        2.019GB/s        1.830GB/s
BM_SendmsgTCP/1/32k    2.084GB/s        2.173GB/s        2.156GB/s
BM_SendmsgTCP/1/64k    2.515GB/s        2.463GB/s        2.473GB/s
BM_SendmsgTCP/1/128k   2.811GB/s        3.004GB/s        2.946GB/s
BM_SendmsgTCP/1/256k   3.008GB/s        3.159GB/s        3.171GB/s
BM_SendmsgTCP/1/512k   2.980GB/s        3.150GB/s        3.126GB/s
BM_SendmsgTCP/1/1M     2.165GB/s        2.233GB/s        2.163GB/s
BM_SendmsgTCP/1/2M     2.370GB/s        2.219GB/s        2.453GB/s
BM_SendmsgTCP/1/4M     2.005GB/s        2.091GB/s        2.214GB/s
BM_SendmsgTCP/1/8M     2.111GB/s        2.013GB/s        2.109GB/s
BM_SendmsgTCP/1/16M    1.902GB/s        1.868GB/s        1.897GB/s
BM_SendmsgTCP/1/32M    1.655GB/s        1.665GB/s        1.635GB/s
BM_SendmsgTCP/1/64M    1.575GB/s        1.547GB/s        1.575GB/s
BM_SendmsgTCP/1/128M   1.524GB/s        1.584GB/s        1.580GB/s
BM_SendmsgTCP/1/256M   1.579GB/s        1.607GB/s        1.593GB/s

PiperOrigin-RevId: 278940079
2019-11-06 14:25:59 -08:00
Ghanan Gowripalan d0d89ceedd Send a TCP RST in response to a TCP SYN-ACK on a listening endpoint
This change better follows what is outlined in RFC 793 section 3.4 figure 12
where a listening socket should not accept a SYN-ACK segment in response to a
(potentially) old SYN segment.
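
As a rough sketch of the RFC 793 rule referenced above: in the LISTEN state an incoming RST is ignored, and any segment carrying an ACK is answered with a reset whose sequence number is the incoming acknowledgment number. The `segment` type and `replyInListen` function are hypothetical and are not taken from netstack.

```go
package main

import "fmt"

// segment is a hypothetical, minimal view of a TCP segment.
type segment struct {
	SYN, ACK, RST bool
	Seq, Ack      uint32
}

// replyInListen returns the reply a listening endpoint would send for an
// incoming segment, and whether a reply is sent at all. Normal SYN handling
// is out of scope for this sketch.
func replyInListen(in segment) (segment, bool) {
	if in.RST {
		return segment{}, false // an incoming RST is ignored in LISTEN
	}
	if in.ACK {
		// Any ACK is unacceptable in LISTEN (this includes SYN-ACK):
		// reply with <SEQ=SEG.ACK><CTL=RST>.
		return segment{RST: true, Seq: in.Ack}, true
	}
	return segment{}, false
}

func main() {
	synAck := segment{SYN: true, ACK: true, Seq: 300, Ack: 101}
	reply, ok := replyInListen(synAck)
	fmt.Println(ok, reply.RST, reply.Seq)
}
```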

Tests: Test that checks the TCP RST segment sent in response to a TCP SYN-ACK
segment received on a listening TCP endpoint.
PiperOrigin-RevId: 278893114
2019-11-06 10:44:20 -08:00
Ghanan Gowripalan a824b48cea Validate incoming NDP Router Advertisements, as per RFC 4861 section 6.1.2
This change validates incoming NDP Router Advertisements as per RFC 4861 section
6.1.2. It also includes the skeleton to handle Router Advertisements that arrive
on some NIC.

Tests: Unittest to make sure only valid NDP Router Advertisements are received/
not dropped.
PiperOrigin-RevId: 278891972
2019-11-06 10:39:29 -08:00
Andrei Vagin 57f6dbc4be test/root: check that memory accounting works as expected
PiperOrigin-RevId: 278739427
2019-11-05 17:03:41 -08:00
Adin Scannell e904823833 Fix repository build scripts.
This fixes a number of issues with the repository build process:

 * Fix the overall structure of the repository.
 * Fix the debian package description.
 * Fix the broken version number for packages.
 * Update the digest algorithm used for signing the release.

I've validated that installation works from a separate staging bucket.

Updates #852

PiperOrigin-RevId: 278716914
2019-11-05 15:16:04 -08:00
Andrei Vagin 493334f8b5 kokoro: run KVM syscall tests
We don't know how stable they are, so let's start with a warning.

PiperOrigin-RevId: 278484186
2019-11-04 16:00:34 -08:00
Nicolas Lacasse 1e21496e95 Bump rules_go to v0.20.2 and go toolchain to v1.13.4.
PiperOrigin-RevId: 278424814
2019-11-04 11:27:57 -08:00
Kevin Krakauer 4fdd69d681 Check that a file is a regular file with open(O_TRUNC).
It was possible to panic the sentry by opening a cache revalidating folder with
O_TRUNC|O_CREAT.

PiperOrigin-RevId: 278417533
2019-11-04 10:58:29 -08:00
Michael Pratt b23b36e701 Add NETLINK_KOBJECT_UEVENT socket support
NETLINK_KOBJECT_UEVENT sockets send udev-style messages for device events.
gVisor doesn't have any device events, so our sockets don't need to do anything
once created.

systemd's device manager needs to be able to create one of these sockets. It
also wants to install a BPF filter on the socket. Since we'll never send any
messages, the filter would never be invoked, thus we just fake it out.

Fixes #1117
Updates #1119

PiperOrigin-RevId: 278405893
2019-11-04 10:07:52 -08:00
Michael Pratt 3b4f5445d0 Update membarrier bug
Updates #267

PiperOrigin-RevId: 278402684
2019-11-04 09:55:30 -08:00
gVisor bot 802a3b3bd0 Merge pull request #1109 from xiaobo55x:fsgofer
PiperOrigin-RevId: 278032567
2019-11-01 17:37:07 -07:00
Michael Pratt 515fee5b6d Add SO_PASSCRED support to netlink sockets
Since we only support sending messages from the kernel, the peer is always
the kernel, simplifying handling.

There are currently no known users of SO_PASSCRED that would actually receive
messages from gVisor, but adding full support is barely more work than stubbing
out fake support.

Updates #1117
Fixes #1119

PiperOrigin-RevId: 277981465
2019-11-01 12:45:11 -07:00
Nicolas Lacasse 2a709a1b7b Add "manual" tag back to runtime tests.
PiperOrigin-RevId: 277971910
2019-11-01 11:53:47 -07:00
Nicolas Lacasse e70f28664a Allow the watchdog to detect when the sandbox is stuck during setup.
The watchdog currently can find stuck tasks, but has no way to tell if the
sandbox is stuck before the application starts executing.

This CL adds a startup timeout and action to the watchdog. If Start() is not
called before the given timeout (if non-zero), then the watchdog will take the
action.
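
One way to picture the startup timeout is a timer that is armed at construction and disarmed by Start(). The sketch below assumes a hypothetical `startupWatchdog` type; it does not reproduce the real watchdog, and it does not try to resolve the race where Start() is called at the same instant the timer fires.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// startupWatchdog is a hypothetical, simplified stand-in that only models
// the startup timeout described above.
type startupWatchdog struct {
	mu    sync.Mutex
	timer *time.Timer // nil when no startup timeout is armed
}

// newStartupWatchdog arms the timer only for a non-zero timeout, so a zero
// timeout disables the startup check.
func newStartupWatchdog(timeout time.Duration, action func()) *startupWatchdog {
	w := &startupWatchdog{}
	if timeout > 0 {
		w.timer = time.AfterFunc(timeout, action)
	}
	return w
}

// Start disarms the startup timer. It reports whether the timer was still
// pending, that is, whether Start arrived before the timeout fired.
func (w *startupWatchdog) Start() bool {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.timer == nil {
		return false
	}
	stopped := w.timer.Stop()
	w.timer = nil
	return stopped
}

func main() {
	fired := make(chan struct{})
	// Start is never called on this watchdog, so the action runs.
	newStartupWatchdog(10*time.Millisecond, func() { close(fired) })
	<-fired
	fmt.Println("startup timed out, action taken")
}
```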

PiperOrigin-RevId: 277970577
2019-11-01 11:49:31 -07:00
Jamie Liu 5694bd080e Don't log "p9.channel.service: flipcall connection shutdown".
This gets quite spammy, especially in tests.

PiperOrigin-RevId: 277970468
2019-11-01 11:45:02 -07:00
Andrei Vagin af6af2c341 tests: don't use ASSERT_THAT after fork
PiperOrigin-RevId: 277965624
2019-11-01 11:22:21 -07:00
Adin Scannell a99d3479a8 Add context to state.
PiperOrigin-RevId: 277840416
2019-10-31 18:03:24 -07:00
Ian Lewis 36837c4ad3 Add systemd-cgroup flag option.
Adds a systemd-cgroup flag option that prints an error letting the user know
that systemd cgroups are not supported and points them to the relevant issue.

Issue #193

PiperOrigin-RevId: 277837162
2019-10-31 17:39:06 -07:00
Adin Scannell fe2e0764ac Add LICENSE and AUTHORS to the go branch.
Also, construct the README directly so that edits can be made.

PiperOrigin-RevId: 277782095
2019-10-31 12:53:27 -07:00
Andrei Vagin f7dbddaf77 platform/kvm: call sigtimedwait with zero timeout
sigtimedwait is used to check pending signals and
it should not block.

PiperOrigin-RevId: 277777269
2019-10-31 12:29:04 -07:00
Brad Burlage 7dcfcd53e4 Fix overloaded use of $RUNTIME.
Turns out we use $RUNTIME in scripts/common.sh to give a name to the runsc
runtime used by the tests.

PiperOrigin-RevId: 277764383
2019-10-31 11:28:18 -07:00
gVisor bot 0202be1ba5 Merge pull request #1058 from cmingxu:master
PiperOrigin-RevId: 277623766
2019-10-31 11:26:45 -07:00
Kevin Krakauer 3246040447 Deep copy dispatcher views.
When VectorisedViews were passed up the stack from packet_dispatchers, we were
passing a sub-slice of the dispatcher's views fields. The dispatchers then
immediately set those views to nil.

This wasn't caught before because every implementer copied the data in these
views before returning.
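
The aliasing problem can be reproduced with plain Go slices. In this sketch the `dispatcher` type and both methods are hypothetical; they only show why handing out a sub-slice and then clearing the original entries loses the data, while a deep copy does not.

```go
package main

import "fmt"

// dispatcher is a hypothetical stand-in that owns a reusable set of views.
type dispatcher struct {
	views [][]byte
}

// deliverShallow hands out a sub-slice of d.views and then clears the
// entries. The sub-slice shares the backing array, so the caller sees nil.
func (d *dispatcher) deliverShallow() [][]byte {
	out := d.views[:len(d.views)]
	for i := range d.views {
		d.views[i] = nil
	}
	return out
}

// deliverCopied copies both the outer slice and each view's bytes before
// clearing, so the caller's data is independent of the dispatcher.
func (d *dispatcher) deliverCopied() [][]byte {
	out := make([][]byte, len(d.views))
	for i, v := range d.views {
		out[i] = append([]byte(nil), v...)
		d.views[i] = nil
	}
	return out
}

func main() {
	a := &dispatcher{views: [][]byte{{1, 2}, {3}}}
	fmt.Println(a.deliverShallow())
	b := &dispatcher{views: [][]byte{{1, 2}, {3}}}
	fmt.Println(b.deliverCopied())
}
```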

PiperOrigin-RevId: 277615351
2019-10-30 17:12:57 -07:00
Brad Burlage df125c9869 Add Kokoro config for new runtime tests
PiperOrigin-RevId: 277607217
2019-10-30 16:16:15 -07:00
lubinszARM ca933329fa support using KVM_MEM_READONLY for arm64 regions
On Arm platform, "setMemoryRegion" has extra permission checks.
In virt/kvm/arm/mmu.c: kvm_arch_prepare_memory_region()
      ....
      if (writable && !(vma->vm_flags & VM_WRITE)) {
              ret = -EPERM;
              break;
      }
      ....
So, on the Arm platform, the "flags" for kvm_memory_region must be set.
On the x86 platform, the "flags" can always be set to '0'.

Signed-off-by: Bin Lu <bin.lu@arm.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/810 from lubinszARM:pr_setregion 8c99b19cfb0c859c6630a1cfff951db65fcf87ac
PiperOrigin-RevId: 277602603
2019-10-30 15:53:31 -07:00
Fabricio Voznika ca90dad0e2 Fix container locking
Sandbox root dir was not being saved with the Container state,
so it would point to the wrong directory location when attempting
to lock the sandbox. This led to race conditions saving and
loading container state. Fixing it led to multiple deadlocks.

I've moved the saving and locking logic to a separate struct and
moved the lock file inside the RootDir (instead of container
root dir), which allows the lock to be taken inside Destroy,
and removes the need to lock the sandbox.

PiperOrigin-RevId: 277599612
2019-10-30 15:39:04 -07:00
Andrei Vagin db37483cb6 Store endpoints inside multiPortEndpoint in a sorted order
It is required to guarantee the same order of endpoints after save/restore.

PiperOrigin-RevId: 277598665
2019-10-30 15:33:41 -07:00
Dean Deng 8bc7b8dba2 Clean up typos in test names.
PiperOrigin-RevId: 277572791
2019-10-30 13:31:12 -07:00
Haibo Xu 80d0db274e Enable runsc/fsgofer support on arm64.
The newfstatat() syscall is not supported on arm64, so we resort
to using the fstatat() syscall.

Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I9e89d46c5ec9ae07db201c9da5b6dda9bfd2eaf0
2019-10-30 05:21:36 +00:00
Ian Gudger dc21c5ca16 Add Close and Wait methods to stack.
Link endpoints still don't have a unified way to be requested to stop.

Updates #837

PiperOrigin-RevId: 277398952
2019-10-29 17:22:32 -07:00
Ian Gudger a2c51efe36 Add endpoint tracking to the stack.
In the future this will replace DanglingEndpoints. DanglingEndpoints must be
kept for now due to issues with save/restore.

This is arguably a cleaner design and allows the stack to know which transport
endpoints might still be using its link endpoints.

Updates #837

PiperOrigin-RevId: 277386633
2019-10-29 16:14:51 -07:00
Dean Deng d7f5e823e2 Fix grammar in comment.
Missing "for".

PiperOrigin-RevId: 277358513
2019-10-29 14:05:04 -07:00
Dean Deng 38330e9377 Update symlink traversal limit when resolving interpreter path.
When execveat is called on an interpreter script, the symlink count for
resolving the script path should be separate from the count for resolving the
corresponding interpreter. An ELOOP error should not occur if we do not hit
the symlink limit along any individual path, even if the total number of
symlinks encountered exceeds the limit.
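
The per-path counting can be illustrated with a toy resolver. Everything here is hypothetical: symlinks are modeled as a map from path to target, and the limit is passed in as a parameter rather than taken from any real constant.

```go
package main

import (
	"errors"
	"fmt"
)

// errLoop plays the role of ELOOP in this sketch.
var errLoop = errors.New("too many levels of symbolic links")

// resolve follows symlinks starting at path and fails once more than limit
// links have been traversed along this single path.
func resolve(links map[string]string, path string, limit int) (string, error) {
	traversed := 0
	for {
		target, ok := links[path]
		if !ok {
			return path, nil // not a symlink, resolution is done
		}
		traversed++
		if traversed > limit {
			return "", errLoop
		}
		path = target
	}
}

// resolveScriptAndInterpreter resolves the script and then its interpreter,
// starting a fresh symlink count for each of the two paths.
func resolveScriptAndInterpreter(links map[string]string, script, interp string, limit int) error {
	if _, err := resolve(links, script, limit); err != nil {
		return err
	}
	_, err := resolve(links, interp, limit)
	return err
}

func main() {
	links := map[string]string{
		"s1": "s2", "s2": "s3", "s3": "script",
		"i1": "i2", "i2": "i3", "i3": "interp",
	}
	// Six links in total, but at most three along either path.
	fmt.Println(resolveScriptAndInterpreter(links, "s1", "i1", 3))
}
```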

Closes #574

PiperOrigin-RevId: 277358474
2019-10-29 13:59:28 -07:00
Michael Pratt c0b8fd4b6a Update build tags to allow Go 1.14
Currently there are no ABI changes. We should check again closer to release.

PiperOrigin-RevId: 277349744
2019-10-29 13:18:16 -07:00
Dean Deng 2e00771d5a Refactor logic for loadExecutable.
Separate the handling of filenames and *fs.File objects in a more explicit way
for the sake of clarity.

PiperOrigin-RevId: 277344203
2019-10-29 12:51:29 -07:00
Bhasker Hariharan 392c561495 Fix PollWithFullBufferBlocks.
Set the snd/rcv buffer sizes so that the test is deterministic and runs in a
reasonable amount of time. It also ensures that we disable any auto-tuning of
the send/receive buffer which may happen.

PiperOrigin-RevId: 277337232
2019-10-29 12:17:06 -07:00
Ian Gudger 7d80e85835 Allow waiting for Endpoint worker goroutines to finish.
Updates #837

PiperOrigin-RevId: 277325162
2019-10-29 11:32:48 -07:00
gVisor bot 8b04e2dd8b Merge pull request #1087 from xiaobo55x:fstat_Nlink
PiperOrigin-RevId: 277324979
2019-10-29 11:27:57 -07:00
Ghanan Gowripalan 41e2df1bde Support iterating an NDP options buffer.
This change helps support iterating over an NDP options buffer so that
implementations can handle all the NDP options present in an NDP packet.

Note, this change does not yet actually handle these options, it just provides
the tools to do so (in preparation for NDP's Prefix, Parameter, and a complete
implementation of Neighbor Discovery).
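
For illustration, NDP options use a type octet followed by a length octet expressed in units of 8 octets (covering the type and length fields themselves), and a length of zero is invalid. The sketch below walks such a buffer; `ndpOption` and `iterateOptions` are hypothetical names and do not mirror the iterator added by this change.

```go
package main

import (
	"errors"
	"fmt"
)

// ndpOption is a hypothetical, minimal representation of one option.
type ndpOption struct {
	Type uint8
	Body []byte // option bytes after the type and length octets
}

var errMalformed = errors.New("malformed NDP options buffer")

// iterateOptions validates the whole buffer and returns every option in it.
func iterateOptions(buf []byte) ([]ndpOption, error) {
	var opts []ndpOption
	for len(buf) > 0 {
		if len(buf) < 2 {
			return nil, errMalformed // no room for type and length
		}
		n := int(buf[1]) * 8 // length is in units of 8 octets
		if n == 0 || n > len(buf) {
			return nil, errMalformed // zero length or truncated option
		}
		opts = append(opts, ndpOption{Type: buf[0], Body: buf[2:n]})
		buf = buf[n:]
	}
	return opts, nil
}

func main() {
	buf := []byte{
		1, 1, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, // type 1, one 8-octet unit
		5, 1, 0, 0, 0, 0, 0x05, 0xdc, // type 5, one 8-octet unit
	}
	opts, err := iterateOptions(buf)
	fmt.Println(len(opts), err)
}
```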

Tests: Unittests to make sure we can iterate over a valid NDP options buffer
that may contain multiple options. Also tests to check an iterator before
using it to see if the NDP options buffer is malformed.
PiperOrigin-RevId: 277312487
2019-10-29 10:30:21 -07:00
Dean Deng 29273b0384 Disallow execveat on interpreter scripts with fd opened with O_CLOEXEC.
When an interpreter script is opened with O_CLOEXEC and the resulting fd is
passed into execveat, an ENOENT error should occur (the script would otherwise
be inaccessible to the interpreter). This matches the actual behavior of
Linux's execveat.

PiperOrigin-RevId: 277306680
2019-10-29 10:04:39 -07:00
Fabricio Voznika dbeaf9d4db Deflake TestCheckpointRestore
PiperOrigin-RevId: 277189064
2019-10-28 18:50:04 -07:00
Ghanan Gowripalan 0864549ecc Use the user supplied TCP MSS when creating a new active socket
This change supports using a user supplied TCP MSS for new active TCP
connections. Note, the user supplied MSS must be less than or equal to the
maximum possible MSS for a TCP connection's route. If it is greater than the
maximum possible MSS, the maximum possible MSS will be used as the connection's
MSS instead.
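
The clamping rule above amounts to taking the smaller of the two values. A minimal sketch, with a hypothetical `effectiveMSS` helper and example numbers that are not taken from the change:

```go
package main

import "fmt"

// effectiveMSS returns the MSS to use for a new active connection: the
// user-supplied value, unless it exceeds the maximum the route allows.
func effectiveMSS(userMSS, maxMSS uint16) uint16 {
	if userMSS > maxMSS {
		return maxMSS
	}
	return userMSS
}

func main() {
	fmt.Println(effectiveMSS(1200, 1460)) // user value fits, so it is used
	fmt.Println(effectiveMSS(9000, 1460)) // too large, route maximum is used
}
```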

This change does not use this user supplied MSS for connections accepted from
listening sockets - that will come in a later change.

Test: Test that outgoing TCP SYN segments contain a TCP MSS option with the user
supplied MSS if it is not greater than the maximum possible MSS for the route.
PiperOrigin-RevId: 277185125
2019-10-28 18:20:36 -07:00
Michael Pratt 198f1cddb8 Update comment
FDTable.GetFile doesn't exist.

PiperOrigin-RevId: 277089842
2019-10-28 10:20:23 -07:00
Haibo Xu dec831b493 Cast the Stat_t.Nlink to uint64 on arm64.
Since the syscall.Stat_t.Nlink is defined as different types on
amd64 and arm64 (uint64 and uint32 respectively), we need to cast
them to a unified uint64 type in gVisor code.

Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I7542b99b195c708f3fc49b1cbe6adebdd2f6e96b
2019-10-28 05:56:03 +00:00
Dean Deng 1c480abc39 Aggregate arguments for loading executables into a single struct.
This change simplifies the function signatures of functions related to loading
executables, such as LoadTaskImage, Load, loadBinary.

PiperOrigin-RevId: 276821187
2019-10-25 22:44:19 -07:00
Ghanan Gowripalan 5a421058a0 Validate the checksum for incoming ICMPv6 packets
This change validates the ICMPv6 checksum field before further processing an
ICMPv6 packet.
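
A sketch of what such a validation involves: the ICMPv6 checksum is the Internet checksum over an IPv6 pseudo-header (source address, destination address, upper-layer length, next header 58) followed by the ICMPv6 message. When the received checksum field is included in the sum, a correct packet yields zero after complementing. The function names below are hypothetical and unrelated to netstack's own helpers.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const icmpv6NextHeader = 58

// sum16 adds b to acc as big-endian 16-bit words, padding an odd tail byte.
func sum16(b []byte, acc uint32) uint32 {
	for len(b) >= 2 {
		acc += uint32(binary.BigEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		acc += uint32(b[0]) << 8
	}
	return acc
}

// fold reduces acc to 16 bits using end-around carry.
func fold(acc uint32) uint16 {
	for acc>>16 != 0 {
		acc = (acc & 0xffff) + (acc >> 16)
	}
	return uint16(acc)
}

// icmpv6Checksum computes the checksum over the pseudo-header and msg.
func icmpv6Checksum(src, dst [16]byte, msg []byte) uint16 {
	var pseudo [40]byte
	copy(pseudo[0:16], src[:])
	copy(pseudo[16:32], dst[:])
	binary.BigEndian.PutUint32(pseudo[32:36], uint32(len(msg)))
	pseudo[39] = icmpv6NextHeader // preceded by three zero bytes
	return ^fold(sum16(msg, sum16(pseudo[:], 0)))
}

// validICMPv6Checksum reports whether msg, including its checksum field,
// sums to zero after complementing.
func validICMPv6Checksum(src, dst [16]byte, msg []byte) bool {
	if len(msg) < 4 {
		return false // too short to hold type, code, and checksum
	}
	return icmpv6Checksum(src, dst, msg) == 0
}

func main() {
	src := [16]byte{0xfe, 0x80, 15: 1}
	dst := [16]byte{0xfe, 0x80, 15: 2}
	msg := []byte{128, 0, 0, 0, 0, 1, 0, 1} // echo request, checksum zeroed
	binary.BigEndian.PutUint16(msg[2:4], icmpv6Checksum(src, dst, msg))
	fmt.Println(validICMPv6Checksum(src, dst, msg))
}
```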

Tests: Unittests to make sure that only ICMPv6 packets with a valid checksum
are accepted/processed. Existing tests using checker.ICMPv6 now also check the
ICMPv6 checksum field.
PiperOrigin-RevId: 276779148
2019-10-25 16:06:55 -07:00
Ian Gudger 8f029b3f82 Convert DelayOption to the newer/faster SockOpt int type.
DelayOption is set on all new endpoints in gVisor.

PiperOrigin-RevId: 276746791
2019-10-25 13:15:34 -07:00
Haibo e0c84f284c test/syscall: Remove duplicated gtest/gtest.h.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I05a7ec69b98b88931ba4a8adb3e8a7b822006001
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/1023 from xiaobo55x:syscall_test d44a8b1f827ed4081997af96cd58ba7449e0a9e1
PiperOrigin-RevId: 276740442
2019-10-25 12:40:36 -07:00
Andrei Vagin fd598912be platform/ptrace: use tgkill instead of kill
The syscall filters don't allow kill, just tgkill.

PiperOrigin-RevId: 276718421
2019-10-25 11:19:20 -07:00