625 Commits

Author SHA1 Message Date
Ryan Heacock cbc0a92276 Correct todos referencing IPV6_RECVTCLASS
Bug 68320120 was revived because TODOs referenced the IP_RECVTOS bug instead
of the IPV6_RECVTCLASS bug.

PiperOrigin-RevId: 290820178
2020-01-21 14:22:06 -08:00
Kevin Krakauer 9f736ac6a7 More little fixes. 2020-01-21 13:42:43 -08:00
Kevin Krakauer 47bc7550c0 Fixing stuff 2020-01-21 13:37:25 -08:00
Kevin Krakauer 62357a0afb Merge branch 'master' into iptables-write-filter-proto 2020-01-21 13:16:25 -08:00
Dean Deng 2ba6198851 Add syscalls for lgetxattr, fgetxattr, lsetxattr, and fsetxattr.
Note that these simply will use the same logic as getxattr and setxattr, which
is not yet implemented for most filesystems.

PiperOrigin-RevId: 290800960
2020-01-21 12:43:18 -08:00
gVisor bot 5f82f092e7 Merge pull request #1558 from kevinGC:iptables-write-input-drop
PiperOrigin-RevId: 290793754
2020-01-21 12:08:52 -08:00
gVisor bot 7e155a133b Merge pull request #1546 from lubinszARM:pr_syscall_test_proc
PiperOrigin-RevId: 290789087
2020-01-21 11:42:41 -08:00
Andrei Vagin 9073521098 Convert EventMask to uint64
It is used for signalfd where the maximum signal is 64.

PiperOrigin-RevId: 290331008
2020-01-17 13:32:51 -08:00
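To illustrate why the wider type matters, here is a minimal sketch. It assumes each signal n (1-based) maps to bit n-1 of a mask; the helper name signalBit is hypothetical and is not gVisor's API. Signal 64 then needs bit 63, which a narrower integer type (uint32 is used here only for illustration) cannot hold.

```go
package main

import "fmt"

// signalBit returns the mask bit for signal n (1-based) as a uint64.
// Hypothetical helper for illustration; not part of gVisor.
func signalBit(n uint) uint64 {
	return uint64(1) << (n - 1)
}

func main() {
	bit := signalBit(64)
	fmt.Printf("signal 64 as uint64: %#x\n", bit)
	// Truncating the mask to 32 bits drops the bit for signal 64 entirely.
	fmt.Printf("signal 64 as uint32: %#x\n", uint32(bit))
}
```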
gVisor bot c98e1bc23f Merge pull request #1459 from lubinszARM:pr_save_util
PiperOrigin-RevId: 290273702
2020-01-17 09:08:47 -08:00
gVisor bot 989b611f5a Merge pull request #1541 from nybidari:iptables
PiperOrigin-RevId: 290273561
2020-01-17 08:38:25 -08:00
Haibo Xu 82ae857877 Enable build of test/syscall tests on arm64.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I277d6c708bbf5c3edd7c3568941cfd01dc122e17
2020-01-17 07:39:57 +00:00
Dean Deng c50efc8c70 Disable xattr tests.
These can remain disabled until we actually support extended attributes.

The following modifications were also made:
1. Disable save/restore on tests that change file permissions. Restore will not
work properly for these tests, since it will try to open the file with
read-write after it has been read- or write-only.
2. Change user.abc to user.test.

PiperOrigin-RevId: 290123941
2020-01-16 13:11:47 -08:00
Bhasker Hariharan a611fdaee3 Changes TCP packet dispatch to use a pool of goroutines.
All inbound segments for connections in ESTABLISHED state are delivered to the
endpoint's queue but for every segment delivered we also queue the endpoint for
processing to a selected processor. This ensures that when there are a large
number of connections in ESTABLISHED state the inbound packets are all handled
by a small number of goroutines and significantly reduces the amount of work the
Go scheduler has to perform.

We let connections in other states follow the current path where the
endpoint's goroutine directly handles the segments.

Updates #231

PiperOrigin-RevId: 289728325
2020-01-14 14:15:50 -08:00
Tamir Duberstein 50625cee59 Implement {g,s}etsockopt(IP_RECVTOS) for UDP sockets
PiperOrigin-RevId: 289718534
2020-01-14 13:33:23 -08:00
Kevin Krakauer d51eaa59c0 Merge branch 'iptables-write-input-drop' into iptables-write-filter-proto 2020-01-13 16:06:29 -08:00
Tamir Duberstein debd213da6 Allow dual stack sockets to operate on AF_INET
Fixes #1490
Fixes #1495

PiperOrigin-RevId: 289523250
2020-01-13 14:47:22 -08:00
Kevin Krakauer 31e49f4b19 Merge branch 'master' into iptables-write-input-drop 2020-01-13 12:22:15 -08:00
Andrei Vagin f54b9c0ee6 tests: fix errors detected by asan.
PiperOrigin-RevId: 289467083
2020-01-13 10:16:07 -08:00
Nayana Bidari 98327a94cc Add test for iptables TCP rule
Added tests for the TCP protocol with input and output rules, including the sport and dport options.
Increased the timeout in iptables_test, as TCP tests were timing out with the existing value.
2020-01-13 09:11:40 -08:00
Brad Burlage bf6429b944 Don't set RWF_HIPRI on InvalidOffset test.
This test fails on ubuntu 18.04 because preadv2 for some reason returns
EOPNOTSUPP instead of EINVAL. Instead of root-causing the failure, I'm dropping
the flag in the preadv2 call since it isn't under test in this scenario.

PiperOrigin-RevId: 289188358
2020-01-10 16:36:34 -08:00
Nayana Bidari 9aeb053bba Add tests for redirect port
Fix indentation and change function names.
2020-01-10 09:05:25 -08:00
Bin Lu ebd25099bf enable //test/syscalls:proc_test support on Arm64
Problems with different platform architectures have been solved.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-01-10 16:45:48 +08:00
Bhasker Hariharan 356d81146b Deflake a couple of TCP syscall tests when run under gotsan.
PiperOrigin-RevId: 289010316
2020-01-09 17:58:48 -08:00
Nayana Bidari 04abc9cf55 Add test for redirect port
Fix the indentation and print statements.
Moved the NAT redirect tests to new file.
Added negative test to check redirect rule on ports other than
redirected port.
2020-01-09 15:38:28 -08:00
Kevin Krakauer 89d11b4d96 Added a test that we don't pass yet 2020-01-09 13:41:52 -08:00
Nayana Bidari 6cc8e2d814 Add test to check iptables redirect port rule 2020-01-09 10:24:26 -08:00
Kevin Krakauer aeb3a4017b Working on filtering by protocol. 2020-01-08 22:10:35 -08:00
Ian Lewis fbb2c008e2 Return correct length with MSG_TRUNC for unix sockets.
This change calls a new Truncate method on the EndpointReader in RecvMsg for
both netlink and unix sockets.  This allows readers such as sockets to peek at
the length of data without actually reading it to a buffer.

Fixes #993 #1240

PiperOrigin-RevId: 288800167
2020-01-08 17:24:05 -08:00
Ting-Yu Wang b3ae8a62cf Fix slice bounds out of range panic in parsing socket control message.
Panic found by syzkaller.

PiperOrigin-RevId: 288799046
2020-01-08 16:32:34 -08:00
Kevin Krakauer b2a881784c Built dead-simple traversal, but now getting dependency cycle error :'( 2020-01-08 14:48:47 -08:00
Tamir Duberstein d01240d871 Take addresses as const
PiperOrigin-RevId: 288767927
2020-01-08 13:54:19 -08:00
Kevin Krakauer 447f64c561 Added test for unconditional DROP on the filter INPUT chain 2020-01-08 12:48:17 -08:00
Kevin Krakauer 2f02e15e54 Newline 2020-01-08 11:17:15 -08:00
Kevin Krakauer 899309c4eb Revert filter_input change 2020-01-08 11:16:41 -08:00
Kevin Krakauer 1e1921e2ac Minor fixes to comments and logging 2020-01-08 11:15:46 -08:00
Kevin Krakauer 8cc1c35bbd Write simple ACCEPT rules to the filter table.
This gets us closer to passing the iptables tests and opens up iptables
so it can be worked on by multiple people.

A few restrictions are enforced for security (i.e. we don't want to let
users write a bunch of iptables rules and then just not enforce them):

- Only the filter table is writable.
- Only ACCEPT rules with no matching criteria can be added.
2020-01-08 10:08:14 -08:00
Andrei Vagin a53ac7307a fs/splice: don't report a partialResult error if there is no data loss
PiperOrigin-RevId: 288642552
2020-01-07 23:54:14 -08:00
Adin Scannell e77ad57423 Fix partial_bad_buffer write tests.
The write tests are fitted to Linux-specific behavior, but it is not
well-specified. Tweak the tests to allow for both acceptable outcomes.

PiperOrigin-RevId: 288606386
2020-01-07 17:26:42 -08:00
Kevin Krakauer ed60bc326b Fix readme formatting.
PiperOrigin-RevId: 288402480
2020-01-06 16:49:34 -08:00
Bin Lu 03e53745cc Add test/util/save_util_linux.cc:MaybeSave to support arm64
There is no syscall_create_module on Arm64.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2019-12-30 10:09:48 +08:00
gVisor bot 87e4d03fdf Automated rollback of changelist 287029703
PiperOrigin-RevId: 287217899
2019-12-26 13:05:52 -08:00
Ryan Heacock e013c48c78 Enable IP_RECVTOS socket option for datagram sockets
Added the ability to get/set the IP_RECVTOS socket option on UDP endpoints. If
enabled, the TOS from the incoming network header is passed as ancillary data in
the ControlMessages.

Test:
* Added unit test to udp_test.go that tests getting/setting as well as
verifying that we receive expected TOS from incoming packet.
* Added a syscall test
PiperOrigin-RevId: 287029703
2019-12-24 08:49:39 -08:00
Andrei Vagin 29955a4797 futex: wake one waiter if futex_wake is called with a non-positive value
This change is needed to be compatible with the Linux kernel.

There is no glibc wrapper for the futex system call, so it is easy to
make a mistake and call syscall(__NR_futex, FUTEX_WAKE, addr) without
the fourth argument. This works on Linux, because it wakes one waiter
even if val is nonpositive.

PiperOrigin-RevId: 286494396
2019-12-19 17:26:44 -08:00
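The rule described in the commit above comes down to a clamp on the wake count. The sketch below is a hypothetical helper (wakeCount is not gVisor's function name) showing only that rule: a non-positive value still wakes one waiter.

```go
package main

import "fmt"

// wakeCount returns the maximum number of waiters a FUTEX_WAKE call with
// argument val may wake under the behavior described above: a non-positive
// val still wakes one waiter. Hypothetical helper, not gVisor's code.
func wakeCount(val int32) int {
	if val <= 0 {
		return 1
	}
	return int(val)
}

func main() {
	for _, v := range []int32{-1, 0, 1, 5} {
		fmt.Printf("val=%d wakes up to %d waiter(s)\n", v, wakeCount(v))
	}
}
```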
Dean Deng 7419e0e5d7 Parameterize mmap tests.
This test suite has existed for quite a while and has become kind of messy.
Various tests can be joined together by parameterizing.

PiperOrigin-RevId: 286482240
2019-12-19 16:07:04 -08:00
Andrei Vagin 57ce26c0b4 net/tcp: allow to call listen without bind
When listen(2) is called on an unbound socket, the socket is
automatically bound to a random free port with the local address
set to INADDR_ANY.

PiperOrigin-RevId: 286305906
2019-12-18 18:24:17 -08:00
Jay Zhuang 18d6e59b45 Switch to netinet/tcp.h and poll.h for better platform portability.
PiperOrigin-RevId: 286249699
2019-12-18 13:58:38 -08:00
Jay Zhuang 65f53c5833 Put GetSocketPairs() in unnamed namespace
This avoids conflicting definitions of GetSocketPairs() in outer namespace when
multiple such cc files are compiled for one binary.

PiperOrigin-RevId: 286243045
2019-12-18 12:50:04 -08:00
Kevin Krakauer 64d00cc63d Internal change.
PiperOrigin-RevId: 286083614
2019-12-17 16:21:48 -08:00
gVisor bot 3f4d8fefb4 Internal change.
PiperOrigin-RevId: 286003946
2019-12-17 10:10:06 -08:00
gVisor bot 67000b929b Explicitly export files needed by other packages
PiperOrigin-RevId: 285968611
2019-12-17 06:33:08 -08:00
Dean Deng e6f4124afd Implement checks for get/setxattr at the syscall layer.
Add checks for input arguments, file type, permissions, etc. that match
the Linux implementation. A call to get/setxattr that passes all the
checks will still currently return EOPNOTSUPP. Actual support will be
added in following commits.

Only allow user.* extended attributes for the time being.

PiperOrigin-RevId: 285835159
2019-12-16 13:20:07 -08:00
Kevin Krakauer be2754a4b9 Add iptables testing framework.
It would be preferable to test iptables via syscall tests, but there are some
problems with that approach:

* We're limited to loopback-only, as syscall tests involve only a single
  container. Other link interfaces (e.g. fdbased) should be tested.
* We'd have to shell out to call iptables anyways, as the iptables syscall
  interface itself is too large and complex to work with alone.
* Running the Linux/native version of the syscall test will require root, which
  is a pain to configure, is inherently unsafe, and could leave host iptables
  misconfigured.

Using the go_test target allows there to be no new test runner.

PiperOrigin-RevId: 285274275
2019-12-12 14:42:11 -08:00
Andrei Vagin 378d6c1f36 unix: allow to bind unix sockets only to AF_UNIX addresses
Reported-by: syzbot+2c0bcfd87fb4e8b7b009@syzkaller.appspotmail.com
PiperOrigin-RevId: 285228312
2019-12-12 11:08:56 -08:00
Bhasker Hariharan 6fc9f0aefd Add support for TCP_USER_TIMEOUT option.
The implementation follows the linux behavior where specifying
a TCP_USER_TIMEOUT will cause the resend timer to honor the
user specified timeout rather than the default rto based timeout.

Further it alters when connections are timed out due to keepalive
failures. It does not alter the behavior of when keepalives are
sent. This is as per the linux behavior.

PiperOrigin-RevId: 285099795
2019-12-11 17:52:53 -08:00
Dean Deng 1601e78a52 Add syscall tests for getxattr and setxattr.
Support for getxattr and setxattr is in subsequent commits.

PiperOrigin-RevId: 285088817
2019-12-11 16:41:17 -08:00
Dean Deng 769e1cdcbe Re-enable execveat test that was causing files in /bin to be deleted.
Test now no longer deletes files incorrectly, due to a fix in fs utils
used by TempPath (github.com/google/gvisor/pull/1368).

Fixes #1366

PiperOrigin-RevId: 284814605
2019-12-10 11:42:03 -08:00
Dean Deng f47eaffd5c Do not consider symlinks as directories in fs utils.
IsDirectory() is used in RecursivelyDelete(), which should not follow symlinks.
The only other use (syscalls/linux/rename.cc) is not affected by this change.

Updates #1366.

PiperOrigin-RevId: 284803968
2019-12-10 11:09:44 -08:00
Dean Deng aadbf322c6 Disable execveat test that is causing files in /bin to be deleted.
Disable until gvisor.dev/issue/1366 is resolved.

Updates #1366

PiperOrigin-RevId: 284786895
2019-12-10 09:41:07 -08:00
Dean Deng 4a19ebd431 Add hostinet tests for sendmsg and recvmsg with TOS/TCLASS.
PiperOrigin-RevId: 284786069
2019-12-10 09:34:38 -08:00
Ian Gudger 98aafb1334 Add test for SO_BINDTODEVICE state bug.
This was accidentally dropped from the change which fixed the bug.

Updates #1217

PiperOrigin-RevId: 284689362
2019-12-09 20:09:23 -08:00
Ian Gudger 18af75db9d Add UDP SO_REUSEADDR support to the port manager.
Next steps include adding support to the transport demuxer and the UDP endpoint.

PiperOrigin-RevId: 284652151
2019-12-09 15:53:00 -08:00
Jay Zhuang 17867c88f7 Include <netinet/tcp.h> for TCP enums in proc_net tests
These are currently duplicated in ip_socket_test_util, so tests including
both netinet/tcp.h and ip_socket_test_util won't compile.

PiperOrigin-RevId: 284623958
2019-12-09 13:37:32 -08:00
Bhasker Hariharan cb5f9b8f86 Mark test as non flaky.
PiperOrigin-RevId: 284606133
2019-12-09 12:04:51 -08:00
Michael Pratt 498595d543 Add tests for rseq(2)
Add a decent set of syscall tests for rseq(2). These are a bit awkward because
of issues with library integration. libc may register rseq on thread start
(including before main on the initial thread), precluding much testing. Thus we
run tests in a libc-free subprocess.

Support for rseq(2) in gVisor will come in a later commit.

PiperOrigin-RevId: 284595994
2019-12-09 11:22:31 -08:00
Dean Deng b0066217ec Add hostinet tests for UDP sockets.
We need to skip a subset of the tests, because of features that hostinet does
not currently support.

Fixes #1209

PiperOrigin-RevId: 284235911
2019-12-06 12:14:23 -08:00
Ian Gudger 13f0f6069a Implement F_GETOWN_EX and F_SETOWN_EX.
Some versions of glibc will convert F_GETOWN fcntl(2) calls into F_GETOWN_EX in
some cases.

PiperOrigin-RevId: 284089373
2019-12-05 17:28:52 -08:00
Bhasker Hariharan f053c52812 Reduce flakiness under gotsan runs.
TcpPortReuseMultiThread creates lots of connections which result in
a lot of goroutines in the sentry. This can cause gotsan runs to
take really long and timeout. Increasing listen backlog and
reducing number of connections should help the connections complete
faster as well as reduce the number of goroutines that gotsan needs
to track.

PiperOrigin-RevId: 284046018
2019-12-05 13:57:08 -08:00
Zach Koopmans 0a32c02357 Create correct file for /proc/[pid]/task/[tid]/io
PiperOrigin-RevId: 284038840
2019-12-05 13:24:05 -08:00
gVisor bot 05758f34b2 Explicitly export files needed by other packages
PiperOrigin-RevId: 283955946
2019-12-05 05:45:09 -08:00
Dean Deng 6ae64d7935 Allow syscall tests to run with hostinet.
Fixes #1207

PiperOrigin-RevId: 283914438
2019-12-04 23:45:49 -08:00
Dean Deng 80b7ba0c97 Clean up readv_socket test suite.
Get rid of the SocketTest class, which is only extended by ReadvSocketTest.
Also, get rid of TCP sockets (which were unused anyway) from readv_socket.cc.
This is a very old test suite that isn't the right place for TCP loopback
tests.

PiperOrigin-RevId: 283672772
2019-12-03 19:42:20 -08:00
Fabricio Voznika bb641c5403 Point TODO to gvisor.dev
PiperOrigin-RevId: 283657725
2019-12-03 17:33:50 -08:00
Andrei Vagin cf7f27c167 net/udp: return a local route address as the bound-to address
If the socket is bound to ANY and connected to a loopback address,
getsockname() has to return the loopback address. Without this fix,
getsockname() returns ANY.

PiperOrigin-RevId: 283647781
2019-12-03 16:32:13 -08:00
Bhasker Hariharan 27e2c4ddca Fix panic due to early transition to Closed.
The code in rcv.consumeSegment incorrectly transitions to
CLOSED state from LAST-ACK before the final ACK for the FIN.

Further if receiving a segment changes a socket to a closed state
then we should not invoke the sender as the socket is now closed
and sending any segments is incorrect.

PiperOrigin-RevId: 283625300
2019-12-03 14:41:55 -08:00
Andrei Vagin 43643752f0 strace: don't create a slice with a negative value
PiperOrigin-RevId: 283613824
2019-12-03 13:49:38 -08:00
Michael Pratt d7cc2480cb Add RunfilesPath to test_util
A few tests have their own ad-hoc implementations. Add a single common one.

PiperOrigin-RevId: 283601666
2019-12-03 12:47:03 -08:00
Andrei Vagin b41277049c test/syscalls: Don't skip ClockGettime.CputimeId
We skipped it due to the issue in the golang scheduler
which has been fixed in go1.13.

PiperOrigin-RevId: 283432226
2019-12-02 15:37:17 -08:00
Jay Zhuang 1518f7fd38 Fix typo, s/Convertable/Convertible/g
PiperOrigin-RevId: 283345791
2019-12-02 08:33:43 -08:00
Jay Zhuang aa70523da2 Port tests in udp_socket.cc to Fuchsia
Separate out a test in udp_socket.cc that depends on <linux/errqueue.h> so the
rest of the tests can run on Fuchsia.

PiperOrigin-RevId: 283322633
2019-12-02 05:38:30 -08:00
Michael Pratt 58afb4be69 Add floating point exception tests
PiperOrigin-RevId: 282828273
2019-11-27 13:49:12 -08:00
Ian Lewis 20279c305e Allow open(O_TRUNC) and (f)truncate for proc files.
This allows writable proc and device files to be opened with O_CREAT|O_TRUNC.
This is encountered most frequently when interacting with proc or device files
via the command line.
e.g. $ echo 8192 1048576 4194304 > /proc/sys/net/ipv4/tcp_rmem

Also adds a test to test the behavior of open(O_TRUNC), truncate, and ftruncate
on named pipes.

Fixes #1116

PiperOrigin-RevId: 282677425
2019-11-26 18:21:09 -08:00
Andrei Vagin 4e27ba372e tests: include sys/socket.h before linux/if_arp.h
This is how it has to be according to the man page.

PiperOrigin-RevId: 281998068
2019-11-22 10:57:11 -08:00
Adin Scannell c0f89eba6e Import and structure cleanup.
PiperOrigin-RevId: 281795269
2019-11-21 11:41:30 -08:00
Ting-Yu Wang af323eb7c1 Fix return codes for {get,set}sockopt for some nullptr cases.
Updates #1092

PiperOrigin-RevId: 280547239
2019-11-14 17:04:34 -08:00
Kevin Krakauer 339536de5e Check that a file is a regular file with open(O_TRUNC).
It was possible to panic the sentry by opening a cache revalidating folder with
O_TRUNC|O_CREAT.

Avoids breaking php tests.

PiperOrigin-RevId: 280533213
2019-11-14 16:08:34 -08:00
Kevin Krakauer 1e1f5ce082 Allow all runtime tests for a language to be run via a single command.
This was intended behavior per the README, but running tests without the --test
flag caused an error. Users can now omit the --test flag to run every test for a
runtime.

PiperOrigin-RevId: 280522025
2019-11-14 15:06:04 -08:00
Andrei Vagin 1e55eb3800 test/syscalls/proc: check the return code of waitid
PiperOrigin-RevId: 280295208
2019-11-13 15:48:12 -08:00
Jay Zhuang 683e8798ab Extract linux-specific test setup to separate file
PiperOrigin-RevId: 280264564
2019-11-13 13:21:50 -08:00
Ian Gudger 2c6c9af904 Add UDP SO_REUSEADDR/SO_REUSEPORT conversion tests.
Add additional tests for UDP SO_REUSEADDR and SO_REUSEPORT interaction.

If all currently bound sockets as well as the current binding
socket have SO_REUSEADDR, or if all currently bound sockets as
well as the current binding socket have SO_REUSEPORT, binding a currently bound
address is allowed. This seems odd since it means that the
SO_REUSEADDR/SO_REUSEPORT behavior can change with the binding of additional
sockets.

PiperOrigin-RevId: 280116163
2019-11-12 20:39:04 -08:00
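The binding rule stated in the commit above can be written as a small predicate. This is a simplified sketch that models only the two option flags; the sockOpts type and canBind function are hypothetical names, and other checks a real stack performs are left out.

```go
package main

import "fmt"

// sockOpts records which reuse options a UDP socket has set.
type sockOpts struct {
	reuseAddr bool
	reusePort bool
}

// canBind reports whether a new socket may bind an address that the sockets
// in bound already hold: either every socket (including the new one) has
// SO_REUSEADDR, or every socket (including the new one) has SO_REUSEPORT.
func canBind(bound []sockOpts, next sockOpts) bool {
	if len(bound) == 0 {
		return true // nothing is bound yet, so there is no conflict
	}
	allAddr, allPort := next.reuseAddr, next.reusePort
	for _, s := range bound {
		allAddr = allAddr && s.reuseAddr
		allPort = allPort && s.reusePort
	}
	return allAddr || allPort
}

func main() {
	both := sockOpts{reuseAddr: true, reusePort: true}
	addrOnly := sockOpts{reuseAddr: true}
	portOnly := sockOpts{reusePort: true}

	// With only the "both" socket bound, a SO_REUSEPORT-only socket may bind.
	fmt.Println(canBind([]sockOpts{both}, portOnly))
	// Once a SO_REUSEADDR-only socket has also bound, the same bind fails.
	fmt.Println(canBind([]sockOpts{both, addrOnly}, portOnly))
}
```

The second call shows the oddity the commit message mentions: the outcome for the same socket changes once an additional socket has bound.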
Ian Gudger 57a2a5ea33 Add tests for SO_REUSEADDR and SO_REUSEPORT.
* Basic tests for the SO_REUSEADDR and SO_REUSEPORT options.
* SO_REUSEADDR functional tests for TCP and UDP.
* SO_REUSEADDR and SO_REUSEPORT interaction tests for UDP.
* Stubbed support for UDP getsockopt(SO_REUSEADDR).

PiperOrigin-RevId: 280049265
2019-11-12 14:04:14 -08:00
Ian Gudger b82bd24f94 Update ephemeral port reservation tests.
The existing tests which are disabled on gVisor are failing because we default
to SO_REUSEADDR being enabled for TCP sockets. Update the test comments.

Also add new tests for enabled SO_REUSEADDR.

PiperOrigin-RevId: 279862275
2019-11-11 18:35:48 -08:00
Bhasker Hariharan 2b0e4dc6aa Remove obsolete TODO. This is now fixed.
PiperOrigin-RevId: 279835100
2019-11-11 15:51:10 -08:00
gVisor bot 7730716800 Make `connect` on socket returned by `accept` correctly error out with EISCONN
PiperOrigin-RevId: 279814493
2019-11-11 14:15:06 -08:00
Bhasker Hariharan 66ebb6575f Add support for TIME_WAIT timeout.
This change adds explicit support for honoring the 2MSL timeout
for sockets in TIME_WAIT state. It also adds support for the
TCP_LINGER2 option that allows modification of the FIN_WAIT2
state timeout duration for a given socket.

It also adds an option to modify the Stack wide TIME_WAIT timeout
but this is only for testing. On Linux this is fixed at 60s.

Further, we also now correctly process RSTs in CLOSE_WAIT and
close the socket similar to Linux without moving it to error
state.

We also now handle SYN in ESTABLISHED state as per
RFC5961#section-4.1. Earlier we would just drop these SYNs,
which could cause some tests that pass on Linux to fail on
gVisor.

Netstack now honors TIME_WAIT correctly as well as handles the
following cases correctly.

- TCP RSTs in TIME_WAIT are ignored.
- A duplicate TCP FIN during TIME_WAIT extends the TIME_WAIT
  and a dup ACK is sent in response to the FIN as the dup FIN
  indicates potential loss of the original final ACK.
- An out of order segment during TIME_WAIT generates a dup ACK.
- A new SYN w/ a sequence number > the highest sequence number
  in the previous connection closes the TIME_WAIT early and
  opens a new connection.

Further, to make the SYN case work correctly, the ISN (Initial
Sequence Number) generation for Netstack has been updated to
be as per RFC. It's not a pure random number anymore and follows
the recommendation in https://tools.ietf.org/html/rfc6528#page-3.

The current hash used is not a cryptographically secure hash
function. A separate change will update the hash function used
to Siphash similar to what is used in Linux.

PiperOrigin-RevId: 279106406
2019-11-07 09:46:55 -08:00
Bhasker Hariharan 2326224a96 Fix yet another data race.
Fixes #1140

PiperOrigin-RevId: 279020846
2019-11-06 23:52:21 -08:00
Bhasker Hariharan 3552691137 Fix data race in syscall_test_runner.go
Fixes #1140

PiperOrigin-RevId: 279012793
2019-11-06 22:30:06 -08:00
Kevin Krakauer e1b21f3c8c Use PacketBuffers, rather than VectorisedViews, in netstack.
PacketBuffers are analogous to Linux's sk_buff. They hold all information about
a packet, headers, and payload. This is important for:

* iptables to access various headers of packets
* Preventing the clutter of passing different net and link headers along with
  VectorisedViews to packet handling functions.

This change only affects the incoming packet path, and a future change will
change the outgoing path.

Benchmark               Regular         PacketBufferPtr  PacketBufferConcrete
--------------------------------------------------------------------------------
BM_Recvmsg             400.715MB/s      373.676MB/s      396.276MB/s
BM_Sendmsg             361.832MB/s      333.003MB/s      335.571MB/s
BM_Recvfrom            453.336MB/s      393.321MB/s      381.650MB/s
BM_Sendto              378.052MB/s      372.134MB/s      341.342MB/s
BM_SendmsgTCP/0/1k     353.711MB/s      316.216MB/s      322.747MB/s
BM_SendmsgTCP/0/2k     600.681MB/s      588.776MB/s      565.050MB/s
BM_SendmsgTCP/0/4k     995.301MB/s      888.808MB/s      941.888MB/s
BM_SendmsgTCP/0/8k     1.517GB/s        1.274GB/s        1.345GB/s
BM_SendmsgTCP/0/16k    1.872GB/s        1.586GB/s        1.698GB/s
BM_SendmsgTCP/0/32k    1.017GB/s        1.020GB/s        1.133GB/s
BM_SendmsgTCP/0/64k    475.626MB/s      584.587MB/s      627.027MB/s
BM_SendmsgTCP/0/128k   416.371MB/s      503.434MB/s      409.850MB/s
BM_SendmsgTCP/0/256k   323.449MB/s      449.599MB/s      388.852MB/s
BM_SendmsgTCP/0/512k   243.992MB/s      267.676MB/s      314.474MB/s
BM_SendmsgTCP/0/1M     95.138MB/s       95.874MB/s       95.417MB/s
BM_SendmsgTCP/0/2M     96.261MB/s       94.977MB/s       96.005MB/s
BM_SendmsgTCP/0/4M     96.512MB/s       95.978MB/s       95.370MB/s
BM_SendmsgTCP/0/8M     95.603MB/s       95.541MB/s       94.935MB/s
BM_SendmsgTCP/0/16M    94.598MB/s       94.696MB/s       94.521MB/s
BM_SendmsgTCP/0/32M    94.006MB/s       94.671MB/s       94.768MB/s
BM_SendmsgTCP/0/64M    94.133MB/s       94.333MB/s       94.746MB/s
BM_SendmsgTCP/0/128M   93.615MB/s       93.497MB/s       93.573MB/s
BM_SendmsgTCP/0/256M   93.241MB/s       95.100MB/s       93.272MB/s
BM_SendmsgTCP/1/1k     303.644MB/s      316.074MB/s      308.430MB/s
BM_SendmsgTCP/1/2k     537.093MB/s      584.962MB/s      529.020MB/s
BM_SendmsgTCP/1/4k     882.362MB/s      939.087MB/s      892.285MB/s
BM_SendmsgTCP/1/8k     1.272GB/s        1.394GB/s        1.296GB/s
BM_SendmsgTCP/1/16k    1.802GB/s        2.019GB/s        1.830GB/s
BM_SendmsgTCP/1/32k    2.084GB/s        2.173GB/s        2.156GB/s
BM_SendmsgTCP/1/64k    2.515GB/s        2.463GB/s        2.473GB/s
BM_SendmsgTCP/1/128k   2.811GB/s        3.004GB/s        2.946GB/s
BM_SendmsgTCP/1/256k   3.008GB/s        3.159GB/s        3.171GB/s
BM_SendmsgTCP/1/512k   2.980GB/s        3.150GB/s        3.126GB/s
BM_SendmsgTCP/1/1M     2.165GB/s        2.233GB/s        2.163GB/s
BM_SendmsgTCP/1/2M     2.370GB/s        2.219GB/s        2.453GB/s
BM_SendmsgTCP/1/4M     2.005GB/s        2.091GB/s        2.214GB/s
BM_SendmsgTCP/1/8M     2.111GB/s        2.013GB/s        2.109GB/s
BM_SendmsgTCP/1/16M    1.902GB/s        1.868GB/s        1.897GB/s
BM_SendmsgTCP/1/32M    1.655GB/s        1.665GB/s        1.635GB/s
BM_SendmsgTCP/1/64M    1.575GB/s        1.547GB/s        1.575GB/s
BM_SendmsgTCP/1/128M   1.524GB/s        1.584GB/s        1.580GB/s
BM_SendmsgTCP/1/256M   1.579GB/s        1.607GB/s        1.593GB/s

PiperOrigin-RevId: 278940079
2019-11-06 14:25:59 -08:00
Andrei Vagin 57f6dbc4be test/root: check that memory accounting works as expected
PiperOrigin-RevId: 278739427
2019-11-05 17:03:41 -08:00
Andrei Vagin 493334f8b5 kokoro: run KVM syscall tests
We don't know how stable they are, so let's start with warning.

PiperOrigin-RevId: 278484186
2019-11-04 16:00:34 -08:00
Michael Pratt b23b36e701 Add NETLINK_KOBJECT_UEVENT socket support
NETLINK_KOBJECT_UEVENT sockets send udev-style messages for device events.
gVisor doesn't have any device events, so our sockets don't need to do anything
once created.

systemd's device manager needs to be able to create one of these sockets. It
also wants to install a BPF filter on the socket. Since we'll never send any
messages, the filter would never be invoked, thus we just fake it out.

Fixes #1117
Updates #1119

PiperOrigin-RevId: 278405893
2019-11-04 10:07:52 -08:00
Michael Pratt 515fee5b6d Add SO_PASSCRED support to netlink sockets
Since we only support sending messages from the kernel, the peer is always
the kernel, simplifying handling.

There are currently no known users of SO_PASSCRED that would actually receive
messages from gVisor, but adding full support is barely more work than stubbing
out fake support.

Updates #1117
Fixes #1119

PiperOrigin-RevId: 277981465
2019-11-01 12:45:11 -07:00
Nicolas Lacasse 2a709a1b7b Add "manual" tag back to runtime tests.
PiperOrigin-RevId: 277971910
2019-11-01 11:53:47 -07:00
Andrei Vagin af6af2c341 tests: don't use ASSERT_THAT after fork
PiperOrigin-RevId: 277965624
2019-11-01 11:22:21 -07:00
Brad Burlage df125c9869 Add Kokoro config for new runtime tests
PiperOrigin-RevId: 277607217
2019-10-30 16:16:15 -07:00
Andrei Vagin db37483cb6 Store endpoints inside multiPortEndpoint in a sorted order
It is required to guarantee the same order of endpoints after save/restore.

PiperOrigin-RevId: 277598665
2019-10-30 15:33:41 -07:00
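One way to get an order that survives save/restore is to keep the slice sorted on insert, so the result does not depend on insertion order. The sketch below borrows the multiPortEndpoint name from the commit title, but the struct layout, the id sort key, and the add method are assumptions made for this example and do not describe netstack's actual fields.

```go
package main

import (
	"fmt"
	"sort"
)

// endpoint is a stand-in for a transport endpoint; id is a hypothetical
// stable identifier used as the sort key.
type endpoint struct {
	id   uint64
	name string
}

// multiPortEndpoint keeps its endpoints sorted by id at all times.
type multiPortEndpoint struct {
	endpoints []endpoint
}

// add inserts e at its sorted position using binary search.
func (m *multiPortEndpoint) add(e endpoint) {
	i := sort.Search(len(m.endpoints), func(i int) bool {
		return m.endpoints[i].id >= e.id
	})
	m.endpoints = append(m.endpoints, endpoint{})
	copy(m.endpoints[i+1:], m.endpoints[i:])
	m.endpoints[i] = e
}

func main() {
	var m multiPortEndpoint
	for _, e := range []endpoint{{3, "c"}, {1, "a"}, {2, "b"}} {
		m.add(e)
	}
	fmt.Println(m.endpoints)
}
```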
Dean Deng 8bc7b8dba2 Clean up typos in test names.
PiperOrigin-RevId: 277572791
2019-10-30 13:31:12 -07:00
Dean Deng 38330e9377 Update symlink traversal limit when resolving interpreter path.
When execveat is called on an interpreter script, the symlink count for
resolving the script path should be separate from the count for resolving the
corresponding interpreter. An ELOOP error should not occur if we do not hit
the symlink limit along any individual path, even if the total number of
symlinks encountered exceeds the limit.

Closes #574

PiperOrigin-RevId: 277358474
2019-10-29 13:59:28 -07:00
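The per-path counting described above can be sketched with a toy symlink table. Everything here is illustrative: the links map stands in for a filesystem, resolve is a hypothetical helper, and the limit is set to 3 for the demo (Linux's limit is 40). Each call to resolve starts its own counter, so the script path and the interpreter path are limited independently.

```go
package main

import (
	"errors"
	"fmt"
)

const maxSymlinks = 3 // small limit for the demo; Linux's limit is 40

// errLoop stands in for ELOOP.
var errLoop = errors.New("too many levels of symbolic links")

// resolve follows symlinks in links starting at path, using a counter that
// is local to this call. It fails once more than maxSymlinks links would
// have to be followed.
func resolve(links map[string]string, path string) (string, error) {
	for n := 0; ; n++ {
		target, ok := links[path]
		if !ok {
			return path, nil // not a symlink: resolution is done
		}
		if n == maxSymlinks {
			return "", errLoop
		}
		path = target
	}
}

func main() {
	links := map[string]string{
		"/s1": "/s2", "/s2": "/s3", "/s3": "/bin/script",
		"/i1": "/i2", "/i2": "/i3", "/i3": "/bin/sh",
	}
	// Three links on each path: six in total, but neither path alone
	// exceeds the limit, so both resolve.
	script, err := resolve(links, "/s1")
	fmt.Println(script, err)
	interp, err := resolve(links, "/i1")
	fmt.Println(interp, err)
}
```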
Bhasker Hariharan 392c561495 Fix PollWithFullBufferBlocks.
Set the snd/rcv buffer sizes so that the test is deterministic and runs in a
reasonable amount of time. It also ensures that we disable any auto-tuning of
the send/receive buffer which may happen.

PiperOrigin-RevId: 277337232
2019-10-29 12:17:06 -07:00
Dean Deng 29273b0384 Disallow execveat on interpreter scripts with fd opened with O_CLOEXEC.
When an interpreter script is opened with O_CLOEXEC and the resulting fd is
passed into execveat, an ENOENT error should occur (the script would otherwise
be inaccessible to the interpreter). This matches the actual behavior of
Linux's execveat.

PiperOrigin-RevId: 277306680
2019-10-29 10:04:39 -07:00
Fabricio Voznika dbeaf9d4db Deflake TestCheckpointRestore
PiperOrigin-RevId: 277189064
2019-10-28 18:50:04 -07:00
Haibo e0c84f284c test/syscall: Remove duplicated gtest/gtest.h.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I05a7ec69b98b88931ba4a8adb3e8a7b822006001
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/1023 from xiaobo55x:syscall_test d44a8b1f827ed4081997af96cd58ba7449e0a9e1
PiperOrigin-RevId: 276740442
2019-10-25 12:40:36 -07:00
Fabricio Voznika e8ba10c008 Fix early deletion of rootDir
container.startContainers() cannot be called twice in a test
(e.g. TestMultiContainerLoadSandbox) because the cleanup
function deletes the rootDir, together with information from
all other containers that may exist.

PiperOrigin-RevId: 276591806
2019-10-24 16:36:54 -07:00
Dean Deng d9fd536340 Handle AT_SYMLINK_NOFOLLOW flag for execveat.
PiperOrigin-RevId: 276441249
2019-10-24 01:45:25 -07:00
Dean Deng 7ca50236c4 Handle AT_EMPTY_PATH flag in execveat.
PiperOrigin-RevId: 276419967
2019-10-23 22:23:05 -07:00
Kevin Krakauer 072af49059 Add check for proper settings to AF_PACKET tests.
As in packet_socket_raw.cc, we should check that certain proc files are set
correctly.

PiperOrigin-RevId: 276384534
2019-10-23 17:21:12 -07:00
gVisor bot 6d4d9564e3 Merge pull request #641 from tanjianfeng:master
PiperOrigin-RevId: 276380008
2019-10-23 16:55:15 -07:00
Michael Pratt c0065e296f Remove comparison between signed and unsigned int
Some compilers don't like the comparison between int and size_t. Remove it.

The other changes are minor style cleanups.

PiperOrigin-RevId: 276333450
2019-10-23 12:59:48 -07:00
Dean Deng 0b569b7cae Add basic implementation of execveat syscall and associated tests.
Allow file descriptors of directories as well as AT_FDCWD.

PiperOrigin-RevId: 275929668
2019-10-21 14:55:18 -07:00
Kevin Krakauer 12235d533a AF_PACKET support for netstack (aka epsocket).
Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With
runsc, you'll need to pass `--net-raw=true` to enable them.

Binding isn't supported yet.

PiperOrigin-RevId: 275909366
2019-10-21 13:23:18 -07:00
Fabricio Voznika 74044f2cca Add more instructions to test/README.md
PiperOrigin-RevId: 275565958
2019-10-18 16:18:52 -07:00
Michael Pratt 49b596b98d Cleanup host UDS support
This change fixes several issues with the fsgofer host UDS support. Notably, it
adds support for SOCK_SEQPACKET and SOCK_DGRAM sockets [1]. It also fixes
unsafe use of unet.Socket, which could cause a panic if Socket.FD is called
when err != nil, and calls to Socket.FD with nothing to prevent the garbage
collector from destroying and closing the socket.

A set of tests is added to exercise host UDS access. This required extracting
most of the syscall test runner into a library that can be used by custom
tests.

Updates #235
Updates #1003

[1] N.B. SOCK_DGRAM sockets are likely not particularly useful, as a server can
only reply to a client that binds first. We don't allow bind, so these are
unlikely to be used.

PiperOrigin-RevId: 275558502
2019-10-18 15:33:03 -07:00
Andrei Vagin 8ae70f864d test/perf: optimize the getdents test
* Use mknod instead of open&close to create an empty file.
* Limit a number of files to (1<<16) instead of 100K.

In this case, a test set is (1, 8, 64, 512, 4K, 32K, 64K) instead of (1, 8, 64,
512, 4K, 32K, 98K). I think it is easier to compare results for 32K and 64K
than 32K and 98K. And results for 98K don't give us more information than for
64K.
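
As a rough illustration of where the two test sets come from, here is a sketch that assumes the benchmark range grows by a factor of 8 from 1 and always ends at the cap. The multiplier and the helper name fileCounts are assumptions for illustration, not taken from the test itself.

```go
package main

import "fmt"

// fileCounts returns directory sizes that start at 1, grow by a factor of 8
// (an assumed multiplier), and always end with limit itself.
func fileCounts(limit int) []int {
	var counts []int
	for n := 1; n < limit; n *= 8 {
		counts = append(counts, n)
	}
	return append(counts, limit)
}

func main() {
	fmt.Println(fileCounts(1 << 16)) // cap of 64K files
	fmt.Println(fileCounts(100000))  // old cap of 100K files (about 98K)
}
```

With a power-of-two cap, the last two points differ by exactly a factor of two, which is what makes 32K and 64K easy to compare.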

PiperOrigin-RevId: 275552507
2019-10-18 15:01:40 -07:00
Andrei Vagin 4c7f849b25 test: use a bigger buffer to fill a socket
Otherwise we need to do a lot of system calls and cooperative_save tests run
slowly.

PiperOrigin-RevId: 275536957
2019-10-18 13:40:31 -07:00
gVisor bot d22f0534c0 Merge pull request #736 from tanjianfeng:fix-unix
PiperOrigin-RevId: 275114157
2019-10-16 14:41:43 -07:00
Michael Pratt de9a8e0eb7 Remove death from exec test names
These aren't actually death tests in the GUnit sense. i.e., they don't call
EXPECT_EXIT or EXPECT_DEATH.

PiperOrigin-RevId: 275099957
2019-10-16 13:25:11 -07:00
Jianfeng Tan d277bfba27 epsocket: support /proc/net/snmp
Netstack has its own stats, we use this to fill /proc/net/snmp.

Note that some metrics are not recorded in Netstack, which will be shown
as 0 in the proc file.

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Change-Id: Ie0089184507d16f49bc0057b4b0482094417ebe1
2019-10-15 16:38:41 +00:00
Jianfeng Tan e3d4a67739 support /proc/net/snmp
This proc file contains statistics according to [1].

[1] https://tools.ietf.org/html/rfc2013

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Change-Id: I9662132085edd8a7783d356ce4237d7ac0800d94
2019-10-15 16:38:40 +00:00
gVisor bot bfa0bb24dd Internal change.
PiperOrigin-RevId: 274700093
2019-10-14 17:46:52 -07:00
Ian Lewis 470997ca99 Allow for zero byte iovec with MSG_PEEK | MSG_TRUNC in recvmsg.
This allows for peeking at the length of the next message on a netlink socket
without pulling it off the socket's buffer/queue, allowing tools like 'ip' to
work.

This CL also fixes an issue where dump_done_errno was not included in the
NLMSG_DONE messages payload.

Issue #769

PiperOrigin-RevId: 274068637
2019-10-10 16:55:48 -07:00
Adin Scannell f8b1859319 Fix signalfd polling.
The signalfd descriptors otherwise always show as available. This can lead
programs to spin, assuming they are looking to see what signals are pending.

Updates #139

PiperOrigin-RevId: 274017890
2019-10-10 12:51:22 -07:00
Nicolas Lacasse f1061aabaf Add blacklists for remaining runtime tests, and test that they parse correctly.
PiperOrigin-RevId: 273781112
2019-10-09 11:22:53 -07:00
Ian Gudger 7c1587e340 Implement IP_TTL.
Also change the default TTL to 64 to match Linux.

PiperOrigin-RevId: 273430341
2019-10-07 19:29:51 -07:00
Ian Lewis da9e18f24d Add tests for $HOME
Adds two tests. One to make sure that $HOME is set when starting a container
via 'docker run' and one to make sure that $HOME is set for each container in a
multi-container sandbox.

Issue #701

PiperOrigin-RevId: 273395763
2019-10-07 15:55:39 -07:00
Chris Kuiper 4874525161 Implement proper local broadcast behavior
The behavior for sending and receiving local broadcast (255.255.255.255)
traffic is as follows:

Outgoing
--------
* A broadcast packet sent on a socket that is bound to an interface goes out
  that interface
* A broadcast packet sent on an unbound socket follows the route table to
  select the outgoing interface
  + if an explicit route entry exists for 255.255.255.255/32, use that one
  + else use the default route
* Broadcast packets are looped back and delivered following the rules for
  incoming packets (see next). This is the same behavior as for multicast
  packets, except that it cannot be disabled via sockopt.

Incoming
--------
* Sockets wishing to receive broadcast packets must bind to either INADDR_ANY
  (0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives
  broadcast packets.
* Broadcast packets are multiplexed to all sockets matching it. This is the
  same behavior as for multicast packets.
* A socket can bind to 255.255.255.255:<port> and then receive its own
  broadcast packets sent to 255.255.255.255:<port>

In addition, this change implicitly fixes an issue with multicast reception. If
two sockets want to receive a given multicast stream and one is bound to ANY
while the other is bound to the multicast address, only one of them will
receive the traffic.
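
The outgoing interface selection described above can be sketched roughly as follows. The route type, the string-based matching, and the convention that NIC id 0 means "unbound" or "no route" are all assumptions made for illustration; this is not netstack's actual routing code.

```go
package main

import "fmt"

// route is a hypothetical, simplified route table entry.
type route struct {
	dest string // destination in CIDR form
	nic  int    // outgoing NIC id; 0 is reserved for "none"
}

const (
	broadcastRoute = "255.255.255.255/32"
	defaultRoute   = "0.0.0.0/0"
)

// outgoingNIC picks the interface for a local broadcast packet:
// a bound socket uses its own NIC, otherwise an explicit broadcast
// route wins over the default route. It returns 0 if nothing matches.
func outgoingNIC(boundNIC int, table []route) int {
	if boundNIC != 0 {
		return boundNIC
	}
	fallback := 0
	for _, r := range table {
		switch r.dest {
		case broadcastRoute:
			return r.nic
		case defaultRoute:
			if fallback == 0 {
				fallback = r.nic
			}
		}
	}
	return fallback
}

func main() {
	table := []route{{defaultRoute, 1}, {broadcastRoute, 2}}
	fmt.Println(outgoingNIC(0, table))     // unbound, explicit route exists
	fmt.Println(outgoingNIC(0, table[:1])) // unbound, default route only
	fmt.Println(outgoingNIC(3, table))     // bound to NIC 3
}
```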

PiperOrigin-RevId: 272792377
2019-10-03 19:31:35 -07:00
Andrei Vagin db218fdfcf Don't report partialResult errors from sendfile
The input file descriptor is always a regular file, so sendfile can't lose any
data if it is unable to write it to the output file descriptor.

Reported-by: syzbot+22d22330a35fa1c02155@syzkaller.appspotmail.com
PiperOrigin-RevId: 272730357
2019-10-03 13:38:30 -07:00
Michael Pratt 0bf8e90719 Increase itimer test timeout
dd69b49ed1
makes this test take longer.

PiperOrigin-RevId: 272535892
2019-10-02 15:44:20 -07:00
gVisor bot cde7711837 Merge pull request #865 from tanjianfeng:fix-829
PiperOrigin-RevId: 272522508
2019-10-02 14:51:04 -07:00
Michael Pratt 61e40819d9 Sanity test that open(2) on a UDS fails
Spoiler alert: it doesn't.

PiperOrigin-RevId: 272513529
2019-10-02 14:01:49 -07:00
Michael Pratt 0d483985c5 Include AT_SECURE in the aux vector
gVisor does not currently implement the functionality that would result in
AT_SECURE = 1, but Linux includes AT_SECURE = 0 in the normal case, so we
should do the same.
PiperOrigin-RevId: 272311488
2019-10-01 15:43:14 -07:00
Nicolas Lacasse 103a3906b0 Add blacklist support to the runtime test runner.
Tests in the blacklist will be explicitly skipped (with associated log line).

Checks in a blacklist for the nodejs tests.

PiperOrigin-RevId: 272272749
2019-10-01 12:49:12 -07:00
Michael Pratt 277f84ad20 Support new interpreter requirements in test
Refactoring in 0036d1f7eb95bcc52977f15507f00dd07018e7e2 (v4.10) caused Linux to
start unconditionally zeroing the remainder of the last page in the
interpreter. Previously it did not do so if filesz == memsz, and *still* does
not do so when filesz == memsz for loading binaries, only for the interpreter.

This inconsistency is not worth replicating in gVisor, as it is arguably a bug,
but our tests must ensure we create interpreter ELFs compatible with this new
requirement.

PiperOrigin-RevId: 272266401
2019-10-01 12:25:11 -07:00
Michael Pratt dd69b49ed1 Disable cpuClockTicker when app is idle
Kernel.cpuClockTicker increments kernel.cpuClock, which tasks use as a clock to
track their CPU usage. This improves latency in the syscall path by avoiding
expensive monotonic clock calls on every syscall entry/exit.

However, this timer fires every 10ms. Thus, when all tasks are idle (i.e.,
blocked or stopped), this forces a sentry wakeup every 10ms, when we may
otherwise be able to sleep until the next app-relevant event. These wakeups
cause the sentry to utilize approximately 2% CPU when the application is
otherwise idle.

Updates to clock are not strictly necessary when the app is idle, as there are
no readers of cpuClock. This commit reduces idle CPU by disabling the timer
when tasks are completely idle, and computing its effects at the next wakeup.

Rather than disabling the timer as soon as the app goes idle, we wait until the
next tick, which provides a window for short sleeps to sleep and wakeup without
doing the (relatively) expensive work of disabling and enabling the timer.
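
A minimal sketch of "computing its effects at the next wakeup", assuming the clock only needs to be credited with the whole ticks that would have fired while the timer was off. The cpuClock type and catchUp method are hypothetical names for illustration, not the sentry's real implementation.

```go
package main

import (
	"fmt"
	"time"
)

// tickPeriod matches the 10ms period described above.
const tickPeriod = 10 * time.Millisecond

// cpuClock is a hypothetical stand-in for the tick counter.
type cpuClock struct {
	ticks    uint64
	lastTick time.Duration // monotonic time of the last accounted tick
}

// catchUp credits every whole tick that would have fired between lastTick
// and now, as if the timer had never been disabled.
func (c *cpuClock) catchUp(now time.Duration) {
	missed := (now - c.lastTick) / tickPeriod
	if missed <= 0 {
		return
	}
	c.ticks += uint64(missed)
	c.lastTick += missed * tickPeriod
}

func main() {
	c := &cpuClock{}
	c.catchUp(35 * time.Millisecond) // wake up after 35ms of idleness
	fmt.Println(c.ticks, c.lastTick)
}
```

Keeping lastTick aligned to a tick boundary means a partial period is not lost; it is counted once the next full tick elapses.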

PiperOrigin-RevId: 272265822
2019-10-01 12:21:01 -07:00
Fabricio Voznika 0b02c3d5e5 Prevent CAP_NET_RAW from appearing in exec
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out when copying the container's
capabilities.

PiperOrigin-RevId: 272260451
2019-10-01 11:49:49 -07:00
Michael Pratt 53cc72da90 Honor X bit on extra anon pages in PT_LOAD segments
Linux changed this behavior in 16e72e9b30986ee15f17fbb68189ca842c32af58
(v4.11). Previously, extra pages were always mapped RW. Now, those pages will
be executable if the segment specified PF_X. They still must be writeable.
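
The rule can be sketched as a small helper. The function name extraPageProt is made up for illustration; the constants carry the standard ELF and mmap values.

```go
package main

import "fmt"

const (
	pfX = 0x1 // PF_X from the ELF program header flags

	protRead  = 0x1 // PROT_READ
	protWrite = 0x2 // PROT_WRITE
	protExec  = 0x4 // PROT_EXEC
)

// extraPageProt returns the protections for the extra anonymous pages of a
// PT_LOAD segment: always readable and writable, executable only if the
// segment itself requested PF_X.
func extraPageProt(segFlags uint32) uint32 {
	prot := uint32(protRead | protWrite)
	if segFlags&pfX != 0 {
		prot |= protExec
	}
	return prot
}

func main() {
	fmt.Printf("%#x\n", extraPageProt(0x6)) // PF_R|PF_W segment
	fmt.Printf("%#x\n", extraPageProt(0x7)) // PF_R|PF_W|PF_X segment
}
```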

PiperOrigin-RevId: 272256280
2019-10-01 11:30:36 -07:00
Kevin Krakauer c06cca6678 De-flake SetForegroundProcessGroupDifferentSession.
PiperOrigin-RevId: 272059043
2019-09-30 13:59:36 -07:00
Michael Pratt 981fc188f0 Only copy out remaining time on nanosleep success
It looks like the old code attempted to do this, but didn't realize that err !=
nil even in the happy case.

PiperOrigin-RevId: 272005887
2019-09-30 13:07:32 -07:00
Adin Scannell c8bb20865d Automated rollback of changelist 256276198
PiperOrigin-RevId: 271665517
2019-09-27 15:58:51 -07:00
gVisor bot 8539abc0df Merge pull request #864 from tanjianfeng:fix-861
PiperOrigin-RevId: 271649711
2019-09-27 15:18:09 -07:00
gVisor bot abbee5615f Implement SO_BINDTODEVICE sockopt
PiperOrigin-RevId: 271644926
2019-09-27 14:14:04 -07:00
Kevin Krakauer 543492650d Make raw socket tests pass in environments with or without CAP_NET_RAW.
PiperOrigin-RevId: 271442321
2019-09-26 15:09:20 -07:00
Andrei Vagin 2fb34c8d5c test: don't use designated initializers
This change fixes compile errors:
pty.cc:1460:7: error: expected primary-expression before '.' token
...

PiperOrigin-RevId: 271033729
2019-09-24 19:05:12 -07:00
Adin Scannell 502f8f238e Stub out readahead implementation.
Closes #261

PiperOrigin-RevId: 270973347
2019-09-24 13:29:46 -07:00
Nicolas Lacasse d5b3dd7cb4 Run all runtime tests in a single container.
This makes them run much faster. Also cleaned up the log reporting.

PiperOrigin-RevId: 270799808
2019-09-23 17:43:42 -07:00
Nicolas Lacasse f2ea8e6b24 Always set HOME env var with `runsc exec`.
We already do this for `runsc run`, but need to do the same for `runsc exec`.

PiperOrigin-RevId: 270793459
2019-09-23 17:06:02 -07:00
Bhasker Hariharan 9846da5e65 Fix bug in RstCausesPollHUP.
The test is checking the wrong poll_fd for POLLHUP. The only
reason it passed till now was because it was also checking
for POLLIN which was always true on the other fd from the
previous poll!

PiperOrigin-RevId: 270780401
2019-09-23 16:00:50 -07:00
Nicolas Lacasse 112736c579 Add test that runsc exec inherits the same environment as run.
PiperOrigin-RevId: 270764996
2019-09-23 14:47:30 -07:00
Jianfeng Tan 223481e927 fix set hostname
Previously, when we set hostname:

$ strace hostname abc
...
sethostname("abc", 3) = -1 ENAMETOOLONG (File name too long)
...

According to man 2 sethostname:

"The len argument specifies the number of bytes in name. (Thus, name
does not require a terminating null byte.)"

We wrongly use the CopyStringIn() to check terminating zero byte in
the implementation of sethostname syscall.

To fix this, we use CopyInBytes() instead.
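
The difference between the two approaches can be sketched as below. copyTerminated and copyExact are simplified, hypothetical stand-ins that read from a byte slice; they are not the real CopyStringIn() and CopyInBytes() signatures.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errNameTooLong = errors.New("ENAMETOOLONG")
	errFault       = errors.New("EFAULT")
)

// copyTerminated mimics a string copy that insists on finding a NUL byte
// within the first maxLen bytes of user memory.
func copyTerminated(mem []byte, maxLen int) (string, error) {
	for i := 0; i < maxLen && i < len(mem); i++ {
		if mem[i] == 0 {
			return string(mem[:i]), nil
		}
	}
	return "", errNameTooLong
}

// copyExact copies exactly n bytes and does not care about termination.
func copyExact(mem []byte, n int) (string, error) {
	if n > len(mem) {
		return "", errFault
	}
	return string(mem[:n]), nil
}

func main() {
	mem := []byte("abcXYZ") // "abc" followed by unrelated bytes, no NUL
	fmt.Println(copyTerminated(mem, 3))
	fmt.Println(copyExact(mem, 3))
}
```

With a 3-byte name and no terminator inside those 3 bytes, the first helper fails in the same way as the strace output above, while the second returns "abc".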

Fixes: #861

Reported-by: chenglang.hy <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-09-20 17:57:25 +00:00
Jianfeng Tan 329b6653ff Implement /proc/net/tcp6
Fixes: #829

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Signed-off-by: Jielong Zhou <jielong.zjl@antfin.com>
2019-09-20 17:20:08 +00:00
Kevin Krakauer 0a8a75f3da Job control: controlling TTYs and foreground process groups.
Addresses a deadlock with the rolled back change:
b6a5b950d2
Creating a session from an orphaned process group was causing a lock to be
acquired twice by a single goroutine. This behavior is addressed, and a test
(OrphanRegression) has been added to pty.cc.

Implemented the following ioctls:
- TIOCSCTTY - set controlling TTY
- TIOCNOTTY - remove controlling tty, maybe signal some other processes
- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
- TIOCSPGRP - set foreground process group. Also enables tcsetpgrp().

Next steps are to actually turn terminal-generated control characters (e.g. ^C)
into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when
appropriate.

PiperOrigin-RevId: 270088599
2019-09-19 11:36:47 -07:00
Nicolas Lacasse 28f431335b Shard the runtime tests.
Default of 20 shards was arbitrary and will need fine-tuning in later CLs.

PiperOrigin-RevId: 269922871
2019-09-18 17:04:53 -07:00
Adin Scannell c98e7f0d19 Signalfd support
Note that the exact semantics for these signalfds are slightly different from
Linux. These signalfds are bound to the process at creation time. Reads, polls,
etc. are all associated with signals directed at that task. In Linux, all
signalfd operations are associated with current, regardless of where the
signalfd originated.

In practice, this should not be an issue given how signalfds are used. In order
to fix this however, we will need to plumb the context through all the event
APIs. This gets complicated really quickly, because the waiter APIs are all
netstack-specific, and not generally exposed to the context.  Probably not
worthwhile fixing immediately.

PiperOrigin-RevId: 269901749
2019-09-18 15:16:42 -07:00
Nicolas Lacasse 062190d983 Follow-up fixes for image tests.
- Fix ARG syntax in Dockerfiles.
- Fix curl commands in Dockerfiles.
- Fix some paths in proctor binaries.
- Check error from Walk in search helper.

PiperOrigin-RevId: 269641686
2019-09-17 13:29:19 -07:00
Nicolas Lacasse 24b7eb2f86 Refactor and clean up image tests.
* Use multi-stage builds in Dockerfiles.
* Combine all proctor binaries into a single binary.
* Change the TestRunner interface to reduce code duplication.

PiperOrigin-RevId: 269462101
2019-09-16 17:51:22 -07:00
Michael Pratt 56cb004218 Migrate from gflags to absl flags
absl flags are more modern and we can easily depend on them directly.

The repo now successfully builds with --incompatible_load_cc_rules_from_bzl.

PiperOrigin-RevId: 269387081
2019-09-16 11:58:27 -07:00
Fabricio Voznika 010b093258 Bring back to life features lost in recent refactor
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker

PiperOrigin-RevId: 269337274
2019-09-16 08:17:00 -07:00
Andrei Vagin 239a07aabf gvisor: return ENOTDIR from the unlink syscall
ENOTDIR has to be returned when a component used as a directory in
pathname is not, in fact, a directory.

PiperOrigin-RevId: 269037893
2019-09-13 21:44:57 -07:00
Adin Scannell 7c6ab6a219 Implement splice methods for pipes and sockets.
This also allows the tee(2) implementation to be enabled, since dup can now be
properly supported via WriteTo.

Note that this change necessitated some minor restructuring with the
fs.FileOperations splice methods. If the *fs.File is passed through directly,
then only public API methods are accessible, which will deadlock immediately
since the locking is already done by fs.Splice. Instead, we pass through an
abstract io.Reader or io.Writer, which elide locks and use the underlying
fs.FileOperations directly.

PiperOrigin-RevId: 268805207
2019-09-12 17:43:27 -07:00
Adin Scannell 849c57314f Fix minor Kokoro issues.
A recent Kokoro change pointed to go_tests.cfg (in line with the
other configurations), which unfortunately broke the presubmits.

This change also enabled the KVM tests, which were still using a
remote execution strategy.

This fixes both of these issues and allows presubmits to pass.

One additional test was caught with this case, which seems to
have been broken. It's unclear why this was not being caught.

PiperOrigin-RevId: 268166291
2019-09-10 00:38:52 -07:00
Michael Pratt 98f7fbb59f Load C++ rules from @rules_cc
See https://github.com/bazelbuild/bazel/issues/8743. This will be required in
Bazel 1.0.

Protobuf was updated in
bf0c69e130 (diff-96239ee297e0a92ac6ff96a6bc434ef0).

GoogleTest was updated in
6fd262ecf7.

gflags has not yet been updated, so the repo still won't build with
--incompatible_load_cc_rules_from_bzl.

Tested with buildifier -warnings=native-cc -lint=warn **/BUILD.

PiperOrigin-RevId: 267638515
2019-09-06 11:29:00 -07:00
Ian Lewis 0bfffbcb01 Ignore the root container when calculating oom_score_adj for the sandbox.
This is done because the root container for CRI is the infrastructure (pause)
container and always gets a low oom_score_adj. We do this to ensure that only
the oom_score_adj of user containers is used to calculate the sandbox
oom_score_adj.

Implemented in runsc rather than the containerd shim as it's a bit cleaner to
implement here (in the shim it would require overwriting the oomScoreAdj and
re-writing out the config.json again). This processing is Kubernetes(CRI)
specific but we are currently only supporting CRI for multi-container support
anyway.

PiperOrigin-RevId: 267507706
2019-09-05 19:21:25 -07:00
Bhasker Hariharan eb074a61f2 Fix bug in proc_test.
TestNoDuplicates is racy as it tries to read the /proc file system
while the test is running. It's possible that between the time the
directory entries are read and the time each entry is processed, something
changes, and in some cases the entry being processed could have been
deleted. In such cases we should not fail the test but just
ignore the error and move on.

PiperOrigin-RevId: 267483094
2019-09-05 16:40:46 -07:00
Jamie Liu fbdd3ff1da Deflake aio_test.
- Most AIO tests call io_setup(nr_events = 128), for which
128 * sizeof(struct io_event) = 128*32 = 4096 bytes. However, the actual size
of the mapping created by io_setup() is determined by:

(from fs/aio.c:ioctx_alloc())
/*
 * We keep track of the number of available ringbuffer slots, to prevent
 * overflow (reqs_available), and we also use percpu counters for this.
 *
 * So since up to half the slots might be on other cpu's percpu counters
 * and unavailable, double nr_events so userspace sees what they
 * expected: additionally, we move req_batch slots to/from percpu
 * counters at a time, so make sure that isn't 0:
 */
nr_events = max(nr_events, num_possible_cpus() * 4);
nr_events *= 2;

(from fs/aio.c:aio_setup_ring())
/* Compensate for the ring buffer's head/tail overlap entry */
nr_events += 2; /* 1 is required, 2 for good luck */
size = sizeof(struct aio_ring);
size += sizeof(struct io_event) * nr_events;
nr_pages = PFN_UP(size);

When we mremap() only the first page of a multi-page AIO ring buffer
mapping, fs/aio.c:aio_ring_mremap() updates struct kioctx::mmap_base -
but struct kioctx::mmap_size is untouched, so sys_io_destroy() =>
kill_ioctx() vm_unmaps() the mremapped page, plus some number of pages
after it. Just get the actual size of the mapping from /proc/self/maps.

- Delete test case MremapOver; while it is correct that Linux will not
complain if you overwrite the AIO ring buffer with another mapping, it
won't actually work in the sense that AIO events will not be written to
the new mapping, because Linux stores the struct pages of the ring
buffer in struct kioctx::ring_pages and writes to those through kmap()
rather than using userspace addresses.

- Don't munmap() after mremap(MREMAP_FIXED) returns EFAULT; see new
comment in factored-out test case MremapExpansion.
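
To make the sizing from the quoted kernel code concrete, here is a worked sketch of the same arithmetic. It assumes 4096-byte pages and a 32-byte struct aio_ring header (its x86-64 size); ringPages is a made-up helper name, not something from the test.

```go
package main

import "fmt"

const (
	pageSize      = 4096
	ioEventSize   = 32 // sizeof(struct io_event)
	ringHeaderLen = 32 // sizeof(struct aio_ring), assumed x86-64 layout
)

// ringPages mirrors the arithmetic in ioctx_alloc() and aio_setup_ring()
// and returns the number of pages in the AIO ring buffer mapping.
func ringPages(nrEvents, possibleCPUs int) int {
	if floor := possibleCPUs * 4; nrEvents < floor {
		nrEvents = floor
	}
	nrEvents *= 2 // half the slots may sit in percpu counters
	nrEvents += 2 // head/tail overlap entry, plus one for good luck
	size := ringHeaderLen + ioEventSize*nrEvents
	return (size + pageSize - 1) / pageSize // PFN_UP(size)
}

func main() {
	fmt.Println(ringPages(128, 8))  // few CPUs: nr_events stays 128
	fmt.Println(ringPages(128, 64)) // many CPUs: nr_events raised to 256
}
```

For nr_events = 128 on a machine with at most 32 possible CPUs this gives 32 + 258*32 = 8288 bytes, i.e. three pages rather than one, and the result grows with the CPU count. That is why reading the size from /proc/self/maps is more robust than hard-coding it.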

PiperOrigin-RevId: 267482903
2019-09-05 16:36:44 -07:00
Ian Gudger fbbb2f7ed6 Run proc_net tests.
PiperOrigin-RevId: 267280086
2019-09-04 19:08:12 -07:00
Adin Scannell 67a2ab1438 Impose order on test scripts.
The simple test script has gotten out of control. Shard this script into
different pieces and attempt to impose order on overall test structure. This
change helps lay some of the foundations for future improvements.

 * The runsc/test directories are moved into just test/.
 * The runsc/test/testutil package is split into logical pieces.
 * The scripts/ directory contains new top-level targets.
 * Each test is now responsible for building targets it requires.
 * The install functionality is moved into `runsc` itself for simplicity.
 * The existing kokoro run_tests.sh file now just calls all (can be split).

After this change is merged, I will create multiple distinct workflows for
Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today,
which should dramatically reduce the time-to-run for the Kokoro tests, and
provide a better foundation for further improvements to the infrastructure.

PiperOrigin-RevId: 267081397
2019-09-03 22:02:43 -07:00
Bhasker Hariharan 54bf2e8eff Automated rollback of changelist 261387276
PiperOrigin-RevId: 266491264
2019-08-30 18:15:32 -07:00
Jamie Liu f3dabdfc48 Fix async-signal-unsafety in MlockallTest_Future.
PiperOrigin-RevId: 266491246
2019-08-30 18:11:15 -07:00
Fabricio Voznika 502c47f7a7 Return correct buffer size for ioctl(socket, FIONREAD)
Ioctl was returning just the buffer size from epsocket.endpoint
and it was not considering data from epsocket.SocketOperations
that was read from the endpoint, but not yet sent to the caller.

PiperOrigin-RevId: 266485461
2019-08-30 17:19:09 -07:00
Adin Scannell 888e87909e Add C++ toolchain and fix compile issues.
This was accidentally introduced in 31f05d5d4f.

Fixes #788.

PiperOrigin-RevId: 266462843
2019-08-30 15:03:15 -07:00
Rahat Mahmood f74affe203 Handle new representation of abstract UDS paths.
When abstract unix domain socket paths are displayed in
/proc/net/unix, Linux historically emitted null bytes as padding at
the end of the path. Newer versions of Linux (v4.9,
e7947ea770d0de434d38a0f823e660d3fd4bebb5) display these as '@'
characters.

Update proc_net_unix test to handle both versions of the padding.
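
One way to compare paths under either representation is to normalize before comparing. This is only a sketch of the idea with a hypothetical helper name; the actual test may handle it differently.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeAbstractPath rewrites NUL bytes as '@' so that paths printed by
// older kernels (NUL padding) and newer kernels ('@' padding) compare equal.
func normalizeAbstractPath(p string) string {
	return strings.ReplaceAll(p, "\x00", "@")
}

func main() {
	oldStyle := "@sock\x00\x00"
	newStyle := "@sock@@"
	fmt.Println(normalizeAbstractPath(oldStyle) == normalizeAbstractPath(newStyle))
}
```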

PiperOrigin-RevId: 266230200
2019-08-29 14:37:47 -07:00
Rahat Mahmood 863e11ac4d Implement /proc/net/udp.
PiperOrigin-RevId: 266229756
2019-08-29 14:30:41 -07:00
Nicolas Lacasse eb4aa40342 Compile proctor binaries during image creation.
Using "go run ..." in the ENTRYPOINT causes the go compiler to run each time
the container is started. We can just compile the binary once as part of the
image.

PiperOrigin-RevId: 266212462
2019-08-29 14:02:32 -07:00
gVisor bot 31f05d5d4f Internal change.
PiperOrigin-RevId: 266199211
2019-08-29 14:01:47 -07:00
Zach Koopmans f64d9a7d93 Fix pwritev2 flaky test.
Fix an uninitialized memory bug in pwritev2 test.

PiperOrigin-RevId: 265772176
2019-08-27 14:50:03 -07:00
Fabricio Voznika 8fd89fd7a2 Fix sendfile(2) error code
When output file is in append mode, sendfile(2) should fail
with EINVAL and not EBADF.

Closes #721

PiperOrigin-RevId: 265718958
2019-08-27 10:52:46 -07:00
gVisor bot baf4d8aaca Internal change.
PiperOrigin-RevId: 265535438
2019-08-26 14:07:17 -07:00
Zach Koopmans a5d0115943 Second try at flaky futex test.
The flake had the call to futex_unlock_pi() returning EINVAL with
FUTEX_OWNER_DIED set. In this case, userspace has to clean up stale
state. So instead of calling FUTEX_UNLOCK_PI outright, we'll use the
atomic compare_exchange advised in the man page.
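
A sketch of the userspace fast path, assuming the usual PI futex convention that the lock word holds the owner's TID plus flag bits. tryUserspaceUnlock is a hypothetical helper, and the FUTEX_UNLOCK_PI fallback is left out because it needs a real futex syscall.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	futexWaiters   = 0x80000000 // FUTEX_WAITERS
	futexOwnerDied = 0x40000000 // FUTEX_OWNER_DIED
)

// tryUserspaceUnlock attempts to release the lock without entering the
// kernel. It succeeds only if the word holds exactly tid, i.e. no waiter
// or owner-died bits are set. On failure the caller has to deal with the
// extra state (for example by falling back to FUTEX_UNLOCK_PI).
func tryUserspaceUnlock(word *uint32, tid uint32) bool {
	return atomic.CompareAndSwapUint32(word, tid, 0)
}

func main() {
	clean := uint32(42)
	fmt.Println(tryUserspaceUnlock(&clean, 42), clean)

	stale := uint32(42 | futexOwnerDied)
	fmt.Println(tryUserspaceUnlock(&stale, 42), stale)
}
```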

PiperOrigin-RevId: 265163920
2019-08-23 16:54:18 -07:00
Andrei Vagin 0e82f9f3fb test: set shard_count to 5 by default
In cl/264434674 and cl/264498919, we stopped running test cases
in parallel to avoid overloading test hosts. But now the tests require
more time to run, so we need to increase the default number of
shards or the default test timeout. Let's start with increasing
the number of shards and see how it works.

PiperOrigin-RevId: 264917055
2019-08-22 14:16:31 -07:00
Michael Pratt 52e674b44d Remove ASSERT from fork child
The gunit macros are not safe to use in the child.

PiperOrigin-RevId: 264904348
2019-08-22 13:21:04 -07:00
Jianfeng Tan 2c3e2ed2bf unix: return ECONNRESET if peer closed with data not read
For SOCK_STREAM type unix socket, we shall return ECONNRESET if peer is
closed with data not read.

We explicitly set a flag when closing one end, to differentiate from
just shutdown (where zero shall be returned).

Fixes: #735

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-08-22 15:25:38 +00:00
Jianfeng Tan 96f78e2466 unix: return zero if peer is closed
Previously, recvmsg() on a unix stream socket with its peer closed will
never return, with goroutine call trace like this:

  ...
  2  in gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).block
     at pkg/sentry/kernel/task_block.go:124
  3  in gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).BlockWithDeadline
     at pkg/sentry/kernel/task_block.go:69
  4  in gvisor.dev/gvisor/pkg/sentry/socket/unix.(*SocketOperations).RecvMsg
     at pkg/sentry/socket/unix/unix.go:612
  5  in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.recvFrom
     at pkg/sentry/syscalls/linux/sys_socket.go:885
  6  in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.RecvFrom
     at pkg/sentry/syscalls/linux/sys_socket.go:910
  ...

The issue is that ErrClosedForReceive returned by
unix/transport.queue is turned into nil in
unix.(*EndpointReader).ReadToBlocks():

  err.ToError()

As a result, in unix.(*SocketOperations).RecvMsg():

  n == 0 and err == nil

We shall differentiate it from another case - no data to read where
ErrWouldBlock shall be returned; and return 0 immediately.

Fixes: #734

Reported-by: chenglang.hy <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-08-22 15:25:38 +00:00
Chris Kuiper 8d9276ed56 Support binding to multicast and broadcast addresses
This fixes the issue of not being able to bind to either a multicast or
broadcast address as well as to send and receive data from it. The way to solve
this is to treat these addresses similar to the ANY address and register their
transport endpoint ID with the global stack's demuxer rather than the NIC's.
That way there is no need to require an endpoint with that multicast or
broadcast address. The stack's demuxer is in fact the only correct one to use,
because neither broadcast- nor multicast-bound sockets care which NIC a
packet was received on (for multicast a join is still needed to receive packets
on a NIC).

I also took the liberty of refactoring udp_test.go to consolidate a lot of
duplicate code and make it easier to create repetitive tests that test the same
feature for a variety of packet and socket types. For this purpose I created a
"flowType" that represents two things: 1) the type of packet being sent or
received and 2) the type of socket used for the test. E.g., a "multicastV4in6"
flow represents a V4-mapped multicast packet run through a V6-dual socket.

This allows writing significantly simpler tests. A nice example is testTTL().

PiperOrigin-RevId: 264766909
2019-08-21 22:54:25 -07:00
Andrei Vagin 5fd63d1c7f tests: retry connect if it fails with EINTR
test/syscalls/linux/proc_net_tcp.cc:252: Failure
 Value of: connect(client->get(), &addr, addrlen)
 Expected: not -1 (success)
   Actual: -1 (of type int), with errno PosixError(errno=4 Interrupted system call)
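
The retry pattern can be sketched as below. The actual test is C++; this is the same idea in Go, and both retryOnEINTR and interruptedNTimes are hypothetical helpers written for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// retryOnEINTR calls op until it returns anything other than EINTR.
func retryOnEINTR(op func() error) error {
	for {
		if err := op(); !errors.Is(err, syscall.EINTR) {
			return err
		}
	}
}

// interruptedNTimes returns a stand-in for connect(2) that fails with
// EINTR n times and then succeeds.
func interruptedNTimes(n int) func() error {
	return func() error {
		if n > 0 {
			n--
			return syscall.EINTR
		}
		return nil
	}
}

func main() {
	err := retryOnEINTR(interruptedNTimes(2))
	fmt.Println(err)
}
```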

PiperOrigin-RevId: 264743815
2019-08-21 19:07:11 -07:00
Andrei Vagin 7609da6cb9 test: reset a signal handler before closing a signal channel
goroutine 5 [running]:
os/signal.process(0x10e21c0, 0xc00050c280)
        third_party/go/gc/src/os/signal/signal.go:227 +0x164
os/signal.loop()
        third_party/go/gc/src/os/signal/signal_unix.go:23 +0x3e
created by os/signal.init.0
        third_party/go/gc/src/os/signal/signal_unix.go:29 +0x41

PiperOrigin-RevId: 264518530
2019-08-20 19:11:22 -07:00
Nicolas Lacasse 8b7e7a04d6 Don't run runtime tests in parallel.
We need real sharding, and will let Bazel handle the
parallelization. That is coming soon. Until then, remove
this call to t.Parallel() so that we can run the tests without
eating all CPU.

PiperOrigin-RevId: 264498919
2019-08-20 16:59:09 -07:00
Kevin Krakauer 6c3a242143 Add tests for raw AF_PACKET sockets.
PiperOrigin-RevId: 264494359
2019-08-20 16:36:06 -07:00
Zach Koopmans 3d0715b3f8 Fix flaky futex test.
The test is long running (175128 ms or so) which causes timeouts.
The test simply makes sure that private futexes can acquire
locks concurrently. Dropping current threads and increasing the
number of locks each thread tests the same concurrency concerns
but drops execution time to ~1411 ms.

PiperOrigin-RevId: 264476144
2019-08-20 15:06:54 -07:00
Andrei Vagin cf8a689be7 tests: syscall_test_runner should not run tests in parallel
bazel runs a few instances of syscall_test_runner in parallel
and then syscall_test_runner runs test cases in parallel. It might
be a reason why we see that test hosts are overloaded and sandboxes
start slowly. It should be better to control how many tests are
running in parallel from one place, so let's try to disable this
feature in syscall_test_runner.

PiperOrigin-RevId: 264434674
2019-08-20 12:00:20 -07:00
Kevin Krakauer bd826092fe Read iptables via sockopts.
PiperOrigin-RevId: 264180125
2019-08-19 10:05:59 -07:00
Andrei Vagin 3e4102b2ea netstack: disconnect a unix socket only if the address family is AF_UNSPEC
Linux allows calling connect for ANY and the zero port.

PiperOrigin-RevId: 263892534
2019-08-16 19:32:14 -07:00
Kevin Krakauer ef045b914b Add tests for "cooked" AF_PACKET sockets.
PiperOrigin-RevId: 263666789
2019-08-15 16:31:35 -07:00