Commit Graph

1848 Commits

Author SHA1 Message Date
Andrei Vagin 2016cc283c fs/proc: report PID-s from a pid namespace of the proc mount
Right now, we can find more than one process with the 1 PID in /proc.

$ for i in `seq 10`; do
> unshare -fp sleep 1000 &
> done

$ ls /proc
1  1  1  1  12  18  24  29  6            loadavg  net   sys          version
1  1  1  1  16  20  26  32  cpuinfo      meminfo  self  thread-self
1  1  1  1  17  21  28  36  filesystems  mounts   stat  uptime

PiperOrigin-RevId: 272506593
2019-10-02 13:29:42 -07:00
Andrei Vagin 9a875306db
Merge branch 'master' into pr_syscall_linux 2019-10-02 13:00:07 -07:00
Michael Pratt 03ce4dd86c Remove extra --rm
PiperOrigin-RevId: 272324038
2019-10-01 16:45:46 -07:00
Andrei Vagin 29207cef14 runsc: remove todo from the build file
b/135475885 was fixed by cl/271434565.

PiperOrigin-RevId: 272320178
2019-10-01 16:25:34 -07:00
Michael Pratt 0d483985c5 Include AT_SECURE in the aux vector
gVisor does not currently implement the functionality that would result in
AT_SECURE = 1, but Linux includes AT_SECURE = 0 in the normal case, so we
should do the same.
PiperOrigin-RevId: 272311488
2019-10-01 15:43:14 -07:00
Fabricio Voznika 739f53fc17 Add runsc logs to kokoro artifacts
PiperOrigin-RevId: 272286122
2019-10-01 13:51:10 -07:00
Nicolas Lacasse 103a3906b0 Add blacklist support to the runtime test runner.
Tests in the blacklist will be explicitly skipped (with associated log line).

Checks in a blacklist for the nodejs tests.

PiperOrigin-RevId: 272272749
2019-10-01 12:49:12 -07:00
Michael Pratt 277f84ad20 Support new interpreter requirements in test
Refactoring in 0036d1f7eb95bcc52977f15507f00dd07018e7e2 (v4.10) caused Linux to
start unconditionally zeroing the remainder of the last page in the
interpreter. Previously it did not due so if filesz == memsz, and *still* does
not do so when filesz == memsz for loading binaries, only interpreter.

This inconsistency is not worth replicating in gVisor, as it is arguably a bug,
but our tests must ensure we create interpreter ELFs compatible with this new
requirement.

PiperOrigin-RevId: 272266401
2019-10-01 12:25:11 -07:00
Michael Pratt dd69b49ed1 Disable cpuClockTicker when app is idle
Kernel.cpuClockTicker increments kernel.cpuClock, which tasks use as a clock to
track their CPU usage. This improves latency in the syscall path by avoid
expensive monotonic clock calls on every syscall entry/exit.

However, this timer fires every 10ms. Thus, when all tasks are idle (i.e.,
blocked or stopped), this forces a sentry wakeup every 10ms, when we may
otherwise be able to sleep until the next app-relevant event. These wakeups
cause the sentry to utilize approximately 2% CPU when the application is
otherwise idle.

Updates to clock are not strictly necessary when the app is idle, as there are
no readers of cpuClock. This commit reduces idle CPU by disabling the timer
when tasks are completely idle, and computing its effects at the next wakeup.

Rather than disabling the timer as soon as the app goes idle, we wait until the
next tick, which provides a window for short sleeps to sleep and wakeup without
doing the (relatively) expensive work of disabling and enabling the timer.

PiperOrigin-RevId: 272265822
2019-10-01 12:21:01 -07:00
gVisor bot 90e908f419 Merge pull request #917 from KentaTada:fix-clone-flags
PiperOrigin-RevId: 272262368
2019-10-01 12:06:30 -07:00
Fabricio Voznika 0b02c3d5e5 Prevent CAP_NET_RAW from appearing in exec
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out from when copying container's
capabilities.

PiperOrigin-RevId: 272260451
2019-10-01 11:49:49 -07:00
Michael Pratt 53cc72da90 Honor X bit on extra anon pages in PT_LOAD segments
Linux changed this behavior in 16e72e9b30986ee15f17fbb68189ca842c32af58
(v4.11). Previously, extra pages were always mapped RW. Now, those pages will
be executable if the segment specified PF_X. They still must be writeable.

PiperOrigin-RevId: 272256280
2019-10-01 11:30:36 -07:00
Andrei Vagin 7a234f736f splice: try another fallback option only if the previous one isn't supported
Reported-by: syzbot+bb5ed342be51d39b0cbb@syzkaller.appspotmail.com
PiperOrigin-RevId: 272110815
2019-09-30 18:23:42 -07:00
Andrei Vagin 29a1ba54ea splice: compare inode numbers only if both ends are pipes
It isn't allowed to splice data from and into the same pipe.

But right now this check is broken, because we don't check that both ends are
pipes.

PiperOrigin-RevId: 272107022
2019-09-30 17:57:14 -07:00
Adin Scannell 20841b98e1 Update FIXME bug with GitHub issue.
PiperOrigin-RevId: 272101930
2019-09-30 17:24:29 -07:00
Bhasker Hariharan bcbb3ef317 Add a Stringer implementation to PacketDispatchMode
PiperOrigin-RevId: 272083936
2019-09-30 15:52:55 -07:00
Kevin Krakauer c06cca6678 De-flake SetForegroundProcessGroupDifferentSession.
PiperOrigin-RevId: 272059043
2019-09-30 13:59:36 -07:00
Bhasker Hariharan 61f6fbd0ce Fix bugs in PickEphemeralPort for TCP.
Netstack always picks a random start point everytime PickEphemeralPort
is called. While this is required for UDP so that DNS requests go
out through a randomized set of ports it is not required for TCP. Infact
Linux explicitly hashes the (srcip, dstip, dstport) and a one time secret
initialized at start of the application to get a random offset. But to
ensure it doesn't start from the same point on every scan it uses a static
hint that is incremented by 2 in every call to pick ephemeral ports.

The reason for 2 is Linux seems to split the port ranges where active connects
seem to use even ones while odd ones are used by listening sockets.

This CL implements a similar strategy where we use a hash + hint to generate
the offset to start the search for a free Ephemeral port.

This ensures that we cycle through the available port space in order for
repeated connects to the same destination and significantly reduces the
chance of picking a recently released port.

PiperOrigin-RevId: 272058370
2019-09-30 13:55:22 -07:00
Nicolas Lacasse 3ad17ff597 Force timestamps to update when set via InodeOperations.SetTimestamps.
The gofer's CachingInodeOperations implementation contains an optimization for
the common open-read-close pattern when we have a host FD.  In this case, the
host kernel will update the timestamp for us to a reasonably close time, so we
don't need an extra RPC to the gofer.

However, when the app explicitly sets the timestamps (via futimes or similar)
then we actually DO need to update the timestamps, because the host kernel
won't do it for us.

To fix this, a new boolean `forceSetTimestamps` was added to
CachineInodeOperations.SetMaskedAttributes. It is only set by
gofer.InodeOperations.SetTimestamps.

PiperOrigin-RevId: 272048146
2019-09-30 13:08:45 -07:00
Michael Pratt 981fc188f0 Only copy out remaining time on nanosleep success
It looks like the old code attempted to do this, but didn't realize that err !=
nil even in the happy case.

PiperOrigin-RevId: 272005887
2019-09-30 13:07:32 -07:00
Adin Scannell 0c4d080631 Ensure runsc is uploaded.
One would reasonably assume that a field named "regex" would expect
a regular expression. However, in this case, one would be wrong.

The "regex" field actually requires "FileSet" [1] syntax.

?\_(?)_/?

[1] http://ant.apache.org/manual/Types/fileset.html

PiperOrigin-RevId: 271917356
2019-09-29 23:49:34 -07:00
gVisor bot eebc38be7a Merge pull request #882 from DarcySail:darcy_faster_CopyStringIn
PiperOrigin-RevId: 271675009
2019-09-27 17:27:13 -07:00
Adin Scannell c8bb20865d Automated rollback of changelist 256276198
PiperOrigin-RevId: 271665517
2019-09-27 15:58:51 -07:00
Nicolas Lacasse 6a54aa1f14 Bump rules_go to 0.19.5 and Go to 1.13.1.
PiperOrigin-RevId: 271664207
2019-09-27 15:51:33 -07:00
gVisor bot 8539abc0df Merge pull request #864 from tanjianfeng:fix-861
PiperOrigin-RevId: 271649711
2019-09-27 15:18:09 -07:00
gVisor bot abbee5615f Implement SO_BINDTODEVICE sockopt
PiperOrigin-RevId: 271644926
2019-09-27 14:14:04 -07:00
Andrei Vagin 7582385f05 kokoro: don't pass KOKORO_REPO_KEY in presubmit jobs
We don't want to upload packages from the presubmit jobs.

This will fix the error:
[11:01:34][ERROR] Cannot inject environment variables into
                  the build without allowed_env_vars regexes.

PiperOrigin-RevId: 271622996
2019-09-27 12:23:51 -07:00
Andrei Vagin fa15fda6c4 bazel: use rules_pkg from https://github.com/bazelbuild/
BUILD:85:1: in _pkg_deb rule //runsc:runsc-debian: target
'//runsc:runsc-debian' depends on deprecated target
'@bazel_tools//tools/build_defs/pkg:make_deb': The internal version of
make_deb is deprecated. Please use the replacement for pkg_deb from
https://github.com/bazelbuild/rules_pkg/blob/master/pkg.
PiperOrigin-RevId: 271590386
2019-09-27 09:50:18 -07:00
Fabricio Voznika 8337e4f509 Disallow opening of sockets if --fsgofer-host-uds=false
Updates #235

PiperOrigin-RevId: 271475319
2019-09-26 18:16:02 -07:00
Kevin Krakauer 543492650d Make raw socket tests pass in environments with or without CAP_NET_RAW.
PiperOrigin-RevId: 271442321
2019-09-26 15:09:20 -07:00
Andrei Vagin 3221e8372c kokoro: don't force to use python2
https://github.com/bazelbuild/bazel/issues/7899 was fixed
and we don't need this hack anymore.

PiperOrigin-RevId: 271434565
2019-09-26 14:37:19 -07:00
Kenta Tada 69f3c79b24 runsc: add the clone flag of cgroup namespace
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2019-09-26 12:02:01 +09:00
gVisor bot dd0e5eedae Merge pull request #765 from trailofbits:uds_support
PiperOrigin-RevId: 271235134
2019-09-25 16:44:22 -07:00
Fabricio Voznika 129c67d68e Fix runsc log collection in kokoro
PiperOrigin-RevId: 271207152
2019-09-25 14:33:11 -07:00
Kevin Krakauer 59ccbb1044 Remove centralized registration of protocols.
Also removes the need for protocol names.

PiperOrigin-RevId: 271186030
2019-09-25 12:57:05 -07:00
gVisor bot 99c86b8dbd Merge pull request #863 from tanjianfeng:fix-862
PiperOrigin-RevId: 271168948
2019-09-25 11:36:06 -07:00
gVisor bot 76ff1947b6 gvisor: change syscall.RawSyscall to syscall.RawSyscall6 where required
Before https://golang.org/cl/173160 syscall.RawSyscall would zero out
the last three register arguments to the system call. That no longer happens.
For system calls that take more than three arguments, use RawSyscall6 to
ensure that we pass zero, not random data, for the additional arguments.

PiperOrigin-RevId: 271062527
2019-09-24 23:47:42 -07:00
Andrei Vagin 2fb34c8d5c test: don't use designated initializers
This change fixes compile errors:
pty.cc:1460:7: error: expected primary-expression before '.' token
...

PiperOrigin-RevId: 271033729
2019-09-24 19:05:12 -07:00
Robert Tonic 9ebd498a55 Remove unecessary seccomp permission.
This removes the F_DUPFD_CLOEXEC support for the gofer, previously 
required when depending on the STL net package.
2019-09-24 18:37:25 -04:00
Robert Tonic 7810b30983 Refactor command line options and remove the allowed terminology for uds 2019-09-24 18:24:10 -04:00
Adin Scannell 502f8f238e Stub out readahead implementation.
Closes #261

PiperOrigin-RevId: 270973347
2019-09-24 13:29:46 -07:00
Chris Kuiper 6704d625ef Return only primary addresses in Stack.NICInfo()
Non-primary addresses are used for endpoints created to accept multicast and
broadcast packets, as well as "helper" endpoints (0.0.0.0) that allow sending
packets when no proper address has been assigned yet (e.g., for DHCP). These
addresses are not real addresses from a user point of view and should not be
part of the NICInfo() value. Also see b/127321246 for more info.

This switches NICInfo() to call a new NIC.PrimaryAddresses() function. To still
allow an option to get all addresses (mostly for testing) I added
Stack.GetAllAddresses() and NIC.AllAddresses().

In addition, the return value for GetMainNICAddress() was changed for the case
where the NIC has no primary address. Instead of returning an error here,
it now returns an empty AddressWithPrefix() value. The rational for this
change is that it is a valid case for a NIC to have no primary addresses.

Lastly, I refactored the code based on the new additions.

PiperOrigin-RevId: 270971764
2019-09-24 13:21:20 -07:00
gVisor bot 91abeb1dbc Merge pull request #812 from lubinszARM:pr_dup3_arm
PiperOrigin-RevId: 270957224
2019-09-24 12:06:38 -07:00
Tamir Duberstein bbaaa1fcc2 Simplify ICMPRateLimiter
https://github.com/golang/time/commit/c4c64ca added SetBurst upstream.

PiperOrigin-RevId: 270925077
2019-09-24 09:50:51 -07:00
henry.tjf bc9de939fd tty: fix sending SIGTTOU on tty write
How to reproduce:
  $ echo "timeout 10 ls" > foo.sh
  $ chmod +x foo.sh
  $ ./foo.sh
  (will hang here for 10 secs, and the output of ls does not show)

When "ls" process writes to stdout, it receives SIGTTOU signal, and
hangs there. Until "timeout" process timeouts, and kills "ls" process.

The expected result is: "ls" writes its output into tty, and terminates
immdedately, then "timeout" process receives SIGCHLD and terminates.

The reason for this failure is that we missed the check for TOSTOP (if
set, background processes will receive the SIGTTOU signal when they do
write).

We use drivers/tty/n_tty.c:n_tty_write() as a reference.

Fixes: #862

Reported-by: chris.zn <chris.zn@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Signed-off-by: chenglang.hy <chenglang.hy@antfin.com>
2019-09-24 14:18:22 +00:00
Haibo Xu a26276b949 Enable pkg/bits support on arm64.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I490716f0e6204f0b3a43f71931b10d1ca541e128
2019-09-24 07:03:19 +00:00
Haibo Xu 2db866c45f Enable pkg/sleep support on arm64.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I9071e698c1f222e0fdf3b567ec4cbd97f0a8dde9
2019-09-24 06:42:26 +00:00
Nicolas Lacasse d5b3dd7cb4 Run all runtime tests in a single container.
This makes them run much faster. Also cleaned up the log reporting.

PiperOrigin-RevId: 270799808
2019-09-23 17:43:42 -07:00
Nicolas Lacasse f2ea8e6b24 Always set HOME env var with `runsc exec`.
We already do this for `runsc run`, but need to do the same for `runsc exec`.

PiperOrigin-RevId: 270793459
2019-09-23 17:06:02 -07:00
Adin Scannell 6c88f674af Add test for concurrent reads and writes.
PiperOrigin-RevId: 270789146
2019-09-23 16:44:30 -07:00