Commit Graph

165 Commits

Author SHA1 Message Date
Adin Scannell d29e59af9f Standardize on tools directory.
PiperOrigin-RevId: 291745021
2020-01-27 12:21:00 -08:00
Ian Gudger 27500d529f New sync package.
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.

This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.

Updates #1472

PiperOrigin-RevId: 289033387
2020-01-09 22:02:24 -08:00
Fabricio Voznika 0d475cdb01 Increase waitForProcessList timeout
It can take more than 10 seconds when running under --race.

PiperOrigin-RevId: 286296060
2019-12-18 17:10:44 -08:00
Andrei Vagin f8c5ad061b runsc/debug: add an option to list all processes
runsc debug --ps list all processes with all threads. This option is added to
the debug command but not to the ps command, because it is going to be used for
debug purposes and we want to add any useful information without thinking about
backward compatibility.

This will help to investigate syzkaller issues.

PiperOrigin-RevId: 285013668
2019-12-11 11:05:41 -08:00
Nicolas Lacasse 663fe840f7 Implement TTY field in control.Processes().
Threadgroups already know their TTY (if they have one), which now contains the
TTY Index, and is returned in the Processes() call.

PiperOrigin-RevId: 284263850
2019-12-06 14:34:13 -08:00
Fabricio Voznika ea7a100202 Make annotations OCI compliant
Changed annotation to follow the standard defined here:
https://github.com/opencontainers/image-spec/blob/master/annotations.md

PiperOrigin-RevId: 284254847
2019-12-06 13:51:38 -08:00
Fabricio Voznika ca90dad0e2 Fix container locking
Sandbox root dir was not being saved with the Container state,
so it would point to the wrong directory location when attempting
to lock the sandbox. This led to race conditions saving and
loading container state. Fixing it, led to multiple deadlocks.

I've moved the saving and locking logic to a separate struct and
moved the lock file inside the RootDir (instead of container
root dir), which allows the lock to be taken inside Destroy,
and removes the need to lock the sandbox.

PiperOrigin-RevId: 277599612
2019-10-30 15:39:04 -07:00
Fabricio Voznika e8ba10c008 Fix early deletion of rootDir
container.startContainers() cannot be called twice in a test
(e.g. TestMultiContainerLoadSandbox) because the cleanup
function deletes the rootDir, together with information from
all other containers that may exist.

PiperOrigin-RevId: 276591806
2019-10-24 16:36:54 -07:00
Tom Lanyon 7e8b5f4a3a Add runsc OCI annotations to support CRI-O.
Obligatory https://xkcd.com/927

Fixes #626
2019-10-20 21:11:01 +11:00
Fabricio Voznika 9fb562234e Fix problem with open FD when copy up is triggered in overlayfs
Linux kernel before 4.19 doesn't implement a feature that updates
open FD after a file is open for write (and is copied to the upper
layer). Already open FD will continue to read the old file content
until they are reopened. This is especially problematic for gVisor
because it caches open files.

Flag was added to force readonly files to be reopenned when the
same file is open for write. This is only needed if using kernels
prior to 4.19.

Closes #1006

It's difficult to really test this because we never run on tests
on older kernels. I'm adding a test in GKE which uses kernels
with the overlayfs problem for 1.14 and lower.

PiperOrigin-RevId: 275115289
2019-10-16 15:06:24 -07:00
Fabricio Voznika b9cdbc26bc Ignore mount options that are not supported in shared mounts
Options that do not change mount behavior inside the Sentry are
irrelevant and should not be used when looking for possible
incompatibilities between master and slave mounts.

PiperOrigin-RevId: 273593486
2019-10-08 13:36:16 -07:00
Fabricio Voznika 0b02c3d5e5 Prevent CAP_NET_RAW from appearing in exec
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out from when copying container's
capabilities.

PiperOrigin-RevId: 272260451
2019-10-01 11:49:49 -07:00
Fabricio Voznika 010b093258 Bring back to life features lost in recent refactor
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker

PiperOrigin-RevId: 269337274
2019-09-16 08:17:00 -07:00
Ian Lewis 0bfffbcb01 Ignore the root container when calculating oom_score_adj for the sandbox.
This is done because the root container for CRI is the infrastructure (pause)
container and always gets a low oom_score_adj. We do this to ensure that only
the oom_score_adj of user containers is used to calculated the sandbox
oom_score_adj.

Implemented in runsc rather than the containerd shim as it's a bit cleaner to
implement here (in the shim it would require overwriting the oomScoreAdj and
re-writing out the config.json again). This processing is Kubernetes(CRI)
specific but we are currently only supporting CRI for multi-container support
anyway.

PiperOrigin-RevId: 267507706
2019-09-05 19:21:25 -07:00
Fabricio Voznika 0f5cdc1e00 Resolve flakes with TestMultiContainerDestroy
Some processes are reparented to the root container depending
on the kill order and the root container would not reap in time.
So some zombie processes were still present when the test checked.

Fix it by running the second container inside a PID namespace.

PiperOrigin-RevId: 267278591
2019-09-04 18:56:49 -07:00
Adin Scannell 67a2ab1438 Impose order on test scripts.
The simple test script has gotten out of control. Shard this script into
different pieces and attempt to impose order on overall test structure. This
change helps lay some of the foundations for future improvements.

 * The runsc/test directories are moved into just test/.
 * The runsc/test/testutil package is split into logical pieces.
 * The scripts/ directory contains new top-level targets.
 * Each test is now responsible for building targets it requires.
 * The install functionality is moved into `runsc` itself for simplicity.
 * The existing kokoro run_tests.sh file now just calls all (can be split).

After this change is merged,  I will create multiple distinct workflows for
Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today,
which should dramatically reduce the time-to-run for the Kokoro tests, and
provides a better foundation for further improvements to the infrastructure.

PiperOrigin-RevId: 267081397
2019-09-03 22:02:43 -07:00
Fabricio Voznika c39564332b Mount volumes as super user
This used to be the case, but regressed after a recent change.
Also made a few fixes around it and clean up the code a bit.

Closes #720

PiperOrigin-RevId: 265717496
2019-08-27 10:47:16 -07:00
Fabricio Voznika 79cc4397fd Set gofer's OOM score adjustment
Updates #512

PiperOrigin-RevId: 262195448
2019-08-07 12:55:06 -07:00
Fabricio Voznika e70eafc9e5 Make loading container in a sandbox more robust
PiperOrigin-RevId: 262071646
2019-08-06 23:26:46 -07:00
Fabricio Voznika b461be88a8 Stops container if gofer is killed
Each gofer now has a goroutine that polls on the FDs used
to communicate with the sandbox. The respective gofer is
destroyed if any of the FDs is closed.

Closes #601

PiperOrigin-RevId: 261383725
2019-08-02 13:47:55 -07:00
Ian Lewis 3eff0531ad Set sandbox oom_score_adj
Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle.
When new containers are added to the sandbox oom_score_adj for the sandbox and
all other gofers are adjusted so that oom_score_adj is equal to the lowest
oom_score_adj of all containers in the sandbox.

Fixes #512

PiperOrigin-RevId: 261242725
2019-08-01 18:49:21 -07:00
chris.zn 1c5b6d9bd2 Use different pidns among different containers
The different containers in a sandbox used only one pid
namespace before. This results in that a container can see
the processes in another container in the same sandbox.

This patch use different pid namespace for different containers.

Signed-off-by: chris.zn <chris.zn@antfin.com>
2019-07-24 13:38:23 +08:00
Nicolas Lacasse 04cbb13ce9 Give each container a distinct MountNamespace.
This keeps all container filesystem completely separate from eachother
(including from the root container filesystem), and allows us to get rid of the
"__runsc_containers__" directory.

It also simplifies container startup/teardown as we don't have to muck around
in the root container's filesystem.

PiperOrigin-RevId: 259613346
2019-07-23 14:37:07 -07:00
Nicolas Lacasse 659bebab8e Don't try to execute a file that is not regular.
PiperOrigin-RevId: 257037608
2019-07-08 12:56:48 -07:00
Andrei Vagin 67f2cefce0 Avoid importing platforms from many source files
PiperOrigin-RevId: 256494243
2019-07-03 22:51:26 -07:00
Michael Pratt 5b41ba5d0e Fix various spelling issues in the documentation
Addresses obvious typos, in the documentation only.

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
2019-06-27 14:25:50 -07:00
Fabricio Voznika 0e07c94d54 Kill sandbox process when 'runsc do' exits
PiperOrigin-RevId: 253882115
2019-06-18 15:36:17 -07:00
Fabricio Voznika bdb19b82ef Add Container/Sandbox args struct for creation
There were 3 string arguments that could be easily misplaced
and it makes it easier to add new arguments, especially for
Container that has dozens of callers.

PiperOrigin-RevId: 253872074
2019-06-18 14:46:49 -07:00
Adin Scannell add40fd6ad Update canonical repository.
This can be merged after:
https://github.com/google/gvisor-website/pull/77
  or
https://github.com/google/gvisor-website/pull/78

PiperOrigin-RevId: 253132620
2019-06-13 16:50:15 -07:00
Fabricio Voznika 356d1be140 Allow 'runsc do' to run without root
'--rootless' flag lets a non-root user execute 'runsc do'.
The drawback is that the sandbox and gofer processes will
run as root inside a user namespace that is mapped to the
caller's user, intead of nobody. And network is defaulted
to '--network=host' inside the root network namespace. On
the bright side, it's very convenient for testing:

runsc --rootless do ls
runsc --rootless do curl www.google.com

PiperOrigin-RevId: 252840970
2019-06-12 09:41:50 -07:00
Fabricio Voznika fc746efa9a Add support to mount pod shared tmpfs mounts
Parse annotations containing 'gvisor.dev/spec/mount' that gives
hints about how mounts are shared between containers inside a
pod. This information can be used to better inform how to mount
these volumes inside gVisor. For example, a volume that is shared
between containers inside a pod can be bind mounted inside the
sandbox, instead of being two independent mounts.

For now, this information is used to allow the same tmpfs mounts
to be shared between containers which wasn't possible before.

PiperOrigin-RevId: 252704037
2019-06-11 14:54:31 -07:00
Fabricio Voznika d28f71adcf Remove 'clearStatus' option from container.Wait*PID()
clearStatus was added to allow detached execution to wait
on the exec'd process and retrieve its exit status. However,
it's not currently used. Both docker and gvisor-containerd-shim
wait on the "shim" process and retrieve the exit status from
there. We could change gvisor-containerd-shim to use waits, but
it will end up also consuming a process for the wait, which is
similar to having the shim process.

Closes #234

PiperOrigin-RevId: 251349490
2019-06-03 18:16:09 -07:00
Andrei Vagin bf0ac565d2 Fix runsc restore to be compatible with docker start --checkpoint ...
Change-Id: I02b30de13f1393df66edf8829fedbf32405d18f8
PiperOrigin-RevId: 246621192
2019-05-03 21:41:45 -07:00
Andrei Vagin c967fbdaa2 runsc: move test_app in a separate directory
Opensource tools (e. g. https://github.com/fatih/vim-go) can't hanlde more than
one golang package in one directory.

PiperOrigin-RevId: 246435962
Change-Id: I67487915e3838762424b2d168efc54ae34fb801f
2019-05-02 19:27:27 -07:00
Fabricio Voznika bbb6539114 Add [simple] network support to 'runsc do'
Sandbox always runsc with IP 192.168.10.2 and the peer
network adds 1 to the address (192.168.10.3). Sandbox
IP can be changed using --ip flag.

Here a few examples:
  sudo runsc do curl www.google.com
  sudo runsc do --ip=10.10.10.2 bash -c "echo 123 | netcat -l -p 8080"

PiperOrigin-RevId: 246421277
Change-Id: I7b3dce4af46a57300350dab41cb27e04e4b6e9da
2019-05-02 17:17:39 -07:00
Michael Pratt 4d52a55201 Change copyright notice to "The gVisor Authors"
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.

1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.

Fixes #209

PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29 14:26:23 -07:00
Nicolas Lacasse f4ce43e1f4 Allow and document bug ids in gVisor codebase.
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-29 14:04:14 -07:00
Kevin Krakauer df21460cfd Fix container_test flakes.
Create, Start, and Destroy were racing to create and destroy the
metadata directory of containers.

This is a re-upload of
https://gvisor-review.googlesource.com/c/gvisor/+/16260, but with the
correct account.

Change-Id: I16b7a9d0971f0df873e7f4145e6ac8f72730a4f1
PiperOrigin-RevId: 244892991
2019-04-23 11:33:40 -07:00
Andrei Vagin 93b3c9b76c runsc: set UID and GID if gofer is executed in a new user namespace
Otherwise, we will not have capabilities in the user namespace.

And this patch adds the noexec option for mounts.

https://github.com/google/gvisor/issues/145

PiperOrigin-RevId: 242706519
Change-Id: I1b78b77d6969bd18038c71616e8eb7111b71207c
2019-04-09 11:31:57 -07:00
Nicolas Lacasse dcf6613331 Set container.CreatedAt in Create().
PiperOrigin-RevId: 241056805
Change-Id: I13ea8f5dbfb01ca02a3b0ab887b8c3bdf4d556a6
2019-03-29 14:55:22 -07:00
Fabricio Voznika e420cc3e5d Add support for mount propagation
Properly handle propagation options for root and mounts. Now usage of
mount options shared, rshared, and noexec cause error to start. shared/
rshared breaks sandbox=>host isolation. slave however can be supported
because changes propagate from host to sandbox.

Root FS setup moved inside the gofer. Apart from simplifying the code,
it keeps all mounts inside the namespace. And they are torn down when
the namespace is destroyed (DestroyFS is no longer needed).

PiperOrigin-RevId: 239037661
Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0
2019-03-18 12:30:43 -07:00
Fabricio Voznika 52a2abfca4 Fix cgroup when path is relative
This can happen when 'docker run --cgroup-parent=' flag is set.

PiperOrigin-RevId: 235645559
Change-Id: Ieea3ae66939abadab621053551bf7d62d412e7ee
2019-02-25 19:21:47 -08:00
Andrei Vagin 4e695adcd0 gvisor/gofer: Use pivot_root instead of chroot
PiperOrigin-RevId: 231864273
Change-Id: I8545b72b615f5c2945df374b801b80be64ec3e13
2019-01-31 15:19:04 -08:00
Michael Pratt 2a0c69b19f Remove license comments
Nothing reads them and they can simply get stale.

Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD

PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-31 11:12:53 -08:00
Lantao Liu 52b3cd873d runsc: Only uninstall cgroup for sandbox stop.
PiperOrigin-RevId: 231263114
Change-Id: I57467a34fe94e395fdd3685462c4fe9776d040a3
2019-01-28 11:58:25 -08:00
Fabricio Voznika 55e8eb775b Make cacheRemoteRevalidating detect changes to file size
When file size changes outside the sandbox, page cache was not
refreshing file size which is required for cacheRemoteRevalidating.
In fact, cacheRemoteRevalidating should be skipping the cache
completely since it's not really benefiting from it. The cache is
cache is already bypassed for unstable attributes (see
cachePolicy.cacheUAttrs). And althought the cache is called to
map pages, they will always miss the cache and map directly from
the host.

Created a HostMappable struct that maps directly to the host and
use it for files with cacheRemoteRevalidating.

Closes #124

PiperOrigin-RevId: 230998440
Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc
2019-01-25 17:23:07 -08:00
ShiruRen c6facd0358 Fix a nil pointer dereference bug in Container.Destroy()
In Container.Destroy(), we call c.stop() before calling
executeHooksBestEffort(), therefore, when we call
executeHooksBestEffort(c.Spec.Hooks.Poststop, c.State()) to execute
the poststop hook, it results in a nil pointer dereference since it
reads c.Sandbox.Pid in c.State() after the sandbox has been destroyed.
To fix this bug, we can change container's status to "stopped" before
executing the poststop hook.

Signed-off-by: ShiruRen <renshiru2000@gmail.com>
Change-Id: I4d835e430066fab7e599e188f945291adfc521ef
PiperOrigin-RevId: 230975505
2019-01-25 15:03:17 -08:00
Fabricio Voznika c28f886c0b Execute statically linked binary
Mounting lib and lib64 are not necessary anymore and simplifies the test.

PiperOrigin-RevId: 230971195
Change-Id: Ib91a3ffcec4b322cd3687c337eedbde9641685ed
2019-01-25 14:39:20 -08:00
Andrei Vagin 5f08f8fd81 Don't bind-mount runsc into a sandbox mntns
PiperOrigin-RevId: 230437407
Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b
2019-01-22 16:46:42 -08:00
Fabricio Voznika c1be25b78d Scrub runsc error messages
Removed "error" and "failed to" prefix that don't add value
from messages. Adjusted a few other messages.  In particular,
when the container fail to start, the message returned is easier
for humans to read:

$ docker run --rm --runtime=runsc alpine foobar
docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory

Closes #77

PiperOrigin-RevId: 230022798
Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
2019-01-18 17:36:02 -08:00