gvisor

Commit Graph

Author	SHA1	Message	Date
Fabricio Voznika	ca90dad0e2	Fix container locking Sandbox root dir was not being saved with the Container state, so it would point to the wrong directory location when attempting to lock the sandbox. This led to race conditions saving and loading container state. Fixing it, led to multiple deadlocks. I've moved the saving and locking logic to a separate struct and moved the lock file inside the RootDir (instead of container root dir), which allows the lock to be taken inside Destroy, and removes the need to lock the sandbox. PiperOrigin-RevId: 277599612	2019-10-30 15:39:04 -07:00
Fabricio Voznika	e8ba10c008	Fix early deletion of rootDir container.startContainers() cannot be called twice in a test (e.g. TestMultiContainerLoadSandbox) because the cleanup function deletes the rootDir, together with information from all other containers that may exist. PiperOrigin-RevId: 276591806	2019-10-24 16:36:54 -07:00
Tom Lanyon	7e8b5f4a3a	Add runsc OCI annotations to support CRI-O. Obligatory https://xkcd.com/927 Fixes #626	2019-10-20 21:11:01 +11:00
Fabricio Voznika	9fb562234e	Fix problem with open FD when copy up is triggered in overlayfs Linux kernel before 4.19 doesn't implement a feature that updates open FD after a file is open for write (and is copied to the upper layer). Already open FD will continue to read the old file content until they are reopened. This is especially problematic for gVisor because it caches open files. Flag was added to force readonly files to be reopenned when the same file is open for write. This is only needed if using kernels prior to 4.19. Closes #1006 It's difficult to really test this because we never run on tests on older kernels. I'm adding a test in GKE which uses kernels with the overlayfs problem for 1.14 and lower. PiperOrigin-RevId: 275115289	2019-10-16 15:06:24 -07:00
Fabricio Voznika	b9cdbc26bc	Ignore mount options that are not supported in shared mounts Options that do not change mount behavior inside the Sentry are irrelevant and should not be used when looking for possible incompatibilities between master and slave mounts. PiperOrigin-RevId: 273593486	2019-10-08 13:36:16 -07:00
Fabricio Voznika	0b02c3d5e5	Prevent CAP_NET_RAW from appearing in exec 'docker exec' was getting CAP_NET_RAW even when --net-raw=false because it was not filtered out from when copying container's capabilities. PiperOrigin-RevId: 272260451	2019-10-01 11:49:49 -07:00
Fabricio Voznika	010b093258	Bring back to life features lost in recent refactor - Sandbox logs are generated when running tests - Kokoro uploads the sandbox logs - Supports multiple parallel runs - Revive script to install locally built runsc with docker PiperOrigin-RevId: 269337274	2019-09-16 08:17:00 -07:00
Ian Lewis	0bfffbcb01	Ignore the root container when calculating oom_score_adj for the sandbox. This is done because the root container for CRI is the infrastructure (pause) container and always gets a low oom_score_adj. We do this to ensure that only the oom_score_adj of user containers is used to calculated the sandbox oom_score_adj. Implemented in runsc rather than the containerd shim as it's a bit cleaner to implement here (in the shim it would require overwriting the oomScoreAdj and re-writing out the config.json again). This processing is Kubernetes(CRI) specific but we are currently only supporting CRI for multi-container support anyway. PiperOrigin-RevId: 267507706	2019-09-05 19:21:25 -07:00
Fabricio Voznika	0f5cdc1e00	Resolve flakes with TestMultiContainerDestroy Some processes are reparented to the root container depending on the kill order and the root container would not reap in time. So some zombie processes were still present when the test checked. Fix it by running the second container inside a PID namespace. PiperOrigin-RevId: 267278591	2019-09-04 18:56:49 -07:00
Adin Scannell	67a2ab1438	Impose order on test scripts. The simple test script has gotten out of control. Shard this script into different pieces and attempt to impose order on overall test structure. This change helps lay some of the foundations for future improvements. * The runsc/test directories are moved into just test/. * The runsc/test/testutil package is split into logical pieces. * The scripts/ directory contains new top-level targets. * Each test is now responsible for building targets it requires. * The install functionality is moved into `runsc` itself for simplicity. * The existing kokoro run_tests.sh file now just calls all (can be split). After this change is merged, I will create multiple distinct workflows for Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today, which should dramatically reduce the time-to-run for the Kokoro tests, and provides a better foundation for further improvements to the infrastructure. PiperOrigin-RevId: 267081397	2019-09-03 22:02:43 -07:00
Fabricio Voznika	c39564332b	Mount volumes as super user This used to be the case, but regressed after a recent change. Also made a few fixes around it and clean up the code a bit. Closes #720 PiperOrigin-RevId: 265717496	2019-08-27 10:47:16 -07:00
Fabricio Voznika	79cc4397fd	Set gofer's OOM score adjustment Updates #512 PiperOrigin-RevId: 262195448	2019-08-07 12:55:06 -07:00
Fabricio Voznika	e70eafc9e5	Make loading container in a sandbox more robust PiperOrigin-RevId: 262071646	2019-08-06 23:26:46 -07:00
Fabricio Voznika	b461be88a8	Stops container if gofer is killed Each gofer now has a goroutine that polls on the FDs used to communicate with the sandbox. The respective gofer is destroyed if any of the FDs is closed. Closes #601 PiperOrigin-RevId: 261383725	2019-08-02 13:47:55 -07:00
Ian Lewis	3eff0531ad	Set sandbox oom_score_adj Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle. When new containers are added to the sandbox oom_score_adj for the sandbox and all other gofers are adjusted so that oom_score_adj is equal to the lowest oom_score_adj of all containers in the sandbox. Fixes #512 PiperOrigin-RevId: 261242725	2019-08-01 18:49:21 -07:00
chris.zn	1c5b6d9bd2	Use different pidns among different containers The different containers in a sandbox used only one pid namespace before. This results in that a container can see the processes in another container in the same sandbox. This patch use different pid namespace for different containers. Signed-off-by: chris.zn <chris.zn@antfin.com>	2019-07-24 13:38:23 +08:00
Nicolas Lacasse	04cbb13ce9	Give each container a distinct MountNamespace. This keeps all container filesystem completely separate from eachother (including from the root container filesystem), and allows us to get rid of the "__runsc_containers__" directory. It also simplifies container startup/teardown as we don't have to muck around in the root container's filesystem. PiperOrigin-RevId: 259613346	2019-07-23 14:37:07 -07:00
Nicolas Lacasse	659bebab8e	Don't try to execute a file that is not regular. PiperOrigin-RevId: 257037608	2019-07-08 12:56:48 -07:00
Andrei Vagin	67f2cefce0	Avoid importing platforms from many source files PiperOrigin-RevId: 256494243	2019-07-03 22:51:26 -07:00
Michael Pratt	5b41ba5d0e	Fix various spelling issues in the documentation Addresses obvious typos, in the documentation only. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65 PiperOrigin-RevId: 255477779	2019-06-27 14:25:50 -07:00
Fabricio Voznika	0e07c94d54	Kill sandbox process when 'runsc do' exits PiperOrigin-RevId: 253882115	2019-06-18 15:36:17 -07:00
Fabricio Voznika	bdb19b82ef	Add Container/Sandbox args struct for creation There were 3 string arguments that could be easily misplaced and it makes it easier to add new arguments, especially for Container that has dozens of callers. PiperOrigin-RevId: 253872074	2019-06-18 14:46:49 -07:00
Adin Scannell	add40fd6ad	Update canonical repository. This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620	2019-06-13 16:50:15 -07:00
Fabricio Voznika	356d1be140	Allow 'runsc do' to run without root '--rootless' flag lets a non-root user execute 'runsc do'. The drawback is that the sandbox and gofer processes will run as root inside a user namespace that is mapped to the caller's user, intead of nobody. And network is defaulted to '--network=host' inside the root network namespace. On the bright side, it's very convenient for testing: runsc --rootless do ls runsc --rootless do curl www.google.com PiperOrigin-RevId: 252840970	2019-06-12 09:41:50 -07:00
Fabricio Voznika	fc746efa9a	Add support to mount pod shared tmpfs mounts Parse annotations containing 'gvisor.dev/spec/mount' that gives hints about how mounts are shared between containers inside a pod. This information can be used to better inform how to mount these volumes inside gVisor. For example, a volume that is shared between containers inside a pod can be bind mounted inside the sandbox, instead of being two independent mounts. For now, this information is used to allow the same tmpfs mounts to be shared between containers which wasn't possible before. PiperOrigin-RevId: 252704037	2019-06-11 14:54:31 -07:00
Fabricio Voznika	d28f71adcf	Remove 'clearStatus' option from container.Wait*PID() clearStatus was added to allow detached execution to wait on the exec'd process and retrieve its exit status. However, it's not currently used. Both docker and gvisor-containerd-shim wait on the "shim" process and retrieve the exit status from there. We could change gvisor-containerd-shim to use waits, but it will end up also consuming a process for the wait, which is similar to having the shim process. Closes #234 PiperOrigin-RevId: 251349490	2019-06-03 18:16:09 -07:00
Andrei Vagin	bf0ac565d2	Fix runsc restore to be compatible with docker start --checkpoint ... Change-Id: I02b30de13f1393df66edf8829fedbf32405d18f8 PiperOrigin-RevId: 246621192	2019-05-03 21:41:45 -07:00
Andrei Vagin	c967fbdaa2	runsc: move test_app in a separate directory Opensource tools (e. g. https://github.com/fatih/vim-go) can't hanlde more than one golang package in one directory. PiperOrigin-RevId: 246435962 Change-Id: I67487915e3838762424b2d168efc54ae34fb801f	2019-05-02 19:27:27 -07:00
Fabricio Voznika	bbb6539114	Add [simple] network support to 'runsc do' Sandbox always runsc with IP 192.168.10.2 and the peer network adds 1 to the address (192.168.10.3). Sandbox IP can be changed using --ip flag. Here a few examples: sudo runsc do curl www.google.com sudo runsc do --ip=10.10.10.2 bash -c "echo 123 \| netcat -l -p 8080" PiperOrigin-RevId: 246421277 Change-Id: I7b3dce4af46a57300350dab41cb27e04e4b6e9da	2019-05-02 17:17:39 -07:00
Michael Pratt	4d52a55201	Change copyright notice to "The gVisor Authors" Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" \| xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9	2019-04-29 14:26:23 -07:00
Nicolas Lacasse	f4ce43e1f4	Allow and document bug ids in gVisor codebase. PiperOrigin-RevId: 245818639 Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789	2019-04-29 14:04:14 -07:00
Kevin Krakauer	df21460cfd	Fix container_test flakes. Create, Start, and Destroy were racing to create and destroy the metadata directory of containers. This is a re-upload of https://gvisor-review.googlesource.com/c/gvisor/+/16260, but with the correct account. Change-Id: I16b7a9d0971f0df873e7f4145e6ac8f72730a4f1 PiperOrigin-RevId: 244892991	2019-04-23 11:33:40 -07:00
Andrei Vagin	93b3c9b76c	runsc: set UID and GID if gofer is executed in a new user namespace Otherwise, we will not have capabilities in the user namespace. And this patch adds the noexec option for mounts. https://github.com/google/gvisor/issues/145 PiperOrigin-RevId: 242706519 Change-Id: I1b78b77d6969bd18038c71616e8eb7111b71207c	2019-04-09 11:31:57 -07:00
Nicolas Lacasse	dcf6613331	Set container.CreatedAt in Create(). PiperOrigin-RevId: 241056805 Change-Id: I13ea8f5dbfb01ca02a3b0ab887b8c3bdf4d556a6	2019-03-29 14:55:22 -07:00
Fabricio Voznika	e420cc3e5d	Add support for mount propagation Properly handle propagation options for root and mounts. Now usage of mount options shared, rshared, and noexec cause error to start. shared/ rshared breaks sandbox=>host isolation. slave however can be supported because changes propagate from host to sandbox. Root FS setup moved inside the gofer. Apart from simplifying the code, it keeps all mounts inside the namespace. And they are torn down when the namespace is destroyed (DestroyFS is no longer needed). PiperOrigin-RevId: 239037661 Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0	2019-03-18 12:30:43 -07:00
Fabricio Voznika	52a2abfca4	Fix cgroup when path is relative This can happen when 'docker run --cgroup-parent=' flag is set. PiperOrigin-RevId: 235645559 Change-Id: Ieea3ae66939abadab621053551bf7d62d412e7ee	2019-02-25 19:21:47 -08:00
Andrei Vagin	4e695adcd0	gvisor/gofer: Use pivot_root instead of chroot PiperOrigin-RevId: 231864273 Change-Id: I8545b72b615f5c2945df374b801b80be64ec3e13	2019-01-31 15:19:04 -08:00
Michael Pratt	2a0c69b19f	Remove license comments Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses($.$)./licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0	2019-01-31 11:12:53 -08:00
Lantao Liu	52b3cd873d	runsc: Only uninstall cgroup for sandbox stop. PiperOrigin-RevId: 231263114 Change-Id: I57467a34fe94e395fdd3685462c4fe9776d040a3	2019-01-28 11:58:25 -08:00
Fabricio Voznika	55e8eb775b	Make cacheRemoteRevalidating detect changes to file size When file size changes outside the sandbox, page cache was not refreshing file size which is required for cacheRemoteRevalidating. In fact, cacheRemoteRevalidating should be skipping the cache completely since it's not really benefiting from it. The cache is cache is already bypassed for unstable attributes (see cachePolicy.cacheUAttrs). And althought the cache is called to map pages, they will always miss the cache and map directly from the host. Created a HostMappable struct that maps directly to the host and use it for files with cacheRemoteRevalidating. Closes #124 PiperOrigin-RevId: 230998440 Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc	2019-01-25 17:23:07 -08:00
ShiruRen	c6facd0358	Fix a nil pointer dereference bug in Container.Destroy() In Container.Destroy(), we call c.stop() before calling executeHooksBestEffort(), therefore, when we call executeHooksBestEffort(c.Spec.Hooks.Poststop, c.State()) to execute the poststop hook, it results in a nil pointer dereference since it reads c.Sandbox.Pid in c.State() after the sandbox has been destroyed. To fix this bug, we can change container's status to "stopped" before executing the poststop hook. Signed-off-by: ShiruRen <renshiru2000@gmail.com> Change-Id: I4d835e430066fab7e599e188f945291adfc521ef PiperOrigin-RevId: 230975505	2019-01-25 15:03:17 -08:00
Fabricio Voznika	c28f886c0b	Execute statically linked binary Mounting lib and lib64 are not necessary anymore and simplifies the test. PiperOrigin-RevId: 230971195 Change-Id: Ib91a3ffcec4b322cd3687c337eedbde9641685ed	2019-01-25 14:39:20 -08:00
Andrei Vagin	5f08f8fd81	Don't bind-mount runsc into a sandbox mntns PiperOrigin-RevId: 230437407 Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b	2019-01-22 16:46:42 -08:00
Fabricio Voznika	c1be25b78d	Scrub runsc error messages Removed "error" and "failed to" prefix that don't add value from messages. Adjusted a few other messages. In particular, when the container fail to start, the message returned is easier for humans to read: $ docker run --rm --runtime=runsc alpine foobar docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory Closes #77 PiperOrigin-RevId: 230022798 Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1	2019-01-18 17:36:02 -08:00
Andrei Vagin	c0a981629c	Start a sandbox process in a new userns only if CAP_SETUID is set In addition, it fixes a race condition in TestMultiContainerGoferStop. There are two scripts copy the same set of files into the same directory and sometime one of this command fails with EXIST. PiperOrigin-RevId: 230011247 Change-Id: I9289f72e65dc407cdcd0e6cd632a509e01f43e9c	2019-01-18 16:08:39 -08:00
Fabricio Voznika	e4d3ca7263	Prevent internal tmpfs mount to override files in /tmp Runsc wants to mount /tmp using internal tmpfs implementation for performance. However, it risks hiding files that may exist under /tmp in case it's present in the container. Now, it only mounts over /tmp iff: - /tmp was not explicitly asked to be mounted - /tmp is empty If any of this is not true, then /tmp maps to the container's image /tmp. Note: checkpoint doesn't have sentry FS mounted to check if /tmp is empty. It simply looks for explicit mounts right now. PiperOrigin-RevId: 229607856 Change-Id: I10b6dae7ac157ef578efc4dfceb089f3b94cde06	2019-01-16 12:48:32 -08:00
Fabricio Voznika	92cf3764e0	Create working directory if it doesn't yet exist PiperOrigin-RevId: 229438125 Change-Id: I58eb0d10178d1adfc709d7b859189d1acbcb2f22	2019-01-15 14:13:27 -08:00
Andrei Vagin	f8c8f24154	runsc: Collect zombies of sandbox and gofer processes And we need to wait a gofer process before cgroup.Uninstall, because it is running in the sandbox cgroups. PiperOrigin-RevId: 228904020 Change-Id: Iaf8826d5b9626db32d4057a1c505a8d7daaeb8f9	2019-01-11 10:32:26 -08:00
Fabricio Voznika	0d7023d581	Restore to original cgroup after sandbox and gofer processes are created The original code assumed that it was safe to join and not restore cgroup, but Container.Run will not exit after calling start, making cgroup cleanup fail because there were still processes inside the cgroup. PiperOrigin-RevId: 228529199 Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51	2019-01-09 09:18:15 -08:00
Nicolas Lacasse	1775a0e11e	container.Destroy should clean up container metadata even if other cleanups fail If the sandbox process is dead (because of a panic or some other problem), container.Destroy will never remove the container metadata file, since it will always fail when calling container.stop(). This CL changes container.Destroy() to always perform the three necessary cleanup operations: * Stop the sandbox and gofer processes. * Remove the container fs on the host. * Delete the container metadata directory. Errors from these three operations will be concatenated and returned from Destroy(). PiperOrigin-RevId: 225448164 Change-Id: I99c6311b2e4fe5f6e2ca991424edf1ebeae9df32	2018-12-13 15:38:10 -08:00

1 2 3 4

159 Commits