gvisor

Commit Graph

Author	SHA1	Message	Date
Fabricio Voznika	71589b7f7e	Let flags be overriden from OCI annotations This allows runsc flags to be set per sandbox instance. For example, K8s pod annotations can be used to enable --debug for a single pod, making troubleshoot much easier. Similarly, features like --vfs2 can be enabled for experimentation without affecting other pods in the node. Closes #3494 PiperOrigin-RevId: 329542815	2020-09-01 11:12:19 -07:00
gVisor bot	c81ac8ec3b	Merge pull request #2672 from amscanne:shim-integrated PiperOrigin-RevId: 321053634	2020-07-13 16:10:58 -07:00
Fabricio Voznika	bae1475603	Print spec as json when --debug is enabled The previous format skipped many important structs that are pointers, especially for cgroups. Change to print as json, removing parts of the spec that are not relevant. Also removed debug message from gofer that can be very noisy when directories are large. PiperOrigin-RevId: 316713267	2020-06-16 10:52:17 -07:00
Fabricio Voznika	ca5912d13c	More runsc changes for VFS2 - Add /tmp handling - Apply mount options - Enable more container_test tests - Forward signals to child process when test respaws process to run as root inside namespace. Updates #1487 PiperOrigin-RevId: 314263281	2020-06-01 21:32:09 -07:00
Fabricio Voznika	f7418e2159	Move Cleanup to its own package PiperOrigin-RevId: 313663382	2020-05-28 14:49:06 -07:00
Fabricio Voznika	a8c1b32660	Automated rollback of changelist 309082540 PiperOrigin-RevId: 313636920	2020-05-28 12:25:57 -07:00
moricho	fc53d64367	refactor and add test for bindmount Signed-off-by: moricho <ikeda.morito@gmail.com>	2020-04-26 17:24:34 +09:00
Ian Lewis	daf3322498	Add logging message for noNewPrivileges OCI option. noNewPrivileges is ignored if set to false since gVisor assumes that PR_SET_NO_NEW_PRIVS is always enabled. PiperOrigin-RevId: 305991947	2020-04-10 20:32:23 -07:00
Ian Lewis	56054fc1fb	Add friendlier messages for frequently encountered errors. Issue #2270 Issue #1765 PiperOrigin-RevId: 305385436	2020-04-07 18:51:01 -07:00
Fabricio Voznika	f2e4b5ab93	Kill sandbox process when parent process terminates When the sandbox runs in attached more, e.g. runsc do, runsc run, the sandbox lifetime is controlled by the parent process. This wasn't working in all cases because PR_GET_PDEATHSIG doesn't propagate through execve when the process changes uid/gid. So it was getting dropped when the sandbox execve's to change to user nobody. PiperOrigin-RevId: 300601247	2020-03-12 12:32:26 -07:00
Adin Scannell	d29e59af9f	Standardize on tools directory. PiperOrigin-RevId: 291745021	2020-01-27 12:21:00 -08:00
gVisor bot	6122b413f1	Merge pull request #1046 from tomlanyon:crio PiperOrigin-RevId: 276172466	2019-10-22 17:05:04 -07:00
Tom Lanyon	7e8b5f4a3a	Add runsc OCI annotations to support CRI-O. Obligatory https://xkcd.com/927 Fixes #626	2019-10-20 21:11:01 +11:00
Ian Lewis	065339193e	Update TODO for OCI seccomp support. PiperOrigin-RevId: 274042343	2019-10-10 14:43:03 -07:00
gVisor bot	90e908f419	Merge pull request #917 from KentaTada:fix-clone-flags PiperOrigin-RevId: 272262368	2019-10-01 12:06:30 -07:00
Fabricio Voznika	0b02c3d5e5	Prevent CAP_NET_RAW from appearing in exec 'docker exec' was getting CAP_NET_RAW even when --net-raw=false because it was not filtered out from when copying container's capabilities. PiperOrigin-RevId: 272260451	2019-10-01 11:49:49 -07:00
Kenta Tada	69f3c79b24	runsc: add the clone flag of cgroup namespace Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>	2019-09-26 12:02:01 +09:00
Fabricio Voznika	010b093258	Bring back to life features lost in recent refactor - Sandbox logs are generated when running tests - Kokoro uploads the sandbox logs - Supports multiple parallel runs - Revive script to install locally built runsc with docker PiperOrigin-RevId: 269337274	2019-09-16 08:17:00 -07:00
Ian Lewis	0bfffbcb01	Ignore the root container when calculating oom_score_adj for the sandbox. This is done because the root container for CRI is the infrastructure (pause) container and always gets a low oom_score_adj. We do this to ensure that only the oom_score_adj of user containers is used to calculated the sandbox oom_score_adj. Implemented in runsc rather than the containerd shim as it's a bit cleaner to implement here (in the shim it would require overwriting the oomScoreAdj and re-writing out the config.json again). This processing is Kubernetes(CRI) specific but we are currently only supporting CRI for multi-container support anyway. PiperOrigin-RevId: 267507706	2019-09-05 19:21:25 -07:00
Fabricio Voznika	b461be88a8	Stops container if gofer is killed Each gofer now has a goroutine that polls on the FDs used to communicate with the sandbox. The respective gofer is destroyed if any of the FDs is closed. Closes #601 PiperOrigin-RevId: 261383725	2019-08-02 13:47:55 -07:00
Michael Pratt	5b41ba5d0e	Fix various spelling issues in the documentation Addresses obvious typos, in the documentation only. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65 PiperOrigin-RevId: 255477779	2019-06-27 14:25:50 -07:00
Adin Scannell	add40fd6ad	Update canonical repository. This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620	2019-06-13 16:50:15 -07:00
Fabricio Voznika	356d1be140	Allow 'runsc do' to run without root '--rootless' flag lets a non-root user execute 'runsc do'. The drawback is that the sandbox and gofer processes will run as root inside a user namespace that is mapped to the caller's user, intead of nobody. And network is defaulted to '--network=host' inside the root network namespace. On the bright side, it's very convenient for testing: runsc --rootless do ls runsc --rootless do curl www.google.com PiperOrigin-RevId: 252840970	2019-06-12 09:41:50 -07:00
Fabricio Voznika	fc746efa9a	Add support to mount pod shared tmpfs mounts Parse annotations containing 'gvisor.dev/spec/mount' that gives hints about how mounts are shared between containers inside a pod. This information can be used to better inform how to mount these volumes inside gVisor. For example, a volume that is shared between containers inside a pod can be bind mounted inside the sandbox, instead of being two independent mounts. For now, this information is used to allow the same tmpfs mounts to be shared between containers which wasn't possible before. PiperOrigin-RevId: 252704037	2019-06-11 14:54:31 -07:00
Liu Hua	fc9f7e3590	tiny fix: avoid panicing when OpenSpec failed Signed-off-by: Liu Hua <sdu.liu@huawei.com> Change-Id: I11a4620394a10a7d92036b0341e0c21ad50bd122 PiperOrigin-RevId: 248621810	2019-05-16 16:20:42 -07:00
Michael Pratt	4d52a55201	Change copyright notice to "The gVisor Authors" Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" \| xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9	2019-04-29 14:26:23 -07:00
Nicolas Lacasse	f4ce43e1f4	Allow and document bug ids in gVisor codebase. PiperOrigin-RevId: 245818639 Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789	2019-04-29 14:04:14 -07:00
Kevin Krakauer	43dff57b87	Make raw sockets a toggleable feature disabled by default. PiperOrigin-RevId: 245511019 Change-Id: Ia9562a301b46458988a6a1f0bbd5f07cbfcb0615	2019-04-26 16:51:46 -07:00
Andrei Vagin	88409e983c	gvisor: Add support for the MS_NOEXEC mount option https://github.com/google/gvisor/issues/145 PiperOrigin-RevId: 242044115 Change-Id: I8f140fe05e32ecd438b6be218e224e4b7fe05878	2019-04-04 17:43:53 -07:00
Adin Scannell	7543e9ec20	Add release hook and version flag PiperOrigin-RevId: 241421671 Change-Id: Ic0cebfe3efd458dc42c49f7f812c13318705199a	2019-04-01 16:18:43 -07:00
Fabricio Voznika	c7877b0a14	Fail in case mount option is unknown PiperOrigin-RevId: 239425816 Change-Id: I3b1479c61b4222c3931a416c4efc909157044330	2019-03-20 10:36:20 -07:00
Fabricio Voznika	e420cc3e5d	Add support for mount propagation Properly handle propagation options for root and mounts. Now usage of mount options shared, rshared, and noexec cause error to start. shared/ rshared breaks sandbox=>host isolation. slave however can be supported because changes propagate from host to sandbox. Root FS setup moved inside the gofer. Apart from simplifying the code, it keeps all mounts inside the namespace. And they are torn down when the namespace is destroyed (DestroyFS is no longer needed). PiperOrigin-RevId: 239037661 Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0	2019-03-18 12:30:43 -07:00
Michael Pratt	2a0c69b19f	Remove license comments Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses($.$)./licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0	2019-01-31 11:12:53 -08:00
Andrei Vagin	5f08f8fd81	Don't bind-mount runsc into a sandbox mntns PiperOrigin-RevId: 230437407 Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b	2019-01-22 16:46:42 -08:00
Fabricio Voznika	0d7023d581	Restore to original cgroup after sandbox and gofer processes are created The original code assumed that it was safe to join and not restore cgroup, but Container.Run will not exit after calling start, making cgroup cleanup fail because there were still processes inside the cgroup. PiperOrigin-RevId: 228529199 Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51	2019-01-09 09:18:15 -08:00
Brian Geffon	d3bc79bc84	Open source system call tests. PiperOrigin-RevId: 224886231 Change-Id: I0fccb4d994601739d8b16b1d4e6b31f40297fb22	2018-12-10 14:42:34 -08:00
Nicolas Lacasse	845836c578	Internal change. PiperOrigin-RevId: 221848471 Change-Id: I882fbe5ce7737048b2e1f668848e9c14ed355665	2018-11-20 14:03:11 -08:00
Nicolas Lacasse	7b938ef78c	Internal change. PiperOrigin-RevId: 221462069 Change-Id: Id469ed21fe12e582c78340189b932989afa13c67	2018-11-14 09:58:43 -08:00
Nicolas Lacasse	7f558eda44	Internal change. PiperOrigin-RevId: 221343421 Change-Id: I418b5204c5ed4fe1e0af25ef36ee66b9b571928e	2018-11-13 15:17:19 -08:00
Adin Scannell	75cd70ecc9	Track paths and provide a rename hook. This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55	2018-10-23 00:20:15 -07:00
Ian Gudger	8fce67af24	Use correct company name in copyright header PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035	2018-10-19 16:35:11 -07:00
Fabricio Voznika	f3ffa4db52	Resolve mount paths while setting up root fs mount It's hard to resolve symlinks inside the sandbox because rootfs and mounts may be read-only, forcing us to create mount points inside lower layer of an overlay, before the volumes are mounted. Since the destination must already be resolved outside the sandbox when creating mounts, take this opportunity to rewrite the spec with paths resolved. "runsc boot" will use the "resolved" spec to load mounts. In addition, symlink traversals were disabled while mounting containers inside the sandbox. It haven't been able to write a good test for it. So I'm relying on manual tests for now. PiperOrigin-RevId: 217749904 Change-Id: I7ac434d5befd230db1488446cda03300cc0751a9	2018-10-18 12:42:24 -07:00
Fabricio Voznika	e68d86e1bd	Make debug log file name configurable This is a breaking change if you're using --debug-log-dir. The fix is to replace it with --debug-log and add a '/' at the end: --debug-log-dir=/tmp/runsc ==> --debug-log=/tmp/runsc/ PiperOrigin-RevId: 216761212 Change-Id: I244270a0a522298c48115719fa08dad55e34ade1	2018-10-11 14:29:37 -07:00
Fabricio Voznika	29cd05a7c6	Add sandbox to cgroup Sandbox creation uses the limits and reservations configured in the OCI spec and set cgroup options accordinly. Then it puts both the sandbox and gofer processes inside the cgroup. It also allows the cgroup to be pre-configured by the caller. If the cgroup already exists, sandbox and gofer processes will join the cgroup but it will not modify the cgroup with spec limits. PiperOrigin-RevId: 216538209 Change-Id: If2c65ffedf55820baab743a0edcfb091b89c1019	2018-10-10 09:00:42 -07:00
Fabricio Voznika	cf226d48ce	Switch to root in userns when CAP_SYS_CHROOT is also missing Some tests check current capabilities and re-run the tests as root inside userns if required capabibilities are missing. It was checking for CAP_SYS_ADMIN only, CAP_SYS_CHROOT is also required now. PiperOrigin-RevId: 214949226 Change-Id: Ic81363969fa76c04da408fae8ea7520653266312	2018-09-28 09:44:13 -07:00
Fabricio Voznika	e395273301	Fix sandbox and gofer capabilities Capabilities.Set() adds capabilities, but doesn't remove existing ones that might have been loaded. Fixed the code and added tests. PiperOrigin-RevId: 213726369 Change-Id: Id7fa6fce53abf26c29b13b9157bb4c6616986fba	2018-09-19 17:15:14 -07:00
Lingfu	f0a92b6b67	Add docker command line args support for --cpuset-cpus and --cpus `docker run --cpuset-cpus=/--cpus=` will generate cpu resource info in config.json (runtime spec file). When nginx worker_connections is configured as auto, the worker is generated according to the number of CPUs. If the cgroup is already set on the host, but it is not displayed correctly in the sandbox, performance may be degraded. This patch can get cpus info from spec file and apply to sentry on bootup, so the /proc/cpuinfo can show the correct cpu numbers. `lscpu` and other commands rely on `/sys/devices/system/cpu/online` are also affected by this patch. e.g. --cpuset-cpus=2,3 -> cpu number:2 --cpuset-cpus=4-7 -> cpu number:4 --cpus=2.8 -> cpu number:3 --cpus=0.5 -> cpu number:1 Change-Id: Ideb22e125758d4322a12be7c51795f8018e3d316 PiperOrigin-RevId: 213685199	2018-09-19 13:35:42 -07:00
Nicolas Lacasse	9751b800a6	runsc: Support multi-container exec. We must use a context.Context with a Root Dirent that corresponds to the container's chroot. Previously we were using the root context, which does not have a chroot. Getting the correct context required refactoring some of the path-lookup code. We can't lookup the path without a context.Context, which requires kernel.CreateProcArgs, which we only get inside control.Execute. So we have to do the path lookup much later than we previously were. PiperOrigin-RevId: 212064734 Change-Id: I84a5cfadacb21fd9c3ab9c393f7e308a40b9b537	2018-09-07 17:39:54 -07:00
Nicolas Lacasse	210c252089	runsc: Run sandbox process inside minimal chroot. We construct a dir with the executable bind-mounted at /exe, and proc mounted at /proc. Runsc now executes the sandbox process inside this chroot, thus limiting access to the host filesystem. The mounts and chroot dir are removed when the sandbox is destroyed. Because this requires bind-mounts, we can only do the chroot if we have CAP_SYS_ADMIN. PiperOrigin-RevId: 211994001 Change-Id: Ia71c515e26085e0b69b833e71691830148bc70d1	2018-09-07 10:16:39 -07:00
Nicolas Lacasse	f96b33c73c	runsc: Promote getExecutablePathInternal to getExecutablePath. Remove GetExecutablePath (the non-internal version). This makes path handling more consistent between exec, root, and child containers. The new getExecutablePath now uses MountNamespace.FindInode, which is more robust than Walking the Dirent tree ourselves. This also removes the last use of lstat(2) in the sentry, so that can be removed from the filters. PiperOrigin-RevId: 211683110 Change-Id: Ic8ec960fc1c267aa7d310b8efe6e900c88a9207a	2018-09-05 13:01:21 -07:00

1 2

69 Commits