gvisor

Commit Graph

Author	SHA1	Message	Date
Nicolas Lacasse	d489336784	runsc: All non-root bind mounts should be shared. This CL changes the semantics of the "--file-access" flag so that it only affects the root filesystem. The default remains "exclusive" which is the common use case, as neither Docker nor K8s supports sharing the root. Keeping the root fs as "exclusive" means that the fs-intensive work done during application startup will mostly be cacheable, and thus faster. Non-root bind mounts will always be shared. This CL also removes some redundant FSAccessType validations. We validate this flag in main(), so we can assume it is valid afterwards. PiperOrigin-RevId: 214359936 Change-Id: I7e75d7bf52dbd7fa834d0aacd4034868314f3b51	2018-09-24 17:22:15 -07:00
Ian Gudger	7ce13ebcad	Run gofmt -s on everything PiperOrigin-RevId: 214040901 Change-Id: I74d79497a053da3624921ad2b7c5193ca4a87942	2018-09-21 14:06:59 -07:00
Nicolas Lacasse	d260e808f4	The "action" in container.Signal should be "signal". PiperOrigin-RevId: 214038776 Change-Id: I4ad212540ec4ef4fb5ab5fdcb7f0865c4f746895	2018-09-21 13:54:35 -07:00
Nicolas Lacasse	b4321f4447	runsc: Synchronize container metadata changes with a file lock. Each container has associated metadata (particularly the container status) that is manipulated by various runsc commands. This metadata is stored in a file identified by the container id. Different runsc processes may manipulate the same container metadata, and each will read/write to the metadata file. This CL adds a file lock per container which must be held when reading the container metadata file, and when modifying and writing the container metadata. PiperOrigin-RevId: 214019179 Change-Id: Ice4390ad233bc7f216c9a9a6cf05fb456c9ec0ad	2018-09-21 11:42:06 -07:00
Fabricio Voznika	b63c4bfe02	Set Sandbox.Chroot so it gets cleaned up upon destruction I've made several attempts to create a test, but the lack of permission from the test user makes it nearly impossible to test anything useful. PiperOrigin-RevId: 213922174 Change-Id: I5b502ca70cb7a6645f8836f028fb203354b4c625	2018-09-20 18:54:09 -07:00
Lantao Liu	8a938a3f9d	runsc: allow `runsc wait` on a container multiple times. PiperOrigin-RevId: 213908919 Change-Id: I74eff99a5360bb03511b946f4cb5658bb5fc40c7	2018-09-20 16:59:42 -07:00
Nicolas Lacasse	cbaec4d614	Wait for all async fs operations to complete before returning from Destroy. Destroy flushes dirent references, which triggers many async close operations. We must wait for those to finish before returning from Destroy, otherwise we may kill the gofer, causing a cascade of failing RPCs and leading to an inconsistent FS state. PiperOrigin-RevId: 213884637 Change-Id: Id054b47fc0f97adc5e596d747c08d3b97a1d1f71	2018-09-20 14:37:53 -07:00
Lantao Liu	9464b82a06	runsc: Fix a bug that `runsc wait` doesn't work after container exits. PiperOrigin-RevId: 213849165 Change-Id: I5120b2f568850c0c42a08e8706e7f8653ef1bd94	2018-09-20 11:23:26 -07:00
Kevin Krakauer	ffb5fdd690	runsc: Fix stdin/stdout/stderr in multi-container mode. The issue with the previous change was that the stdin/stdout/stderr passed to the sentry were dup'd by host.ImportFile. This left a dangling FD that was never closed, which caused containerd to time out waiting on container stop. PiperOrigin-RevId: 213753032 Change-Id: Ia5e4c0565c42c8610d3b59f65599a5643b0901e4	2018-09-19 22:20:41 -07:00
Nicolas Lacasse	915d76aa92	Add container.Destroy urpc method. This method will: 1. Stop the container process if it is still running. 2. Unmount all sandbox-internal mounts for the container. 3. Delete the container root directory inside the sandbox. Destroy is idempotent, and safe to call concurrently. This fixes a bug where after stopping a container, we cannot unmount the container root directory on the host. This bug occurred because the sandbox dirent cache was holding a dirent with a host fd corresponding to a file inside the container root on the host. The dirent cache did not know that the container had exited, and kept the FD open, preventing us from unmounting on the host. Now that we unmount (and flush) all container mounts inside the sandbox, any host FDs donated by the gofer will be closed, and we can unmount the container root on the host. PiperOrigin-RevId: 213737693 Change-Id: I28c0ff4cd19a08014cdd72fec5154497e92aacc9	2018-09-19 18:54:14 -07:00
Kevin Krakauer	639226c3d9	runsc: Mark container_test flaky. PiperOrigin-RevId: 213732520 Change-Id: Ife292987ec8b1de4c2e7e3b7d4452b00c1582e91	2018-09-19 18:03:35 -07:00
Fabricio Voznika	e395273301	Fix sandbox and gofer capabilities Capabilities.Set() adds capabilities, but doesn't remove existing ones that might have been loaded. Fixed the code and added tests. PiperOrigin-RevId: 213726369 Change-Id: Id7fa6fce53abf26c29b13b9157bb4c6616986fba	2018-09-19 17:15:14 -07:00
Nicolas Lacasse	2ad3228cd0	runsc: Don't create __runsc_containers__ unless we are in multi-container mode. PiperOrigin-RevId: 213715511 Change-Id: I3e41b583c6138edbdeba036dfb9df4864134fc12	2018-09-19 16:10:47 -07:00
Lingfu	f0a92b6b67	Add docker command line args support for --cpuset-cpus and --cpus `docker run --cpuset-cpus=/--cpus=` will generate cpu resource info in config.json (runtime spec file). When nginx worker_connections is configured as auto, the worker is generated according to the number of CPUs. If the cgroup is already set on the host, but it is not displayed correctly in the sandbox, performance may be degraded. This patch can get cpus info from spec file and apply to sentry on bootup, so the /proc/cpuinfo can show the correct cpu numbers. `lscpu` and other commands that rely on `/sys/devices/system/cpu/online` are also affected by this patch. e.g. --cpuset-cpus=2,3 -> cpu number:2 --cpuset-cpus=4-7 -> cpu number:4 --cpus=2.8 -> cpu number:3 --cpus=0.5 -> cpu number:1 Change-Id: Ideb22e125758d4322a12be7c51795f8018e3d316 PiperOrigin-RevId: 213685199	2018-09-19 13:35:42 -07:00
Fabricio Voznika	8aec7473a1	Added state machine checks for Container.Status For my own sanity when thinking about possible transitions and state. PiperOrigin-RevId: 213559482 Change-Id: I25588c86cf6098be4eda01f4e7321c102ceef33c	2018-09-18 19:12:54 -07:00
Fabricio Voznika	7967d8ecd5	Handle children processes better in tests Reap children more systematically in container tests. Previously, container_test was taking ~5 mins to run because container.Destroy() would timeout waiting for the sandbox process to exit. Now the test runs in less than a minute. Also made the contract around Container and Sandbox destroy clearer. PiperOrigin-RevId: 213527471 Change-Id: Icca84ee1212bbdcb62bdfc9cc7b71b12c6d1688d	2018-09-18 15:21:28 -07:00
Kevin Krakauer	7e00f37054	Automated rollback of changelist 213307171 PiperOrigin-RevId: 213504354 Change-Id: Iadd42f0ca4b7e7a9eae780bee9900c7233fb4f3f	2018-09-18 13:22:26 -07:00
Fabricio Voznika	5d9816be41	Remove memory usage static init panic() during init() can be hard to debug. Updates #100 PiperOrigin-RevId: 213391932 Change-Id: Ic103f1981c5b48f1e12da3b42e696e84ffac02a9	2018-09-17 21:34:37 -07:00
Fabricio Voznika	26b08e182c	Rename container in test 's' used to stand for sandbox, before container exited. PiperOrigin-RevId: 213390641 Change-Id: I7bda94a50398c46721baa92227e32a7a1d817412	2018-09-17 21:18:27 -07:00
Kevin Krakauer	bb88c187c5	runsc: Enable waiting on exited processes. This makes `runsc wait` behave more like waitpid()/wait4() in that: - Once a process has run to completion, you can wait on it and get its exit code. - Processes not waited on will consume memory (like a zombie process) PiperOrigin-RevId: 213358916 Change-Id: I5b5eca41ce71eea68e447380df8c38361a4d1558	2018-09-17 16:25:24 -07:00
Kevin Krakauer	25add7b22b	runsc: Fix stdin/out/err in multi-container mode. Stdin/out/err weren't being sent to the sentry. PiperOrigin-RevId: 213307171 Change-Id: Ie4b634a58b1b69aa934ce8597e5cc7a47a2bcda2	2018-09-17 11:31:28 -07:00
Lantao Liu	bde2a91433	runsc: Support container signal/wait. This CL: 1) Fix `runsc wait`, it now also works after the container exits; 2) Generate correct container state in Load; 3) Make sure `Destroy` cleans up everything before returning successfully. PiperOrigin-RevId: 212900107 Change-Id: Ie129cbb9d74f8151a18364f1fc0b2603eac4109a	2018-09-13 16:38:03 -07:00
Kevin Krakauer	2eff1fdd06	runsc: Add exec flag that specifies where to save the sandbox-internal pid. This is different from the existing -pid-file flag, which saves a host pid. PiperOrigin-RevId: 212713968 Change-Id: I2c486de8dd5cfd9b923fb0970165ef7c5fc597f0	2018-09-12 15:23:35 -07:00
Michael Pratt	0efde2bfbd	Remove getdents from filters It was only used by whitelistfs, which was removed in `bc81f3fe4a`. PiperOrigin-RevId: 212666374 Change-Id: Ia35e6dc9d68c1a3b015d5b5f71ea3e68e46c5bed	2018-09-12 10:51:25 -07:00
Michael Pratt	b4aed01bf2	Rollback of changelist 212483372 PiperOrigin-RevId: 212557844 Change-Id: I414de848e75d57ecee2c05e851d05b607db4aa57	2018-09-11 17:54:50 -07:00
Nicolas Lacasse	6cc9b311af	platform: Pass device fd into platform constructor. We were previously opening the platform device (i.e. /dev/kvm) inside the platform constructor (i.e. kvm.New). This requires that we have RW access to the platform device when constructing the platform. However, now that the runsc sandbox process runs as user "nobody", it is not able to open the platform device. This CL changes the kvm constructor to take the platform device FD, rather than opening the device file itself. The device file is opened outside of the sandbox and passed to the sandbox process. PiperOrigin-RevId: 212505804 Change-Id: I427e1d9de5eb84c84f19d513356e1bb148a52910	2018-09-11 13:09:46 -07:00
Fabricio Voznika	c44bc6612f	Allow fstatat back in syscall filters PiperOrigin-RevId: 212483372 Change-Id: If95f32a8e41126cf3dc8bd6c8b2fb0fcfefedc6d	2018-09-11 11:05:09 -07:00
Nicolas Lacasse	e198f9ab02	runsc: Chmod all mounted files to 777 inside chroot. Inside the chroot, we run as user nobody, so all mounted files and directories must be accessible to all users. PiperOrigin-RevId: 212284805 Change-Id: I705e0dbbf15e01e04e0c7f378a99daffe6866807	2018-09-10 10:00:16 -07:00
Nicolas Lacasse	0c0c942327	Automated rollback of changelist 212059579 PiperOrigin-RevId: 212069131 Change-Id: I01476f957bbf29d4ee5a3c11d59d4f863ba9f2df	2018-09-07 18:23:27 -07:00
Nicolas Lacasse	922d8c3c8c	Automated rollback of changelist 211992321 PiperOrigin-RevId: 212066419 Change-Id: Icded56e7e117bfd9b644e6541bddcd110460a9b8	2018-09-07 17:56:07 -07:00
Nicolas Lacasse	9751b800a6	runsc: Support multi-container exec. We must use a context.Context with a Root Dirent that corresponds to the container's chroot. Previously we were using the root context, which does not have a chroot. Getting the correct context required refactoring some of the path-lookup code. We can't lookup the path without a context.Context, which requires kernel.CreateProcArgs, which we only get inside control.Execute. So we have to do the path lookup much later than we previously were. PiperOrigin-RevId: 212064734 Change-Id: I84a5cfadacb21fd9c3ab9c393f7e308a40b9b537	2018-09-07 17:39:54 -07:00
Fabricio Voznika	cf5006ff24	Disable test until we figure out what's broken PiperOrigin-RevId: 212059579 Change-Id: I052c2192d3483d7bd0fd2232ef2023a12da66446	2018-09-07 17:00:41 -07:00
Adin Scannell	6cfb5cd56d	Add additional sanity checks for walk. PiperOrigin-RevId: 212058684 Change-Id: I319709b9ffcfccb3231bac98df345d2a20eca24b	2018-09-07 16:53:12 -07:00
Fabricio Voznika	8ce3fbf9f8	Only start signal forwarding after init process is created PiperOrigin-RevId: 212028121 Change-Id: If9c2c62f3be103e2bb556b8d154c169888e34369	2018-09-07 13:39:12 -07:00
Fabricio Voznika	bc81f3fe4a	Remove '--file-access=direct' option It was used before gofer was implemented and it's not supported anymore. BREAKING CHANGE: proxy-shared and proxy-exclusive options are now: shared and exclusive. PiperOrigin-RevId: 212017643 Change-Id: If029d4073fe60583e5ca25f98abb2953de0d78fd	2018-09-07 12:28:48 -07:00
Fabricio Voznika	f895cb4d8b	Use root abstract socket namespace for exec PiperOrigin-RevId: 211999211 Change-Id: I5968dd1a8313d3e49bb6e6614e130107495de41d	2018-09-07 10:45:55 -07:00
Nicolas Lacasse	210c252089	runsc: Run sandbox process inside minimal chroot. We construct a dir with the executable bind-mounted at /exe, and proc mounted at /proc. Runsc now executes the sandbox process inside this chroot, thus limiting access to the host filesystem. The mounts and chroot dir are removed when the sandbox is destroyed. Because this requires bind-mounts, we can only do the chroot if we have CAP_SYS_ADMIN. PiperOrigin-RevId: 211994001 Change-Id: Ia71c515e26085e0b69b833e71691830148bc70d1	2018-09-07 10:16:39 -07:00
Nicolas Lacasse	590d832099	runsc: Dup debug log file to stderr, so sentry panics don't get lost. Docker and containerd do not expose runsc's stderr, so tracking down sentry panics can be painful. If we have a debug log file, we should send panics (and all stderr data) to the log file. PiperOrigin-RevId: 211992321 Change-Id: I5f0d2f45f35c110a38dab86bafc695aaba42f7a3	2018-09-07 10:05:21 -07:00
Lantao Liu	4f3053cb4e	runsc: do not delete in paused state. PiperOrigin-RevId: 211835570 Change-Id: Ied7933732cad5bc60b762e9c964986cb49a8d9b9	2018-09-06 11:06:19 -07:00
Fabricio Voznika	efac28976c	Enable network for multi-container PiperOrigin-RevId: 211834411 Change-Id: I52311a6c5407f984e5069359d9444027084e4d2a	2018-09-06 11:00:08 -07:00
Kevin Krakauer	d95663a6b9	runsc testing: Move TestMultiContainerSignal to multi_container_test. PiperOrigin-RevId: 211831396 Change-Id: Id67f182cb43dccb696180ec967f5b96176f252e0	2018-09-06 10:41:55 -07:00
Kevin Krakauer	8f0b6e7fc0	runsc: Support runsc kill multi-container. Now, we can kill individual containers rather than the entire sandbox. PiperOrigin-RevId: 211748106 Change-Id: Ic97e91db33d53782f838338c4a6d0aab7a313ead	2018-09-05 21:14:56 -07:00
Fabricio Voznika	5f0002fc83	Use container's capabilities in exec When no capabilities are specified in exec, use the container's capabilities to match runc's behavior. PiperOrigin-RevId: 211735186 Change-Id: Icd372ed64410c81144eae94f432dffc9fe3a86ce	2018-09-05 18:32:50 -07:00
Fabricio Voznika	12aef686af	Enabled bind mounts in sub-containers With multi-gofers, bind mounts in sub-containers should just work. Removed restrictions and added test. There are also a few cleanups along the way, e.g. retry unmounting in case cleanup races with gofer teardown. PiperOrigin-RevId: 211699569 Change-Id: Ic0a69c29d7c31cd7e038909cc686c6ac98703374	2018-09-05 14:30:09 -07:00
Fabricio Voznika	0c7cfca0da	Running container should have a valid sandbox PiperOrigin-RevId: 211693868 Change-Id: Iea340dd78bf26ae6409c310b63c17cc611c2055f	2018-09-05 14:02:45 -07:00
Fabricio Voznika	4b57fd920d	Add MADVISE to fsgofer seccomp profile PiperOrigin-RevId: 211686037 Change-Id: I0e776ca760b65ba100e495f471b6e811dbd6590a	2018-09-05 13:18:06 -07:00
Fabricio Voznika	1d22d87fdc	Move multi-container test to a single file PiperOrigin-RevId: 211685288 Change-Id: I7872f2a83fcaaa54f385e6e567af6e72320c5aa0	2018-09-05 13:13:26 -07:00
Nicolas Lacasse	f96b33c73c	runsc: Promote getExecutablePathInternal to getExecutablePath. Remove GetExecutablePath (the non-internal version). This makes path handling more consistent between exec, root, and child containers. The new getExecutablePath now uses MountNamespace.FindInode, which is more robust than Walking the Dirent tree ourselves. This also removes the last use of lstat(2) in the sentry, so that can be removed from the filters. PiperOrigin-RevId: 211683110 Change-Id: Ic8ec960fc1c267aa7d310b8efe6e900c88a9207a	2018-09-05 13:01:21 -07:00
Nicolas Lacasse	0a9a40abcd	runsc: Run sandbox as user nobody. When starting a sandbox without direct file or network access, we create an empty user namespace and run the sandbox in there. However, the root user in that namespace is still mapped to the root user in the parent namespace. This CL maps the "nobody" user from the parent namespace into the child namespace, and runs the sandbox process as user "nobody" inside the new namespace. PiperOrigin-RevId: 211572223 Change-Id: I1b1f9b1a86c0b4e7e5ca7bc93be7d4887678bab6	2018-09-04 20:33:05 -07:00
Nicolas Lacasse	ad8648c634	runsc: Pass log and config files to sandbox process by FD. This is a prereq for running the sandbox process as user "nobody", when it may not have permissions to open these files. Instead, we must open them before starting the sandbox process, and pass them by FD. The specutils.ReadSpecFromFile method was fixed to always seek to the beginning of the file before reading. This allows Files from the same FD to be read multiple times, as we do in the boot command when the apply-caps flag is set. Tested with --network=host. PiperOrigin-RevId: 211570647 Change-Id: I685be0a290aa7f70731ebdce82ebc0ebcc9d475c	2018-09-04 20:10:01 -07:00
Lantao Liu	9ae4e28f75	runsc: fix container rootfs path. PiperOrigin-RevId: 211515350 Change-Id: Ia495af57447c799909aa97bb873a50b87bee2625	2018-09-04 13:37:40 -07:00
Michael Pratt	ab7174611c	Remove epoll_wait from filters Go 1.11 replaced it with epoll_pwait. PiperOrigin-RevId: 211510006 Change-Id: I48a6cae95ed3d57a4633895358ad05ad8bf2f633	2018-09-04 13:10:09 -07:00
Fabricio Voznika	66c03b3dd7	Mounting over '/tmp' may fail PiperOrigin-RevId: 211160120 Change-Id: Ie5f280bdac17afd01cb16562ffff6222b3184c34	2018-08-31 16:12:08 -07:00
Fabricio Voznika	7713e2cb75	Remove not used deps PiperOrigin-RevId: 211147521 Change-Id: I9b8b67df50a3ba084c07a48c72a874d7e2007f23	2018-08-31 14:47:46 -07:00
Fabricio Voznika	7e18f158b2	Automated rollback of changelist 210995199 PiperOrigin-RevId: 211116429 Change-Id: I446d149c822177dc9fc3c64ce5e455f7f029aa82	2018-08-31 11:30:47 -07:00
Lantao Liu	be9f454eb6	runsc: Set volume mount rslave. PiperOrigin-RevId: 211111376 Change-Id: I27b8cb4e070d476fa4781ed6ecfa0cf1dcaf85f5	2018-08-31 11:03:22 -07:00
Michael Pratt	08bfb5643c	Add other missing dep runsc and runsc-race need the same deps. PiperOrigin-RevId: 211103766 Change-Id: Ib0c97078a469656c1e5b019648589a1d07915625	2018-08-31 10:22:09 -07:00
Fabricio Voznika	e669697241	Fix RunAsRoot arguments forwarding It was including the path to the executable twice in the arguments. PiperOrigin-RevId: 211098311 Change-Id: I5357c51c63f38dfab551b17bb0e04011a0575010	2018-08-31 09:45:32 -07:00
Tamir Duberstein	3f04bd68b2	Add missing import GoCompile: missing strict dependencies: /tmpfs/tmp/bazel/sandbox/linux-sandbox/1744/execroot/__main__/runsc/main.go: import of "gvisor.googlesource.com/gvisor/runsc/specutils" This was broken in 210995199. PiperOrigin-RevId: 211086595 Change-Id: I166b9a2ed8e4d6e624def944b720190940d7537c	2018-08-31 08:07:52 -07:00
Fabricio Voznika	3e493adf7a	Add seccomp filter to fsgofer PiperOrigin-RevId: 211011542 Change-Id: Ib5a83a00f8eb6401603c6fb5b59afc93bac52558	2018-08-30 17:30:19 -07:00
Nicolas Lacasse	5ade9350ad	runsc: Pass log and config files to sandbox process by FD. This is a prereq for running the sandbox process as user "nobody", when it may not have permissions to open these files. Instead, we must open them before starting the sandbox process, and pass them by FD. PiperOrigin-RevId: 210995199 Change-Id: I715875a9553290b4a49394a8fcd93be78b1933dd	2018-08-30 15:47:18 -07:00
Fabricio Voznika	30c025f3ef	Add argument checks to seccomp This is required to increase protection when running in GKE. PiperOrigin-RevId: 210635123 Change-Id: Iaaa8be49e73f7a3a90805313885e75894416f0b5	2018-08-28 17:10:03 -07:00
Michael Pratt	ea113a4380	Drop support for Go 1.10 PiperOrigin-RevId: 210589588 Change-Id: Iba898bc3eb8f13e17c668ceea6dc820fc8180a70	2018-08-28 12:56:28 -07:00
Lantao Liu	d8f0db9bcf	runsc: unmount volume mounts when destroy container. PiperOrigin-RevId: 210579178 Change-Id: Iae20639c5186b1a976cbff6d05bda134cd00d0da	2018-08-28 11:54:07 -07:00
Fabricio Voznika	f7366e4e64	Consolidate image tests into a single file This is to keep it consistent with other tests, and it's easier to maintain them in a single file. Also increase python test timeout to deflake it. PiperOrigin-RevId: 210575042 Change-Id: I2ef5bcd5d97c08549f0c5f645c4b694253ef0b4d	2018-08-28 11:31:04 -07:00
Fabricio Voznika	ae648bafda	Add command-line parameter to trigger panic on signal This is to troubleshoot problems with a hung process that is not responding to 'runsc debug --stack' command. PiperOrigin-RevId: 210483513 Change-Id: I4377b210b4e51bc8a281ad34fd94f3df13d9187d	2018-08-27 20:36:10 -07:00
Kevin Krakauer	a4529c1b5b	runsc: Fix readonly filesystem causing failure to create containers. For readonly filesystems specified via relative path, we were forgetting to mount relative to the container's bundle directory. PiperOrigin-RevId: 210483388 Change-Id: I84809fce4b1f2056d0e225547cb611add5f74177	2018-08-27 20:34:27 -07:00
Nicolas Lacasse	0b3bfe2ea3	fs: Fix remote-revalidate cache policy. When revalidating a Dirent, if the inode id is the same, then we don't need to throw away the entire Dirent. We can just update the unstable attributes in place. If the inode id has changed, then the remote file has been deleted or moved, and we have no choice but to throw away the dirent we have and look up another. In this case, we may still end up losing a mounted dirent that is a child of the revalidated dirent. However, that seems appropriate here because the entire mount point has been pulled out from underneath us. Because gVisor's overlay is at the Inode level rather than the Dirent level, we must pass the parent Inode and name along with the Inode that is being revalidated. PiperOrigin-RevId: 210431270 Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42	2018-08-27 14:26:29 -07:00
Nicolas Lacasse	5999767d53	runsc: fsgofer should return a unique QID.Path for each file. Previously, we were only using the host inode id as the QID path. But the host filesystem can have multiple devices with conflicting inode ids. This resulted in duplicate inode ids in the sentry. This CL generates a unique QID for each <host inode, host device> pair. PiperOrigin-RevId: 210424813 Change-Id: I16d106f61c7c8f910c0da4ceec562a010ffca2fb	2018-08-27 13:52:14 -07:00
Adin Scannell	b9ded9bf39	Add runsc-race target. PiperOrigin-RevId: 210422178 Change-Id: I984dd348d467908bc3180a20fc79b8387fcca05e	2018-08-27 13:37:03 -07:00
Fabricio Voznika	db81c0b02f	Put fsgofer inside chroot Now each container gets its own dedicated gofer that is chroot'd to the rootfs path. This is done to add an extra layer of security in case the gofer gets compromised. PiperOrigin-RevId: 210396476 Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0	2018-08-27 11:10:14 -07:00
Nicolas Lacasse	106de2182d	runsc: Terminal support for "docker exec -ti". This CL adds terminal support for "docker exec". We previously only supported consoles for the container process, but not exec processes. The SYS_IOCTL syscall was added to the default seccomp filter list, but only for ioctls that get/set winsize and termios structs. We need to allow these ioctls for all containers because it's possible to run "exec -ti" on a container that was started without an attached console, after the filters have been installed. Note that control-character signals are still not properly supported. Tested with: $ docker run --runtime=runsc -it alpine In another terminal: $ docker exec -it <containerid> /bin/sh PiperOrigin-RevId: 210185456 Change-Id: I6d2401e53a7697bb988c120a8961505c335f96d9	2018-08-24 17:43:21 -07:00
Kevin Krakauer	02dfceab6d	runsc: Allow runsc to properly search the PATH for executable name. Previously, runsc improperly attempted to find an executable in the container's PATH. We now search the PATH via the container's fsgofer rather than the host FS, eliminating the confusing differences between paths on the host and within a container. PiperOrigin-RevId: 210159488 Change-Id: I228174dbebc4c5356599036d6efaa59f28ff28d2	2018-08-24 14:42:40 -07:00
Fabricio Voznika	a81a4402a2	Add option to panic gofer if writes are attempted over RO mounts This is used when '--overlay=true' to guarantee writes are not sent to gofer. PiperOrigin-RevId: 210116288 Change-Id: I7616008c4c0e8d3668e07a205207f46e2144bf30	2018-08-24 10:17:42 -07:00
Fabricio Voznika	001a4c2493	Clean up syscall filters Removed syscalls that are only used by whitelistfs which has its own set of filters. PiperOrigin-RevId: 209967259 Change-Id: Idb2e1b9d0201043d7cd25d96894f354729dbd089	2018-08-23 11:15:07 -07:00
Kevin Krakauer	a78df1d874	runsc: De-flakes container_test TestMultiContainerSanity. The bug was caused by os.File's finalizer, which closes the file. Because fsgofer.serve() was passed a file descriptor as an int rather than a os.File, callers would pass os.File.Fd(), and the os.File would go out of scope. Thus, the file would get GC'd and finalized nondeterministically, causing failures when the file was used. PiperOrigin-RevId: 209861834 Change-Id: Idf24d5c1f04c9b28659e62c97202ab3b4d72e994	2018-08-22 17:55:15 -07:00
Fabricio Voznika	e2ab7ec39e	Fix TestUnixDomainSockets failure when path is too large UDS has a lower size limit than regular files. When running under bazel this limit is exceeded. Test was changed to always mount /tmp and use it for the test. PiperOrigin-RevId: 209717830 Change-Id: I1dbe19fe2051ffdddbaa32b188a9167f446ed193	2018-08-21 23:07:39 -07:00
Kevin Krakauer	ae68e9e751	Temporarily skip multi-container tests in container_test until deflaked. PiperOrigin-RevId: 209679235 Change-Id: I527e779eeb113d0c162f5e27a2841b9486f0e39f	2018-08-21 16:21:05 -07:00
Fabricio Voznika	19ef2ad1fe	nonExclusiveFS is causing timeout with --race Not sure why, just removed for now to unblock the tests. PiperOrigin-RevId: 209661403 Change-Id: I72785c071687d54e22bda9073d36b447d52a7018	2018-08-21 14:35:08 -07:00
Fabricio Voznika	a854678bc3	Move container_test to the container package PiperOrigin-RevId: 209655274 Change-Id: Id381114bdb3197c73e14f74b3f6cf1afd87d60cb	2018-08-21 14:02:19 -07:00
Fabricio Voznika	d6d165cb0b	Initial change for multi-gofer support PiperOrigin-RevId: 209647293 Change-Id: I980fca1257ea3fcce796388a049c353b0303a8a5	2018-08-21 13:14:43 -07:00
Fabricio Voznika	0fc7b30695	Standardize mounts in tests Tests get a readonly rootfs mapped to / (which was the case before) and writable TEST_TMPDIR. This makes it easier to setup containers to write to files and to share state between test and containers. PiperOrigin-RevId: 209453224 Change-Id: I4d988e45dc0909a0450a3bb882fe280cf9c24334	2018-08-20 11:26:39 -07:00
Fabricio Voznika	11800311a5	Add nonExclusiveFS dimension to more tests The ones using 'kvm' actually mean that they don't want overlay. PiperOrigin-RevId: 209194318 Change-Id: I941a443cb6d783e2c80cf66eb8d8630bcacdb574	2018-08-17 13:07:09 -07:00
Fabricio Voznika	da087e66cc	Combine functions to search for file under one common function Bazel adds the build type in front of directories making it hard to refer to binaries in code. PiperOrigin-RevId: 209010854 Change-Id: I6c9da1ac3bbe79766868a3b14222dd42d03b4ec5	2018-08-16 10:55:45 -07:00
Kevin Krakauer	635b0c4593	runsc fsgofer: Support dynamic serving of filesystems. When multiple containers run inside a sentry, each container has its own root filesystem and set of mounts. Containers are also added after sentry boot rather than all configured and known at boot time. The fsgofer needs to be able to serve the root filesystem of each container. Thus, it must be possible to add filesystems after the fsgofer has already started. This change: * Creates a URPC endpoint within the gofer process that listens for requests to serve new content. * Enables the sentry, when starting a new container, to add the new container's filesystem. * Mounts those new filesystems at separate roots within the sentry. PiperOrigin-RevId: 208903248 Change-Id: Ifa91ec9c8caf5f2f0a9eead83c4a57090ce92068	2018-08-15 16:25:22 -07:00
Nicolas Lacasse	2033f61aae	runsc: Fix instances of file access "proxy". This file access type is actually called "proxy-shared", but I forgot to update all locations. PiperOrigin-RevId: 208832491 Change-Id: I7848bc4ec2478f86cf2de1dcd1bfb5264c6276de	2018-08-15 09:34:18 -07:00
Nicolas Lacasse	e8a4f2e133	runsc: Change cache policy for root fs and volume mounts. Previously, gofer filesystems were configured with the default "fscache" policy, which caches filesystem metadata and contents aggressively. While this setting is best for performance, it means that changes from inside the sandbox may not be immediately propagated outside the sandbox, and vice-versa. This CL changes volumes and the root fs configuration to use a new "remote-revalidate" cache policy which tries to retain as much caching as possible while still making fs changes visible across the sandbox boundary. This cache policy is enabled by default for the root filesystem. The default value for the "--file-access" flag is still "proxy", but the behavior is changed to use the new cache policy. A new value for the "--file-access" flag is added, called "proxy-exclusive", which turns on the previous aggressive caching behavior. As the name implies, this flag should be used when the sandbox has "exclusive" access to the filesystem. All volume mounts are configured to use the new cache policy, since it is safest and most likely to be correct. There is not currently a way to change this behavior, but it's possible to add such a mechanism in the future. The configurability is a smaller issue for volumes, since most of the expensive application fs operations (walking + stating files) will likely be served by the root fs. PiperOrigin-RevId: 208735037 Change-Id: Ife048fab1948205f6665df8563434dbc6ca8cfc9	2018-08-14 16:25:58 -07:00
Nicolas Lacasse	36c940b093	Move checkpoint/restore readme to g3doc directory. PiperOrigin-RevId: 208282383 Change-Id: Ifa4aaf5d925b17d9a0672ea951a4570d35855300	2018-08-10 15:57:49 -07:00
Brielle Broder	f213a5e0fd	README for Checkpoint/Restore. PiperOrigin-RevId: 208274833 Change-Id: Iddda875a87205f7b8fa6f5c60b547522b94a6696	2018-08-10 15:08:26 -07:00
Brielle Broder	4ececd8e8d	Enable checkpoint/restore in cases of UDS use. Previously, processes which used file-system Unix Domain Sockets could not be checkpoint-ed in runsc because the sockets were saved with their inode numbers which do not necessarily remain the same upon restore. Now, the sockets are also saved with their paths so that the new inodes can be determined for the sockets based on these paths after restoring. Tests for cases with UDS use are included. Test cleanup to come. PiperOrigin-RevId: 208268781 Change-Id: Ieaa5d5d9a64914ca105cae199fd8492710b1d7ec	2018-08-10 14:33:20 -07:00
Fabricio Voznika	0ac912f99e	Fix runsc integration_test when using --network=host inethost doesn't support netlink and 'ifconfig' call to retrieve IP address fails. Look up IP address in /etc/hosts instead. PiperOrigin-RevId: 208135641 Change-Id: I3c2ce15db6fc7c3306a45e4bfb9cc5d4423ffad3	2018-08-09 17:05:24 -07:00
Fabricio Voznika	4e171f7590	Basic support for ip link/addr and ifconfig Closes #94 PiperOrigin-RevId: 207997580 Change-Id: I19b426f1586b5ec12f8b0cd5884d5b401d334924	2018-08-08 22:39:58 -07:00
Fabricio Voznika	ea1e39a314	Resend packets back to netstack if destined to itself Add option to redirect packet back to netstack if it's destined to itself. This fixes the problem where connecting to the local NIC address would not work, e.g.: echo bar | nc -l -p 8080 & echo foo | nc 192.168.0.2 8080 PiperOrigin-RevId: 207995083 Change-Id: I17adc2a04df48bfea711011a5df206326a1fb8ef	2018-08-08 22:03:35 -07:00
Fabricio Voznika	0d350aac7f	Enable SACK in runsc SACK is disabled by default and needs to be manually enabled. It not only improves performance, but also fixes hangs downloading files from certain websites. PiperOrigin-RevId: 207906742 Change-Id: I4fb7277b67bfdf83ac8195f1b9c38265a0d51e8b	2018-08-08 10:26:18 -07:00
Fabricio Voznika	cb23232c37	Fix build break in test integration_test runs manually and breakage wasn't detected. Added test to kokoro to ensure breakages are detected in the future. PiperOrigin-RevId: 207772835 Change-Id: Iada81b579b558477d4db3516b38366ef6a2e933d	2018-08-07 13:48:35 -07:00
Fabricio Voznika	9752174a7f	Disable KVM dimension because it's making the test flaky PiperOrigin-RevId: 207642348 Change-Id: Iacec9f097ab93b91c0c8eea61b1347e864f57a8b	2018-08-06 18:08:25 -07:00
Fabricio Voznika	bc9a1fca23	Tiny reordering to network code PiperOrigin-RevId: 207581723 Change-Id: I6e4eb1227b5ed302de5e6c891040b670955f1eea	2018-08-06 11:48:29 -07:00
Fabricio Voznika	4c1167de4e	Isolate image pulling time from container startup mysql image test is timing out sporadically and it's hard to tell where the slowdown is coming from. PiperOrigin-RevId: 207147237 Change-Id: I05a4d2c116292695d63cf861f3b89cd1c54b6106	2018-08-02 12:42:07 -07:00
Ian Gudger	3cd7824410	Move stack clock to options struct PiperOrigin-RevId: 207039273 Change-Id: Ib8f55a6dc302052ab4a10ccd70b07f0d73b373df	2018-08-01 20:22:02 -07:00
Fabricio Voznika	413bfb39a9	Use backoff package for retry logic PiperOrigin-RevId: 206834838 Change-Id: I9a44c6fa5f4766a01f86e90810f025cefecdf2d4	2018-07-31 15:07:53 -07:00
Michael Pratt	6cad96f38a	Drop dup2 filter It is unused. PiperOrigin-RevId: 206798328 Change-Id: I2d7d27c0e4a0ef51264b900f14f1b3fdad17f2c4	2018-07-31 11:38:57 -07:00
Brielle Broder	543c997978	Cleans up files created if there is a failure. PiperOrigin-RevId: 206674267 Change-Id: Ifc4eb19e0882e8bed566e9c553af910925fe6ae2	2018-07-30 17:18:02 -07:00
Adin Scannell	3188859742	Make runsc visibility public. (Why not?) PiperOrigin-RevId: 206401282 Change-Id: Iadcb7fb8472de7aef7c4bf5182e9a1d339e4d259	2018-07-27 17:57:42 -07:00
Fabricio Voznika	b8f96a9d0b	Replace sleeps with waits in tests - part II PiperOrigin-RevId: 206333130 Change-Id: Ic85874dbd53c5de2164a7bb75769d52d43666c2a	2018-07-27 10:10:10 -07:00
Fabricio Voznika	e5adf42f66	Replace sleeps with waits in tests - part I PiperOrigin-RevId: 206084473 Change-Id: I44e1b64b9cdd2964357799dca27cc0cbc19ce07d	2018-07-25 17:37:53 -07:00
Nicolas Lacasse	1129b35c92	runsc: Fix "exec" command when called without --pid-file. When "exec" command is called without the "--detach" flag, we spawn a second "exec" command and wait for that one to start. We use the pid file passed in --pid-file to detect when this second command has started running. However if "exec" is called with no --pid-file flag, this system breaks down, as we don't have a pid file to wait for. This CL ensures that the second instance of the "exec" command always writes a pid-file, so the wait is successful. PiperOrigin-RevId: 206002403 Change-Id: If9f2be31eb6e831734b1b833f25054ec71ab94a6	2018-07-25 09:11:45 -07:00
Justine Olshan	b5113574fe	Created a docker integration test for a tomcat image. PiperOrigin-RevId: 205718733 Change-Id: I200b23af064d256f157baf9da5005ab16cc55928	2018-07-23 13:55:28 -07:00
Fabricio Voznika	d7a34790a0	Add KVM and overlay dimensions to container_test PiperOrigin-RevId: 205714667 Change-Id: I317a2ca98ac3bdad97c4790fcc61b004757d99ef	2018-07-23 13:31:42 -07:00
Justine Olshan	f543ada150	Removed a now incorrect reference to restoreFile. PiperOrigin-RevId: 205470108 Change-Id: I226878a887fe1133561005357a9e3b09428b06b6	2018-07-20 16:18:07 -07:00
Lantao Liu	f62d6dd453	runsc: copy gateway from the pod network interface. PiperOrigin-RevId: 205334841 Change-Id: Ia60d486f9aae70182fdc4af50cf7c915986126d7	2018-07-19 18:09:56 -07:00
Justine Olshan	c05660373e	Moved restore code out of create and made to be called after create. Docker expects containers to be created before they are restored. However, gVisor restoring requires specific actions regarding the kernel and the file system. These actions were originally in booting the sandbox. Now setting up the file system is deferred until a call to runsc start. In the restore case, the kernel is destroyed and a new kernel is created in the same process, as we need the same process for Docker. These changes required careful execution of concurrent processes which required the use of a channel. Full docker integration still needs the ability to restore into the same container. PiperOrigin-RevId: 205161441 Change-Id: Ie1d2304ead7e06855319d5dc310678f701bd099f	2018-07-18 16:58:30 -07:00
Nicolas Lacasse	e5d8f99c60	runsc: Fixes to CheckpointRestoreTest. We must delete the output file at the beginning of the test, otherwise the test fails immediately. Also some minor cleanups in readOutputFile. PiperOrigin-RevId: 205150525 Change-Id: I6bae1acd5b315320a2c6e25a59afcfc06267fb17	2018-07-18 15:46:37 -07:00
Nicolas Lacasse	9059983fdb	runsc: Fix map access race in boot.Loader.waitContainer. PiperOrigin-RevId: 204522004 Change-Id: I4819dc025f0a1df03ceaaba7951b1902d44562b3	2018-07-13 13:46:14 -07:00
Nicolas Lacasse	6dce46d4c0	Bump the timeout when waiting for python HTTP server. PiperOrigin-RevId: 204511630 Change-Id: Ib841a7144f3833321b0e69b8585b03c4ed55a265	2018-07-13 12:34:04 -07:00
Nicolas Lacasse	67507bd579	runsc: Don't close the control server in a defer. Closing the control server will block until all open requests have completed. If a control server method panics, we end up stuck because the defer'd Destroy function will never return. PiperOrigin-RevId: 204354676 Change-Id: I6bb1d84b31242d7c3f20d5334b1c966bd6a61dbf	2018-07-12 13:36:57 -07:00
Bhasker Hariharan	c15cb8d432	Automated rollback of changelist 203157739 PiperOrigin-RevId: 204196916 Change-Id: If632750fc6368acb835e22cfcee0ae55c8a04d16	2018-07-11 15:07:19 -07:00
Justine Olshan	81ae5f3df5	Created runsc and docker integration tests. Moved some of the docker image functions to testutil.go. Test runsc commands create, start, stop, pause, and resume. PiperOrigin-RevId: 204138452 Change-Id: Id00bc58d2ad230db5e9e905eed942187e68e7c7b	2018-07-11 09:37:28 -07:00
Brielle Broder	b763b3992a	Modified error message for clarity. Previously, error message only showed "<nil>" when child and pid were the same (since no error is returned by the Wait4 syscall in this case) which occurs when the process has incorrectly terminated. A new error message was added to improve clarity for such a case. Tests for this function were modified to reflect the improved distinction between process termination and error. PiperOrigin-RevId: 204018107 Change-Id: Ib38481c9590405e5bafcb6efe27fd49b3948910c	2018-07-10 14:58:12 -07:00
Justine Olshan	f107a5b1a0	Tests pause and resume functionality on a Python container. PiperOrigin-RevId: 203488336 Change-Id: I55e1b646f1fae73c27a49e064875d55f5605b200	2018-07-06 09:39:01 -07:00
Michael Pratt	660f1203ff	Fix runsc VDSO mapping `80bdf8a406` accidentally moved vdso into an inner scope, never assigning the vdso variable passed to the Kernel and thus skipping VDSO mappings. Fix this and remove the ability for loadVDSO to skip VDSO mappings, since tests that do so are gone. PiperOrigin-RevId: 203169135 Change-Id: Ifd8cadcbaf82f959223c501edcc4d83d05327eba	2018-07-03 12:53:39 -07:00
Fabricio Voznika	52ddb8571c	Skip overlay on root when it's readonly PiperOrigin-RevId: 203161098 Change-Id: Ia1904420cb3ee830899d24a4fe418bba6533be64	2018-07-03 12:01:09 -07:00
Lantao Liu	138cb8da50	runsc: `runsc wait` print wait status. PiperOrigin-RevId: 203160639 Change-Id: I8fb2787ba0efb7eacd9d4c934238a26eb5ae79d5	2018-07-03 11:58:12 -07:00
Fabricio Voznika	0ef6066167	Resend packets back to netstack if destined to itself Add option to redirect packet back to netstack if it's destined to itself. This fixes the problem where connecting to the local NIC address would not work, e.g.: echo bar | nc -l -p 8080 & echo foo | nc 192.168.0.2 8080 PiperOrigin-RevId: 203157739 Change-Id: I31c9f7c501e3f55007f25e1852c27893a16ac6c4	2018-07-03 11:39:17 -07:00
Fabricio Voznika	c1b4c1ffee	Fix flaky image_test - Some failures were being ignored in run_tests.sh - Give more time for mysql to setup - Fix typo with network=host tests - Change httpd test to wait on http server being available, not only output PiperOrigin-RevId: 203156896 Change-Id: Ie1801dcd76e9b5fe4722c4d8695c76e40988dd74	2018-07-03 11:34:15 -07:00
Nicolas Lacasse	4500155ffc	runsc: Mount "mandatory" mounts right after mounting the root. The /proc and /sys mounts are "mandatory" in the sense that they should be mounted in the sandbox even when they are not included in the spec. Runsc treats /tmp similarly, because it is faster to use the internal tmpfs implementation instead of proxying to the host. However, the spec may contain submounts of these mandatory mounts (particularly for /tmp). In those cases, we must mount our mandatory mounts before the submount, otherwise the submount will be masked. Since the mandatory mounts are all top-level directories, we can mount them right after the root. PiperOrigin-RevId: 203145635 Change-Id: Id69bae771d32c1a5b67e08c8131b73d9b42b2fbf	2018-07-03 10:36:22 -07:00
Dmitry Vyukov	6144751962	runsc/boot/filter: permit SYS_TIME for race glibc's malloc also uses SYS_TIME. Permit it. #0 0x0000000000de6267 in time () #1 0x0000000000db19d8 in get_nprocs () #2 0x0000000000d8a31a in arena_get2.part () #3 0x0000000000d8ab4a in malloc () #4 0x0000000000d3c6b5 in __sanitizer::InternalAlloc(unsigned long, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<0ul, 140737488355328ull, 0ul, __sanitizer::SizeClassMap<3ul, 4ul, 8ul, 17ul, 64ul, 14ul>, 20ul, __sanitizer::TwoLevelByteMap<32768ull, 4096ull, __sanitizer::NoOpMapUnmapCallback>, __sanitizer::NoOpMapUnmapCallback> >*, unsigned long) () #5 0x0000000000d4cd70 in __tsan_go_start () #6 0x00000000004617a3 in racecall () #7 0x00000000010f4ea0 in runtime.findfunctab () #8 0x000000000043f193 in runtime.racegostart () Signed-off-by: Dmitry Vyukov <dvyukov@google.com> [mpratt@google.com: updated comments and commit message] Signed-off-by: Michael Pratt <mpratt@google.com> Change-Id: Ibe2d0dc3035bf5052d5fb802cfaa37c5e0e7a09a PiperOrigin-RevId: 203042627	2018-07-02 17:47:32 -07:00
Lantao Liu	126296ce2a	runsc: fix panic for `runsc wait` on stopped container. PiperOrigin-RevId: 203016694 Change-Id: Ic51ef754aa6d7d1b3b35491aff96a63d7992e122	2018-07-02 14:52:21 -07:00
Fabricio Voznika	fa64c2a151	Make default limits the same as with runc Closes #2 PiperOrigin-RevId: 202997196 Change-Id: I0c9f6f5a8a1abe1ae427bca5f590bdf9f82a6675	2018-07-02 12:51:38 -07:00
Brielle Broder	ca353b53ed	Fix typo. PiperOrigin-RevId: 202720658 Change-Id: Iff42fd23f831ee7f29ddd6eb867020b76ed1eb23	2018-06-29 15:51:32 -07:00
Justine Olshan	80bdf8a406	Sets the restore environment for restoring a container. Updated how restoring occurs through boot.go with a separate Restore function. This prevents a new process and new mounts from being created. Added tests to ensure the container is restored. Registered checkpoint and restore commands so they can be used. Docker support for these commands is still limited. Working on #80. PiperOrigin-RevId: 202710950 Change-Id: I2b893ceaef6b9442b1ce3743bd112383cb92af0c	2018-06-29 14:47:40 -07:00
Brielle Broder	25e315c2e1	Added leave-running flag for checkpoint. The leave-running flag allows the container to continue running after a checkpoint has occurred by doing an immediate restore into a new container with the same container ID after the old container is destroyed. Updates #80. PiperOrigin-RevId: 202695426 Change-Id: Iac50437f5afda018dc18b24bb8ddb935983cf336	2018-06-29 13:09:33 -07:00
Kevin Krakauer	16d37973eb	runsc: Add the "wait" subcommand. Users can now call "runsc wait <container id>" to wait on a particular process inside the container. -pid can also be used to wait on a specific PID. Manually tested the wait subcommand for a single waiter and multiple waiters (simultaneously 2 processes waiting on the container and 2 processes waiting on a PID within the container). PiperOrigin-RevId: 202548978 Change-Id: Idd507c2cdea613c3a14879b51cfb0f7ea3fb3d4c	2018-06-28 14:56:36 -07:00
Fabricio Voznika	5a8e014c3d	Add more image tests PiperOrigin-RevId: 202537696 Change-Id: I900fe8fd36cc7a4edb44fe2d03f8ba6768db53cb	2018-06-28 13:54:04 -07:00
Fabricio Voznika	bb31a11903	Wait for sandbox process when waiting for root container Closes #71 PiperOrigin-RevId: 202532762 Change-Id: I80a446ff638672ff08e6fd853cd77e28dd05d540	2018-06-28 13:23:04 -07:00
Fabricio Voznika	8459390cdd	Error out if spec is invalid Closes #66 PiperOrigin-RevId: 202496258 Change-Id: Ib9287c5bf1279ffba1db21ebd9e6b59305cddf34	2018-06-28 09:57:27 -07:00
Fabricio Voznika	1f207de315	Add option to configure watchdog action PiperOrigin-RevId: 202494747 Change-Id: I4d4a18e71468690b785060e580a5f83c616bd90f	2018-06-28 09:46:50 -07:00
Brielle Broder	f93043615f	Added MkdirAll capabilities for Checkpoint's image-path. Now able to save the state file (checkpoint.img) at an image-path that had previously not existed. This is important because there can only be one checkpoint.img file per directory so this will enable users to create as many directories as needed for proper organization. PiperOrigin-RevId: 202360414 Change-Id: If5dd2b72e08ab52834a2b605571186d107b64526	2018-06-27 13:32:53 -07:00
Fabricio Voznika	c186e408cc	Add KVM, overlay and host network to image tests PiperOrigin-RevId: 202236006 Change-Id: I4ea964a70fc49e8b51c9da27d77301c4eadaae71	2018-06-26 19:05:50 -07:00
Lantao Liu	000fd8d1e4	runsc: set gofer umask to 0. PiperOrigin-RevId: 202185642 Change-Id: I2eefcc0b2ffadc6ef21d177a8a4ab0cda91f3399	2018-06-26 13:40:04 -07:00
Lantao Liu	e8ae2b85e9	runsc: add a `multi-container` flag to enable multi-container support. PiperOrigin-RevId: 201995800 Change-Id: I770190d135e14ec7da4b3155009fe10121b2a502	2018-06-25 12:08:44 -07:00
Fabricio Voznika	cecc1e472c	Fix lint errors PiperOrigin-RevId: 201978212 Change-Id: Ie3df1fd41d5293fff66b546a0c68c3bf98126067	2018-06-25 10:41:27 -07:00
Kevin Krakauer	04bdcc7b65	runsc: Enable waiting on individual containers within a sandbox. PiperOrigin-RevId: 201742160 Change-Id: Ia9fa1442287c5f9e1196fb117c41536a80f6bb31	2018-06-22 14:31:25 -07:00
Brielle Broder	e1aee51d09	Modified Checkpoint/Restore flags to improve compatibility with Docker. Added a number of unimplemented flags required for using runsc's Checkpoint and Restore with Docker. Modified the "image-path" flag to require a directory instead of a file. PiperOrigin-RevId: 201697486 Change-Id: I55883df2f1bbc3ec3c395e0ca160ce189e5e7eba	2018-06-22 09:41:26 -07:00
Fabricio Voznika	f6be5fe619	Forward SIGUSR2 to the sandbox too SIGUSR2 was being masked out to be used as a way to dump sentry stacks. This could cause compatibility problems in cases anyone uses SIGUSR2 to communicate with the container init process. PiperOrigin-RevId: 201575374 Change-Id: I312246e828f38ad059139bb45b8addc2ed055d74	2018-06-21 13:22:18 -07:00
Justine Olshan	f2a687001d	Added functionality to create a RestoreEnvironment. Before a container can be restored, the mounts must be configured. The root and submounts and their key information is compiled into a RestoreEnvironment. Future code will be added to set this created environment before restoring a container. Tests to ensure the correct environment were added. PiperOrigin-RevId: 201544637 Change-Id: Ia894a8b0f80f31104d1c732e113b1d65a4697087	2018-06-21 10:18:11 -07:00
Brielle Broder	7d6149063a	Restore implementation added to runsc. Restore creates a new container and uses the given image-path to load a saved image of a previous container. Restore command is plumbed through container and sandbox. This command does not work yet - more to come. PiperOrigin-RevId: 201541229 Change-Id: I864a14c799ce3717d99bcdaaebc764281863d06f	2018-06-21 09:58:24 -07:00
Nicolas Lacasse	81d13fbd4d	runsc: Default umask should be 0. PiperOrigin-RevId: 201539050 Change-Id: I36cbf270fa5ad25de507ecb919e4005eda6aa16d	2018-06-21 09:43:15 -07:00
Ian Gudger	ef4f239c79	Fix typo in runsc gofer flag description PiperOrigin-RevId: 201529295 Change-Id: I55eb516ec6d14fbcd48593a3d61f724adc253a23	2018-06-21 08:34:51 -07:00
Fabricio Voznika	95cb01e0a9	Reduce test sleep time PiperOrigin-RevId: 201428433 Change-Id: I72de1e46788ec84f61513416bb690956e515907e	2018-06-20 15:32:15 -07:00
Fabricio Voznika	2f59ba0e2d	Include image test as part of kokoro tests PiperOrigin-RevId: 201427731 Change-Id: I5cbee383ec51c02b7892ec7812cbbdc426be8991	2018-06-20 15:28:12 -07:00
Fabricio Voznika	2b5bdb525e	Add end-to-end image tests PiperOrigin-RevId: 201418619 Change-Id: I7961b027394d98422642f829bc54745838c138bd	2018-06-20 14:38:45 -07:00
Fabricio Voznika	4ad7315b67	Add 'runsc debug' command It prints sandbox stacks to the log to help debug stuckness. I expect that many more options will be added in the future. PiperOrigin-RevId: 201405931 Change-Id: I87e560800cd5a5a7b210dc25a5661363c8c3a16e	2018-06-20 13:31:31 -07:00
Fabricio Voznika	af6f9f56f8	Add tool to configure runtime settings in docker This will be used with the upcoming e2e image tests. PiperOrigin-RevId: 201400832 Change-Id: I49509314e16ea54655ea8060dbf511a04a7a8f79	2018-06-20 13:01:16 -07:00
Kevin Krakauer	5397963b5d	runsc: Enable container creation within existing sandboxes. Containers are created as processes in the sandbox. Of the many things that don't work yet, the biggest issue is that the fsgofer is launched with its root as the sandbox's root directory. Thus, when a container is started and wants to read anything (including the init binary of the container), the gofer tries to serve from sandbox's root (which basically just has pause), not the container's. PiperOrigin-RevId: 201294560 Change-Id: I6423aa8830538959c56ae908ce067e4199d627b1	2018-06-19 21:44:33 -07:00
Kevin Krakauer	3ebd0e35f4	runsc: Whitelist lstat, as it is now used in specutils. When running multi-container, child containers are added after the filters have been installed. Thus, lstat must be in the set of allowed syscalls. PiperOrigin-RevId: 201269550 Change-Id: I03f2e6675a53d462ed12a0f651c10049b76d4c52	2018-06-19 17:17:41 -07:00
Kevin Krakauer	33f29c730f	runsc: Fix flakey container_test. Verified that this is no longer flakey over 10K repetitions. PiperOrigin-RevId: 201267499 Change-Id: I793c916fe725412aec25953f764cb4f52c9fbed3	2018-06-19 17:04:51 -07:00
Justine Olshan	a6dbef045f	Added a resume command to unpause a paused container. Resume checks the status of the container and unpauses the kernel if its status is paused. Otherwise nothing happens. Tests were added to ensure that the process is in the correct state after various commands. PiperOrigin-RevId: 201251234 Change-Id: Ifd11b336c33b654fea6238738f864fcf2bf81e19	2018-06-19 15:23:36 -07:00
Justine Olshan	873ec0c414	Modified boot.go to allow for restores. A file descriptor was added as a flag to boot so a state file can restore a container that was checkpointed. PiperOrigin-RevId: 201068699 Change-Id: I18e96069488ffa3add468861397f3877725544aa	2018-06-18 15:20:36 -07:00
Lantao Liu	f3727528e5	runsc: support symlink to the exec path. PiperOrigin-RevId: 201049912 Change-Id: Idd937492217a4c2ca3d59c602e41576a3b203dd9	2018-06-18 13:37:59 -07:00
Lantao Liu	821aaf531d	runsc: support "rw" mount option. PiperOrigin-RevId: 201018483 Change-Id: I52fe3d01c83c8a2f0e9275d9d88c37e46fa224a2	2018-06-18 10:34:11 -07:00
Fabricio Voznika	775982ed4b	Automated rollback of changelist 200770591 PiperOrigin-RevId: 201012131 Change-Id: I5cd69e795555129319eb41135ecf26db9a0b1fcb	2018-06-18 10:00:04 -07:00
Justine Olshan	0786707cd9	Added code for a pause command for a container process. Like runc, the pause command will pause the processes of the given container. It will set that container's status to "paused." A resume command will be added to unpause and continue running the process. PiperOrigin-RevId: 200789624 Change-Id: I72a5d7813d90ecfc4d01cc252d6018855016b1ea	2018-06-15 16:09:09 -07:00
Kevin Krakauer	437890dc4b	runsc: Make gofer logs show up in test output. PiperOrigin-RevId: 200770591 Change-Id: Ifc096d88615b63135210d93c2b4cee2eaecf1eee	2018-06-15 14:07:54 -07:00
Lantao Liu	2081c5e7f7	runsc: support /dev bind mount which does not conflict with default /dev mount. PiperOrigin-RevId: 200768923 Change-Id: I4b8da10bcac296e8171fe6754abec5aabfec5e65	2018-06-15 13:58:39 -07:00
Dmitry Vyukov	52110bfc33	runsc/cmd: fix kill signal parsing Signal is arg 1, not 2. Killing with SIGABRT is useful to get Go traces. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Change-Id: I0b78e34a9de3fb3385108e26fdb4ff6e9347aeff PiperOrigin-RevId: 200742743	2018-06-15 11:06:07 -07:00
Fabricio Voznika	ef5dd4df9b	Set kernel.applicationCores to the number of processors on the host The right number to use is the number of processors assigned to the cgroup. But until we make the sandbox join the respective cgroup, just use the number of processors on the host. Closes #65, closes #66 PiperOrigin-RevId: 200725483 Change-Id: I34a566b1a872e26c66f56fa6e3100f42aaf802b1	2018-06-15 09:19:04 -07:00
Brielle Broder	bd1e83ff60	Fix typo. PiperOrigin-RevId: 200631795 Change-Id: I297fe3e30fb06b04fccd8358c933e45019dcc1fa	2018-06-14 15:45:10 -07:00
Michael Pratt	d71f5ef688	Add nanosleep filter for Go 1.11 support golang.org/cl/108538 replaces pselect6 with nanosleep in runtime.usleep. Update the filters accordingly. PiperOrigin-RevId: 200574612 Change-Id: Ifb2296fcb3781518fc047aabbbffedb9ae488cd7	2018-06-14 10:11:05 -07:00
Fabricio Voznika	717f2501c9	Fix failure to mount volume that sandbox process has no access Boot loader tries to stat mount to determine whether it's a file or not. This may fail if the sandbox process doesn't have access to the file. Instead, add overlay on top of file, which is better anyway since we don't want to propagate changes to the host. PiperOrigin-RevId: 200411261 Change-Id: I14222410e8bc00ed037b779a1883d503843ffebb	2018-06-13 10:20:06 -07:00
Lantao Liu	2506b9b11f	runsc: do not include sub target if it is not started with '/'. PiperOrigin-RevId: 200274828 Change-Id: I956703217df08d8650a881479b7ade8f9f119912	2018-06-12 13:54:54 -07:00
Brielle Broder	711a9869e5	Runsc checkpoint works. This is the first iteration of checkpoint that actually saves to a file. Tests for checkpoint are included. Ran into an issue when private unix sockets are enabled. An error message was added for this case and the mutex state was set. PiperOrigin-RevId: 200269470 Change-Id: I28d29a9f92c44bf73dc4a4b12ae0509ee4070e93	2018-06-12 13:25:23 -07:00
Kevin Krakauer	2dc9cd7bf7	runsc: enable terminals in the sandbox. runsc now mounts the devpts filesystem, so you get a real terminal using ssh+sshd. PiperOrigin-RevId: 200244830 Change-Id: If577c805ad0138fda13103210fa47178d8ac6605	2018-06-12 11:03:25 -07:00
Fabricio Voznika	48335318a2	Enable debug logging in tests Unit tests call runsc directly now, so all command line arguments are valid. On the other hand, enabling debug in the test binary doesn't affect runsc. It needs to be set in the config. PiperOrigin-RevId: 200237706 Change-Id: I0b5922db17f887f58192dbc2f8dd2fd058b76ec7	2018-06-12 10:25:55 -07:00
Fabricio Voznika	5c51bc51e4	Drop capabilities not needed by Gofer PiperOrigin-RevId: 199808391 Change-Id: Ib37a4fb6193dc85c1f93bc16769d6aa41854b9d4	2018-06-08 09:59:26 -07:00
Kevin Krakauer	206e90d057	runsc: Support abbreviated container IDs. Just a UI/usability addition. It's a lot easier to type "60" than "60185c721d7e10c00489f1fa210ee0d35c594873d6376b457fb1815e4fdbfc2c". PiperOrigin-RevId: 199547932 Change-Id: I19011b5061a88aba48a9ad7f8cf954a6782de854	2018-06-06 16:13:53 -07:00
Googler	0c34b460f2	Add runsc checkpoint command. Checkpoint command is plumbed through container and sandbox. Restore has also been added but it is only a stub. None of this works yet. More changes to come. PiperOrigin-RevId: 199510105 Change-Id: Ibd08d57f4737847eb25ca20b114518e487320185	2018-06-06 12:31:53 -07:00
Googler	722275c3d1	Added a function to the controller to checkpoint a container. Functionality for checkpoint is not complete, more to come. PiperOrigin-RevId: 199500803 Change-Id: Iafb0fcde68c584270000fea898e6657a592466f7	2018-06-06 11:43:55 -07:00
Fabricio Voznika	19a0e83b50	Make fsgofer attach more strict Refuse to mount paths with "." and ".." in the path to prevent a compromised Sentry to mount "../../secrets". Only allow Attach to be called once per mount point. PiperOrigin-RevId: 199225929 Change-Id: I2a3eb7ea0b23f22eb8dde2e383e32563ec003bd5	2018-06-04 18:04:54 -07:00
Fabricio Voznika	6c585b8eb6	Create destination mount dir if it doesn't exist PiperOrigin-RevId: 199175296 Change-Id: I694ad1cfa65572c92f77f22421fdcac818f44630	2018-06-04 12:31:35 -07:00
Fabricio Voznika	78ccd1298e	Return 'running' if gofer is still alive Containerd will start deleting container and rootfs after container is stopped. However, if gofer is still running, rootfs cleanup will fail because of device busy. This CL makes sure that gofer is not running when container state is stopped. Change from: lantaol@google.com PiperOrigin-RevId: 199172668 Change-Id: I9d874eec3ecf74fd9c8edd7f62d9f998edef66fe	2018-06-04 12:14:23 -07:00
Fabricio Voznika	55a37ceef1	Fix leaky FD 9P socket was being created without CLOEXEC and was being inherited by the children. This would prevent the gofer from detecting that the sandbox had exited, because the socket would not be closed. PiperOrigin-RevId: 199168959 Change-Id: I3ee1a07cbe7331b0aeb1cf2b697e728ce24f85a7	2018-06-04 11:52:17 -07:00
Fabricio Voznika	a0e2126be4	Refactor container_test in preparation for sandbox_test Common code to setup and run sandbox is moved to testutil. Also, don't link "boot" and "gofer" commands with test binary. Instead, use runsc binary from the build. This not only make the test setup simpler, but also resolves a dependency issue with sandbox_tests not depending on container package. PiperOrigin-RevId: 199164478 Change-Id: I27226286ca3f914d4d381358270dd7d70ee8372f	2018-06-04 11:26:30 -07:00
Zhengyu He	d1ca50d49e	Add SyscallRules that supports argument filtering PiperOrigin-RevId: 198919043 Change-Id: I7f1f0a3b3430cd0936a4ee4fc6859aab71820bdf	2018-06-01 13:40:52 -07:00
Fabricio Voznika	65dadc0029	Ignores IPv6 addresses when configuring network Closes #60 PiperOrigin-RevId: 198887885 Change-Id: I9bf990ee3fde9259836e57d67257bef5b85c6008	2018-06-01 10:09:37 -07:00
Fabricio Voznika	812e83d3bb	Suppress error when deleting non-existing container with --force This addresses the first issue reported in #59. CRI-O expects runsc to return success to delete when --force is used with a non-existing container. PiperOrigin-RevId: 198487418 Change-Id: If7660e8fdab1eb29549d0a7a45ea82e20a1d4f4a	2018-05-29 17:58:12 -07:00
Fabricio Voznika	e48f707876	Configure sandbox as superuser Container user might not have enough privilege to walk directories and mount filesystems. Instead, create superuser to perform these steps of the configuration. PiperOrigin-RevId: 197953667 Change-Id: I643650ab654e665408e2af1b8e2f2aa12d58d4fb	2018-05-24 14:27:57 -07:00
Fabricio Voznika	ed2b86a549	Fix test failure when user can't mount temp dir PiperOrigin-RevId: 197491098 Change-Id: Ifb75bd4e4f41b84256b6d7afc4b157f6ce3839f3	2018-05-21 17:48:04 -07:00
Rahat Mahmood	8878a66a56	Implement sysv shm. PiperOrigin-RevId: 197058289 Change-Id: I3946c25028b7e032be4894d61acb48ac0c24d574	2018-05-17 15:06:19 -07:00
Nicolas Lacasse	31386185fe	Push signal-delivery and wait into the sandbox. This is another step towards multi-container support. Previously, we delivered signals directly to the sandbox process (which then forwarded the signal to PID 1 inside the sandbox). Similarly, we waited on a container by waiting on the sandbox process itself. This approach will not work when there are multiple containers inside the sandbox, and we need to signal/wait on individual containers. This CL adds two new messages, ContainerSignal and ContainerWait. These messages include the id of the container to signal/wait. The controller inside the sandbox receives these messages and signals/waits on the appropriate process inside the sandbox. The container id is plumbed into the sandbox, but it currently is not used. We still end up signaling/waiting on PID 1 in all cases. Once we actually have multiple containers inside the sandbox, we will need to keep some sort of map of container id -> pid (or possibly pid namespace), and signal/kill the appropriate process for the container. PiperOrigin-RevId: 197028366 Change-Id: I07b4d5dc91ecd2affc1447e6b4bdd6b0b7360895	2018-05-17 11:55:28 -07:00
Nicolas Lacasse	205f1027e6	Refactor the Sandbox package into Sandbox + Container. This is a necessary prerequisite for supporting multiple containers in a single sandbox. All the commands (in cmd package) now call operations on Containers (container package). When a Container first starts, it will create a Sandbox with the same ID. The Sandbox class is now simpler, as it only knows how to create boot/gofer processes, and how to forward commands into the running boot process. There are TODOs sprinkled around for additional support for multiple containers. Most notably, we need to detect when a container is intended to run in an existing sandbox (by reading the metadata), and then have some way to signal to the sandbox to start a new container. Other urpc calls into the sandbox need to pass the container ID, so the sandbox can run the operation on the given container. These are only half-plumbed through right now. PiperOrigin-RevId: 196688269 Change-Id: I1ecf4abbb9dd8987a53ae509df19341aaf42b5b0	2018-05-15 10:18:03 -07:00
Fabricio Voznika	7cff8489de	Fix failure to rename directory os.Rename validates that the target doesn't exist, which is different from syscall.Rename which replace the target if both are directories. fsgofer needs the syscall behavior. PiperOrigin-RevId: 196194630 Change-Id: I87d08cad88b5ef310b245cd91647c4f5194159d8	2018-05-10 17:13:10 -07:00
Chanwit Kaewkasi	7b6111b695	Display the current git revision in the info block Change-Id: I9737cc680968033ba82c95bb04cc482fcaa12642 PiperOrigin-RevId: 196192683	2018-05-10 16:57:41 -07:00
Fabricio Voznika	ac01f245ff	Skip atime and mtime update when file is backed by host FD When file is backed by host FD, atime and mtime for the host file and the cached attributes in the Sentry must be close together. In this case, the call to update atime and mtime can be skipped. This is important when host filesystem is using overlay because updating atime and mtime explicitly forces a copy up for every file that is touched. PiperOrigin-RevId: 196176413 Change-Id: I3933ea91637a071ba2ea9db9d8ac7cdba5dc0482	2018-05-10 14:59:40 -07:00
Fabricio Voznika	5a509c47a2	Open file as read-write when mount points to a file This is to allow files mapped directly, like /etc/hosts, to be writable. Closes #40 PiperOrigin-RevId: 196155920 Change-Id: Id2027e421cef5f94a0951c3e18b398a77c285bbd	2018-05-10 12:38:36 -07:00
Nicolas Lacasse	1bdec86bae	Return better errors from Docker when runsc fails to start. Two changes in this CL: First, make the "boot" process sleep when it encounters an error to give the controller time to send the error back to the "start" process. Otherwise the "boot" process exits immediately and the control connection errors with EOF. Secondly, open the log file with O_APPEND, not O_TRUNC. Docker uses the same log file for all runtime commands, and setting O_TRUNC causes them to get destroyed. Furthermore, containerd parses these log files in the event of an error, and it does not like the file being truncated out from underneath it. Now, when trying to run a binary that does not exist in the image, the error message is more reasonable: $ docker run alpine /not/found docker: Error response from daemon: OCI runtime start failed: /usr/local/google/docker/runtimes/runscd did not terminate sucessfully: error starting sandbox: error starting application [/not/found]: failed to create init process: no such file or directory Fixes #32 PiperOrigin-RevId: 196027084 Change-Id: Iabc24c0bdd8fc327237acc051a1655515f445e68	2018-05-09 14:13:37 -07:00
Nicolas Lacasse	32cabad8da	Use the containerd annotation instead of detecting the "pause" application. FIXED=72380268 PiperOrigin-RevId: 195846596 Change-Id: Ic87fed1433482a514631e1e72f5ee208e11290d1	2018-05-08 11:11:50 -07:00
Fabricio Voznika	e1b412d660	Error if container requires AppArmor, SELinux or seccomp Closes #35 PiperOrigin-RevId: 195840128 Change-Id: I31c1ad9b51ec53abb6f0b485d35622d4e9764b29	2018-05-08 10:34:11 -07:00
Ian Gudger	7c8c3705ea	Fix misspellings PiperOrigin-RevId: 195742598 Change-Id: Ibd4a8e4394e268c87700b6d1e50b4b37dfce5182	2018-05-07 16:38:01 -07:00
Ian Gudger	f47174f06b	Run gofmt -s on everything PiperOrigin-RevId: 195469901 Change-Id: I66d5c7a334bbb8b47e40d266a2661291c2d91c7f	2018-05-04 14:16:11 -07:00
Fabricio Voznika	c90fefc116	Fix runsc capabilities There was a typo and one new capability missing from the list PiperOrigin-RevId: 195427713 Change-Id: I6d9e1c6e77b48fe85ef10d9f54c70c8a7271f6e7	2018-05-04 09:39:28 -07:00

357 Commits