gvisor

Commit Graph

Author	SHA1	Message	Date
Bhasker Hariharan	e0b3d3323f	Add support for using PACKET_RX_RING to receive packets. PACKET_RX_RING allows the use of an mmapped buffer to receive packets from the kernel. This should cut down the number of host syscalls that need to be made to receive packets when the underlying fd is a socket of the AF_PACKET type. PiperOrigin-RevId: 233834998 Change-Id: I8060025c6ced206986e94cc46b8f382b81bfa47f	2019-02-13 14:53:03 -08:00
Nicolas Lacasse	92e85623a0	Factor the subtargets method into a helper method with tests. PiperOrigin-RevId: 232047515 Change-Id: I00f036816e320356219be7b2f2e6d5fe57583a60	2019-02-01 15:23:43 -08:00
Andrei Vagin	4e695adcd0	gvisor/gofer: Use pivot_root instead of chroot PiperOrigin-RevId: 231864273 Change-Id: I8545b72b615f5c2945df374b801b80be64ec3e13	2019-01-31 15:19:04 -08:00
Michael Pratt	2a0c69b19f	Remove license comments Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses($.$)./licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0	2019-01-31 11:12:53 -08:00
Andrei Vagin	7e8a56087b	runsc: check whether a container is deleted or not before setupContainerFS PiperOrigin-RevId: 231811387 Change-Id: Ib143fb9a4d0fa1f105d1a3a3bd533dfc44e792af	2019-01-31 10:34:15 -08:00
Andrei Vagin	dd577f5410	runsc: reap a sandbox process only in sandbox.Wait() PiperOrigin-RevId: 231504064 Change-Id: I585b769aef04a3ad7e7936027958910a6eed9c8d	2019-01-29 17:15:56 -08:00
Bhasker Hariharan	24cb2c0a72	Use recvmmsg() instead of readv() to read packets from NIC. This should reduce the number of syscalls required to process packets significantly and improve throughputs. PiperOrigin-RevId: 231366886 Change-Id: I8b38077262bf9c53176bc4a94b530188d3d7c0ca	2019-01-29 01:39:01 -08:00
Shijiang Wei	b44699c529	check isRootNS by ns inode Signed-off-by: Shijiang Wei <mountkin@gmail.com> Change-Id: I032f834edae5c716fb2d3538285eec07aa11a902 PiperOrigin-RevId: 231318438	2019-01-28 17:20:20 -08:00
Lantao Liu	52b3cd873d	runsc: Only uninstall cgroup for sandbox stop. PiperOrigin-RevId: 231263114 Change-Id: I57467a34fe94e395fdd3685462c4fe9776d040a3	2019-01-28 11:58:25 -08:00
Fabricio Voznika	55e8eb775b	Make cacheRemoteRevalidating detect changes to file size When file size changes outside the sandbox, page cache was not refreshing file size which is required for cacheRemoteRevalidating. In fact, cacheRemoteRevalidating should be skipping the cache completely since it's not really benefiting from it. The cache is already bypassed for unstable attributes (see cachePolicy.cacheUAttrs). And although the cache is called to map pages, they will always miss the cache and map directly from the host. Created a HostMappable struct that maps directly to the host and use it for files with cacheRemoteRevalidating. Closes #124 PiperOrigin-RevId: 230998440 Change-Id: Ic5f632eabe33b47241e05e98c95e9b2090ae08fc	2019-01-25 17:23:07 -08:00
ShiruRen	c6facd0358	Fix a nil pointer dereference bug in Container.Destroy() In Container.Destroy(), we call c.stop() before calling executeHooksBestEffort(), therefore, when we call executeHooksBestEffort(c.Spec.Hooks.Poststop, c.State()) to execute the poststop hook, it results in a nil pointer dereference since it reads c.Sandbox.Pid in c.State() after the sandbox has been destroyed. To fix this bug, we can change container's status to "stopped" before executing the poststop hook. Signed-off-by: ShiruRen <renshiru2000@gmail.com> Change-Id: I4d835e430066fab7e599e188f945291adfc521ef PiperOrigin-RevId: 230975505	2019-01-25 15:03:17 -08:00
Fabricio Voznika	c28f886c0b	Execute statically linked binary Mounting lib and lib64 is no longer necessary, which simplifies the test. PiperOrigin-RevId: 230971195 Change-Id: Ib91a3ffcec4b322cd3687c337eedbde9641685ed	2019-01-25 14:39:20 -08:00
Andrei Vagin	5f08f8fd81	Don't bind-mount runsc into a sandbox mntns PiperOrigin-RevId: 230437407 Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b	2019-01-22 16:46:42 -08:00
Fabricio Voznika	c1be25b78d	Scrub runsc error messages Removed "error" and "failed to" prefixes that don't add value from messages. Adjusted a few other messages. In particular, when the container fails to start, the message returned is easier for humans to read: $ docker run --rm --runtime=runsc alpine foobar docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory Closes #77 PiperOrigin-RevId: 230022798 Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1	2019-01-18 17:36:02 -08:00
Andrei Vagin	c0a981629c	Start a sandbox process in a new userns only if CAP_SETUID is set In addition, it fixes a race condition in TestMultiContainerGoferStop. Two scripts copy the same set of files into the same directory, and sometimes one of these commands fails with EEXIST. PiperOrigin-RevId: 230011247 Change-Id: I9289f72e65dc407cdcd0e6cd632a509e01f43e9c	2019-01-18 16:08:39 -08:00
Andrei Vagin	c063a1350f	runsc: create a new proc mount if the sandbox process is running in a new pidns PiperOrigin-RevId: 229971902 Change-Id: Ief4fac731e839ef092175908de9375d725eaa3aa	2019-01-18 12:17:34 -08:00
Fabricio Voznika	e4d3ca7263	Prevent internal tmpfs mount from overriding files in /tmp Runsc wants to mount /tmp using internal tmpfs implementation for performance. However, it risks hiding files that may exist under /tmp in case it's present in the container. Now, it only mounts over /tmp iff: - /tmp was not explicitly asked to be mounted - /tmp is empty If either of these is not true, then /tmp maps to the container's image /tmp. Note: checkpoint doesn't have sentry FS mounted to check if /tmp is empty. It simply looks for explicit mounts right now. PiperOrigin-RevId: 229607856 Change-Id: I10b6dae7ac157ef578efc4dfceb089f3b94cde06	2019-01-16 12:48:32 -08:00
Fabricio Voznika	92cf3764e0	Create working directory if it doesn't yet exist PiperOrigin-RevId: 229438125 Change-Id: I58eb0d10178d1adfc709d7b859189d1acbcb2f22	2019-01-15 14:13:27 -08:00
Nicolas Lacasse	dc8450b567	Remove fs.Handle, ramfs.Entry, and all the DeprecatedFileOperations. More helper structs have been added to the fsutil package to make it easier to implement fs.InodeOperations and fs.FileOperations. PiperOrigin-RevId: 229305982 Change-Id: Ib6f8d3862f4216745116857913dbfa351530223b	2019-01-14 20:34:28 -08:00
Andrei Vagin	a46b6d453d	runsc: set up a minimal chroot from the sandbox process In this case, new mounts are not created in the host mount namespace, so tearDownChroot isn't needed, because chroot will be destroyed with a sandbox mount namespace. In addition, pivot_root can't be called instead of chroot. PiperOrigin-RevId: 229250871 Change-Id: I765bdb587d0b8287a6a8efda8747639d37c7e7b6	2019-01-14 14:08:19 -08:00
Andrei Vagin	f8c8f24154	runsc: Collect zombies of sandbox and gofer processes And we need to wait for a gofer process before cgroup.Uninstall, because it is running in the sandbox cgroups. PiperOrigin-RevId: 228904020 Change-Id: Iaf8826d5b9626db32d4057a1c505a8d7daaeb8f9	2019-01-11 10:32:26 -08:00
Fabricio Voznika	0d7023d581	Restore to original cgroup after sandbox and gofer processes are created The original code assumed that it was safe to join and not restore cgroup, but Container.Run will not exit after calling start, making cgroup cleanup fail because there were still processes inside the cgroup. PiperOrigin-RevId: 228529199 Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51	2019-01-09 09:18:15 -08:00
Fabricio Voznika	5ce542ecc7	Undo changes in case of failure to create file/dir/symlink File/dir/symlink creation is multi-step and may leave state behind in case of failure in one of the steps. Added best effort attempt to clean up. PiperOrigin-RevId: 228286612 Change-Id: Ib03c27cd3d3e4f44d0352edc6ee212a53412d7f1	2019-01-07 23:02:19 -08:00
Fabricio Voznika	d033a76fa6	Apply chroot for --network=host too PiperOrigin-RevId: 227747566 Change-Id: Ide9df4ac1391adcd1c56e08d6570e0d149d85bc4	2019-01-03 14:10:44 -08:00
Michael Pratt	33191e1cc4	Automated rollback of changelist 225089593 PiperOrigin-RevId: 227595007 Change-Id: If14cc5aab869c5fd7a4ebd95929c887ab690e94c	2019-01-02 15:48:00 -08:00
Fabricio Voznika	a891afad6d	Simplify synchronization between runsc and sandbox process Make 'runsc create' join cgroup before creating sandbox process. This removes the need to synchronize platform creation and ensures that the sandbox process is charged to the right cgroup from the start. PiperOrigin-RevId: 227166451 Change-Id: Ieb4b18e6ca0daf7b331dc897699ca419bc5ee3a2	2018-12-28 13:48:24 -08:00
Jamie Liu	194ef586fc	Rename limits.MemoryPagesLocked to limits.MemoryLocked. "RLIMIT_MEMLOCK: This is the maximum number of bytes of memory that may be locked into RAM." - getrlimit(2) PiperOrigin-RevId: 226384346 Change-Id: Iefac4a1bb69f7714dc813b5b871226a8344dc800	2018-12-20 13:28:46 -08:00
Googler	86c9bd2547	Automated rollback of changelist 225861605 PiperOrigin-RevId: 226224230 Change-Id: Id24c7d3733722fd41d5fe74ef64e0ce8c68f0b12	2018-12-19 13:30:08 -08:00
Michael Pratt	b62591e6a8	Expose internal testing flag Never to be used outside of runsc tests! PiperOrigin-RevId: 225919013 Change-Id: Ib3b14aa2a2564b5246fb3f8933d95e01027ed186	2018-12-17 17:35:06 -08:00
Jamie Liu	2421006426	Implement mlock(), kind of. Currently mlock() and friends do nothing whatsoever. However, mlocking is directly application-visible in a number of ways; for example, madvise(MADV_DONTNEED) and msync(MS_INVALIDATE) both fail on mlocked regions. We handle this inconsistently: MADV_DONTNEED is too important to not work, but MS_INVALIDATE is rejected. Change MM to track mlocked regions in a manner consistent with Linux. It still will not actually pin pages into host physical memory, but: - mlock() will now cause sentry memory management to precommit mlocked pages. - MADV_DONTNEED and MS_INVALIDATE will interact with mlocked pages as described above. PiperOrigin-RevId: 225861605 Change-Id: Iee187204979ac9a4d15d0e037c152c0902c8d0ee	2018-12-17 11:38:59 -08:00
Nicolas Lacasse	1775a0e11e	container.Destroy should clean up container metadata even if other cleanups fail If the sandbox process is dead (because of a panic or some other problem), container.Destroy will never remove the container metadata file, since it will always fail when calling container.stop(). This CL changes container.Destroy() to always perform the three necessary cleanup operations: * Stop the sandbox and gofer processes. * Remove the container fs on the host. * Delete the container metadata directory. Errors from these three operations will be concatenated and returned from Destroy(). PiperOrigin-RevId: 225448164 Change-Id: I99c6311b2e4fe5f6e2ca991424edf1ebeae9df32	2018-12-13 15:38:10 -08:00
Michael Pratt	24c1158b9c	Add "trace signal" option This option is effectively equivalent to -panic-signal, except that the sandbox does not die after logging the traceback. PiperOrigin-RevId: 225089593 Change-Id: Ifb1c411210110b6104613f404334bd02175e484e	2018-12-11 16:12:41 -08:00
Brian Geffon	d3bc79bc84	Open source system call tests. PiperOrigin-RevId: 224886231 Change-Id: I0fccb4d994601739d8b16b1d4e6b31f40297fb22	2018-12-10 14:42:34 -08:00
Nicolas Lacasse	833edbd10b	Internal change. PiperOrigin-RevId: 224865061 Change-Id: I6aa31f880931980ad2fc4c4b3cc4c532aacb31f4	2018-12-10 12:51:54 -08:00
Zhaozhong Ni	9984138abe	sentry: turn "dynamically-created" procfs files into static creation. PiperOrigin-RevId: 224600982 Change-Id: I547253528e24fb0bb318fc9d2632cb80504acb34	2018-12-07 17:03:54 -08:00
Andrei Vagin	1b1a42ba6d	A sandbox process should wait until it has been moved into cgroups PiperOrigin-RevId: 224418900 Change-Id: I53cf4d7c1c70117875b6920f8fd3d58a3b1497e9	2018-12-06 15:28:29 -08:00
Brian Geffon	82719be42e	Max link traversals should be for an entire path. The limit on the number of symbolic links that may be followed applies to a full path, not just to a chain of symbolic links. PiperOrigin-RevId: 224047321 Change-Id: I5e3c4caf66a93c17eeddcc7f046d1e8bb9434a40	2018-12-04 14:32:03 -08:00
Googler	613899f852	Internal change. PiperOrigin-RevId: 223893409 Change-Id: I58869c7fb0012f6c3f7612a96cb649348b56335f	2018-12-03 17:27:35 -08:00
Googler	4d0da37cbb	Internal change. PiperOrigin-RevId: 223231273 Change-Id: I8fb97ea91f7507b4918f7ce6562890611513fc30	2018-11-28 14:01:48 -08:00
Kevin Krakauer	7b86d36a63	Fix crictl tests. gvisor-containerd-shim moved. It now has a stable URL that run_tests.sh always uses. PiperOrigin-RevId: 223188822 Change-Id: I5687c78289404da27becd8d5949371e580fdb360	2018-11-28 10:10:13 -08:00
Michael Pratt	071aeea9d3	Disable crictl tests gvisor-containerd-shim installation is currently broken. PiperOrigin-RevId: 223002877 Change-Id: I2b890c5bf602a96c475c3805f24852ead8593a35	2018-11-27 09:25:20 -08:00
Fabricio Voznika	eaac94d91c	Use RET_KILL_PROCESS if available in kernel RET_KILL_THREAD doesn't work well for Go because it will kill only the offending thread and leave the process hanging. RET_TRAP can be masked out and it's not guaranteed to kill the process. RET_KILL_PROCESS is available since 4.14. For older kernels, continue to use RET_TRAP as this is the best option (likely to kill process, easy to debug). PiperOrigin-RevId: 222357867 Change-Id: Icc1d7d731274b16c2125b7a1ba4f7883fbdb2cbd	2018-11-20 22:56:51 -08:00
Nicolas Lacasse	f894610c57	Use math.Rand to generate a random test container id. We were relying on time.UnixNano, but that was causing collisions. Now we generate 20 bytes of entropy from rand.Read, and base32-encode it to get a valid container id. PiperOrigin-RevId: 222313867 Change-Id: Iaeea9b9582d36de55f9f02f55de6a5de3f739371	2018-11-20 15:10:18 -08:00
Nicolas Lacasse	9363edcf06	Internal change. PiperOrigin-RevId: 222170431 Change-Id: I26a6d6ad5d6910a94bb8b0a05fc2d12e23098399	2018-11-20 14:04:41 -08:00
Fabricio Voznika	fadffa2ff8	Add unsupported syscall events for get/setsockopt PiperOrigin-RevId: 222148953 Change-Id: I21500a9f08939c45314a6414e0824490a973e5aa	2018-11-20 14:04:12 -08:00
Fabricio Voznika	237f9c7a5e	Don't fail when destroyContainerFS is called more than once This can happen when destroy is called multiple times or when destroy failed previously and is being called again. PiperOrigin-RevId: 221882034 Change-Id: I8d069af19cf66c4e2419bdf0d4b789c5def8d19e	2018-11-20 14:03:42 -08:00
Nicolas Lacasse	845836c578	Internal change. PiperOrigin-RevId: 221848471 Change-Id: I882fbe5ce7737048b2e1f668848e9c14ed355665	2018-11-20 14:03:11 -08:00
Nicolas Lacasse	adf8138e06	Allow sandbox.Wait to be called after the sandbox has exited. sandbox.Wait is racy, as the sandbox may have exited before it is called, or even during. We already had code to handle the case that the sandbox exits during the Wait call, but we were not properly handling the case where the sandbox has exited before the call. The best we can do in such cases is return the sandbox exit code as the application exit code. PiperOrigin-RevId: 221702517 Change-Id: I290d0333cc094c7c1c3b4ce0f17f61a3e908d787	2018-11-15 15:35:41 -08:00
Nicolas Lacasse	7b938ef78c	Internal change. PiperOrigin-RevId: 221462069 Change-Id: Id469ed21fe12e582c78340189b932989afa13c67	2018-11-14 09:58:43 -08:00
Nicolas Lacasse	40f843fc78	Internal change. PiperOrigin-RevId: 221343626 Change-Id: I03d57293a555cf4da9952a81803b9f8463173c89	2018-11-13 15:18:17 -08:00

342 Commits