Commit Graph

189 Commits

Author SHA1 Message Date
Fabricio Voznika 26b08e182c Rename container in test
's' used to stand for sandbox, before container exited.

PiperOrigin-RevId: 213390641
Change-Id: I7bda94a50398c46721baa92227e32a7a1d817412
2018-09-17 21:18:27 -07:00
Kevin Krakauer bb88c187c5 runsc: Enable waiting on exited processes.
This makes `runsc wait` behave more like waitpid()/wait4() in that:
- Once a process has run to completion, you can wait on it and get its exit
  code.
- Processes not waited on will consume memory (like a zombie process)

PiperOrigin-RevId: 213358916
Change-Id: I5b5eca41ce71eea68e447380df8c38361a4d1558
2018-09-17 16:25:24 -07:00
Kevin Krakauer 25add7b22b runsc: Fix stdin/out/err in multi-container mode.
Stdin/out/err weren't being sent to the sentry.

PiperOrigin-RevId: 213307171
Change-Id: Ie4b634a58b1b69aa934ce8597e5cc7a47a2bcda2
2018-09-17 11:31:28 -07:00
Lantao Liu bde2a91433 runsc: Support container signal/wait.
This CL:
1) Fix `runsc wait`, it now also works after the container exits;
2) Generate correct container state in Load;
2) Make sure `Destory` cleanup everything before successfully return.

PiperOrigin-RevId: 212900107
Change-Id: Ie129cbb9d74f8151a18364f1fc0b2603eac4109a
2018-09-13 16:38:03 -07:00
Kevin Krakauer 2eff1fdd06 runsc: Add exec flag that specifies where to save the sandbox-internal pid.
This is different from the existing -pid-file flag, which saves a host pid.

PiperOrigin-RevId: 212713968
Change-Id: I2c486de8dd5cfd9b923fb0970165ef7c5fc597f0
2018-09-12 15:23:35 -07:00
Michael Pratt 0efde2bfbd Remove getdents from filters
It was only used by whitelistfs, which was removed in
bc81f3fe4a.

PiperOrigin-RevId: 212666374
Change-Id: Ia35e6dc9d68c1a3b015d5b5f71ea3e68e46c5bed
2018-09-12 10:51:25 -07:00
Michael Pratt b4aed01bf2 Rollback of changelist 212483372
PiperOrigin-RevId: 212557844
Change-Id: I414de848e75d57ecee2c05e851d05b607db4aa57
2018-09-11 17:54:50 -07:00
Nicolas Lacasse 6cc9b311af platform: Pass device fd into platform constructor.
We were previously openining the platform device (i.e. /dev/kvm) inside the
platfrom constructor (i.e. kvm.New).  This requires that we have RW access to
the platform device when constructing the platform.

However, now that the runsc sandbox process runs as user "nobody", it is not
able to open the platform device.

This CL changes the kvm constructor to take the platform device FD, rather than
opening the device file itself. The device file is opened outside of the
sandbox and passed to the sandbox process.

PiperOrigin-RevId: 212505804
Change-Id: I427e1d9de5eb84c84f19d513356e1bb148a52910
2018-09-11 13:09:46 -07:00
Fabricio Voznika c44bc6612f Allow fstatat back in syscall filters
PiperOrigin-RevId: 212483372
Change-Id: If95f32a8e41126cf3dc8bd6c8b2fb0fcfefedc6d
2018-09-11 11:05:09 -07:00
Nicolas Lacasse e198f9ab02 runsc: Chmod all mounted files to 777 inside chroot.
Inside the chroot, we run as user nobody, so all mounted files and directories
must be accessible to all users.

PiperOrigin-RevId: 212284805
Change-Id: I705e0dbbf15e01e04e0c7f378a99daffe6866807
2018-09-10 10:00:16 -07:00
Nicolas Lacasse 0c0c942327 Automated rollback of changelist 212059579
PiperOrigin-RevId: 212069131
Change-Id: I01476f957bbf29d4ee5a3c11d59d4f863ba9f2df
2018-09-07 18:23:27 -07:00
Nicolas Lacasse 922d8c3c8c Automated rollback of changelist 211992321
PiperOrigin-RevId: 212066419
Change-Id: Icded56e7e117bfd9b644e6541bddcd110460a9b8
2018-09-07 17:56:07 -07:00
Nicolas Lacasse 9751b800a6 runsc: Support multi-container exec.
We must use a context.Context with a Root Dirent that corresponds to the
container's chroot. Previously we were using the root context, which does not
have a chroot.

Getting the correct context required refactoring some of the path-lookup code.
We can't lookup the path without a context.Context, which requires
kernel.CreateProcArgs, which we only get inside control.Execute.  So we have to
do the path lookup much later than we previously were.

PiperOrigin-RevId: 212064734
Change-Id: I84a5cfadacb21fd9c3ab9c393f7e308a40b9b537
2018-09-07 17:39:54 -07:00
Fabricio Voznika cf5006ff24 Disable test until we figure out what's broken
PiperOrigin-RevId: 212059579
Change-Id: I052c2192d3483d7bd0fd2232ef2023a12da66446
2018-09-07 17:00:41 -07:00
Adin Scannell 6cfb5cd56d Add additional sanity checks for walk.
PiperOrigin-RevId: 212058684
Change-Id: I319709b9ffcfccb3231bac98df345d2a20eca24b
2018-09-07 16:53:12 -07:00
Fabricio Voznika 8ce3fbf9f8 Only start signal forwarding after init process is created
PiperOrigin-RevId: 212028121
Change-Id: If9c2c62f3be103e2bb556b8d154c169888e34369
2018-09-07 13:39:12 -07:00
Fabricio Voznika bc81f3fe4a Remove '--file-access=direct' option
It was used before gofer was implemented and it's not
supported anymore.
BREAKING CHANGE: proxy-shared and proxy-exclusive options
are now: shared and exclusive.

PiperOrigin-RevId: 212017643
Change-Id: If029d4073fe60583e5ca25f98abb2953de0d78fd
2018-09-07 12:28:48 -07:00
Fabricio Voznika f895cb4d8b Use root abstract socket namespace for exec
PiperOrigin-RevId: 211999211
Change-Id: I5968dd1a8313d3e49bb6e6614e130107495de41d
2018-09-07 10:45:55 -07:00
Nicolas Lacasse 210c252089 runsc: Run sandbox process inside minimal chroot.
We construct a dir with the executable bind-mounted at /exe, and proc mounted
at /proc.  Runsc now executes the sandbox process inside this chroot, thus
limiting access to the host filesystem.  The mounts and chroot dir are removed
when the sandbox is destroyed.

Because this requires bind-mounts, we can only do the chroot if we have
CAP_SYS_ADMIN.

PiperOrigin-RevId: 211994001
Change-Id: Ia71c515e26085e0b69b833e71691830148bc70d1
2018-09-07 10:16:39 -07:00
Nicolas Lacasse 590d832099 runsc: Dup debug log file to stderr, so sentry panics don't get lost.
Docker and containerd do not expose runsc's stderr, so tracking down sentry
panics can be painful.

If we have a debug log file, we should send panics (and all stderr data) to the
log file.

PiperOrigin-RevId: 211992321
Change-Id: I5f0d2f45f35c110a38dab86bafc695aaba42f7a3
2018-09-07 10:05:21 -07:00
Lantao Liu 4f3053cb4e runsc: do not delete in paused state.
PiperOrigin-RevId: 211835570
Change-Id: Ied7933732cad5bc60b762e9c964986cb49a8d9b9
2018-09-06 11:06:19 -07:00
Fabricio Voznika efac28976c Enable network for multi-container
PiperOrigin-RevId: 211834411
Change-Id: I52311a6c5407f984e5069359d9444027084e4d2a
2018-09-06 11:00:08 -07:00
Kevin Krakauer d95663a6b9 runsc testing: Move TestMultiContainerSignal to multi_container_test.
PiperOrigin-RevId: 211831396
Change-Id: Id67f182cb43dccb696180ec967f5b96176f252e0
2018-09-06 10:41:55 -07:00
Kevin Krakauer 8f0b6e7fc0 runsc: Support runsc kill multi-container.
Now, we can kill individual containers rather than the entire sandbox.

PiperOrigin-RevId: 211748106
Change-Id: Ic97e91db33d53782f838338c4a6d0aab7a313ead
2018-09-05 21:14:56 -07:00
Fabricio Voznika 5f0002fc83 Use container's capabilities in exec
When no capabilities are specified in exec, use the
container's capabilities to match runc's behavior.

PiperOrigin-RevId: 211735186
Change-Id: Icd372ed64410c81144eae94f432dffc9fe3a86ce
2018-09-05 18:32:50 -07:00
Fabricio Voznika 12aef686af Enabled bind mounts in sub-containers
With multi-gofers, bind mounts in sub-containers should
just work. Removed restrictions and added test. There are
also a few cleanups along the way, e.g. retry unmounting
in case cleanup races with gofer teardown.

PiperOrigin-RevId: 211699569
Change-Id: Ic0a69c29d7c31cd7e038909cc686c6ac98703374
2018-09-05 14:30:09 -07:00
Fabricio Voznika 0c7cfca0da Running container should have a valid sandbox
PiperOrigin-RevId: 211693868
Change-Id: Iea340dd78bf26ae6409c310b63c17cc611c2055f
2018-09-05 14:02:45 -07:00
Fabricio Voznika 4b57fd920d Add MADVISE to fsgofer seccomp profile
PiperOrigin-RevId: 211686037
Change-Id: I0e776ca760b65ba100e495f471b6e811dbd6590a
2018-09-05 13:18:06 -07:00
Fabricio Voznika 1d22d87fdc Move multi-container test to a single file
PiperOrigin-RevId: 211685288
Change-Id: I7872f2a83fcaaa54f385e6e567af6e72320c5aa0
2018-09-05 13:13:26 -07:00
Nicolas Lacasse f96b33c73c runsc: Promote getExecutablePathInternal to getExecutablePath.
Remove GetExecutablePath (the non-internal version).  This makes path handling
more consistent between exec, root, and child containers.

The new getExecutablePath now uses MountNamespace.FindInode, which is more
robust than Walking the Dirent tree ourselves.

This also removes the last use of lstat(2) in the sentry, so that can be
removed from the filters.

PiperOrigin-RevId: 211683110
Change-Id: Ic8ec960fc1c267aa7d310b8efe6e900c88a9207a
2018-09-05 13:01:21 -07:00
Nicolas Lacasse 0a9a40abcd runsc: Run sandbox as user nobody.
When starting a sandbox without direct file or network access, we create an
empty user namespace and run the sandbox in there.  However, the root user in
that namespace is still mapped to the root user in the parent namespace.

This CL maps the "nobody" user from the parent namespace into the child
namespace, and runs the sandbox process as user "nobody" inside the new
namespace.

PiperOrigin-RevId: 211572223
Change-Id: I1b1f9b1a86c0b4e7e5ca7bc93be7d4887678bab6
2018-09-04 20:33:05 -07:00
Nicolas Lacasse ad8648c634 runsc: Pass log and config files to sandbox process by FD.
This is a prereq for running the sandbox process as user "nobody", when it may
not have permissions to open these files.

Instead, we must open then before starting the sandbox process, and pass them
by FD.

The specutils.ReadSpecFromFile method was fixed to always seek to the beginning
of the file before reading. This allows Files from the same FD to be read
multiple times, as we do in the boot command when the apply-caps flag is set.

Tested with --network=host.

PiperOrigin-RevId: 211570647
Change-Id: I685be0a290aa7f70731ebdce82ebc0ebcc9d475c
2018-09-04 20:10:01 -07:00
Lantao Liu 9ae4e28f75 runsc: fix container rootfs path.
PiperOrigin-RevId: 211515350
Change-Id: Ia495af57447c799909aa97bb873a50b87bee2625
2018-09-04 13:37:40 -07:00
Michael Pratt ab7174611c Remove epoll_wait from filters
Go 1.11 replaced it with epoll_pwait.

PiperOrigin-RevId: 211510006
Change-Id: I48a6cae95ed3d57a4633895358ad05ad8bf2f633
2018-09-04 13:10:09 -07:00
Fabricio Voznika 66c03b3dd7 Mounting over '/tmp' may fail
PiperOrigin-RevId: 211160120
Change-Id: Ie5f280bdac17afd01cb16562ffff6222b3184c34
2018-08-31 16:12:08 -07:00
Fabricio Voznika 7713e2cb75 Remove not used deps
PiperOrigin-RevId: 211147521
Change-Id: I9b8b67df50a3ba084c07a48c72a874d7e2007f23
2018-08-31 14:47:46 -07:00
Fabricio Voznika 7e18f158b2 Automated rollback of changelist 210995199
PiperOrigin-RevId: 211116429
Change-Id: I446d149c822177dc9fc3c64ce5e455f7f029aa82
2018-08-31 11:30:47 -07:00
Lantao Liu be9f454eb6 runsc: Set volume mount rslave.
PiperOrigin-RevId: 211111376
Change-Id: I27b8cb4e070d476fa4781ed6ecfa0cf1dcaf85f5
2018-08-31 11:03:22 -07:00
Michael Pratt 08bfb5643c Add other missing dep
runsc and runsc-race need the same deps.

PiperOrigin-RevId: 211103766
Change-Id: Ib0c97078a469656c1e5b019648589a1d07915625
2018-08-31 10:22:09 -07:00
Fabricio Voznika e669697241 Fix RunAsRoot arguments forwarding
It was including the path to the executable twice in the
arguments.

PiperOrigin-RevId: 211098311
Change-Id: I5357c51c63f38dfab551b17bb0e04011a0575010
2018-08-31 09:45:32 -07:00
Tamir Duberstein 3f04bd68b2 Add missing import
GoCompile: missing strict dependencies:
	/tmpfs/tmp/bazel/sandbox/linux-sandbox/1744/execroot/__main__/runsc/main.go:
	import of "gvisor.googlesource.com/gvisor/runsc/specutils"

This was broken in 210995199.

PiperOrigin-RevId: 211086595
Change-Id: I166b9a2ed8e4d6e624def944b720190940d7537c
2018-08-31 08:07:52 -07:00
Fabricio Voznika 3e493adf7a Add seccomp filter to fsgofer
PiperOrigin-RevId: 211011542
Change-Id: Ib5a83a00f8eb6401603c6fb5b59afc93bac52558
2018-08-30 17:30:19 -07:00
Nicolas Lacasse 5ade9350ad runsc: Pass log and config files to sandbox process by FD.
This is a prereq for running the sandbox process as user "nobody", when it may
not have permissions to open these files.

Instead, we must open then before starting the sandbox process, and pass them
by FD.

PiperOrigin-RevId: 210995199
Change-Id: I715875a9553290b4a49394a8fcd93be78b1933dd
2018-08-30 15:47:18 -07:00
Fabricio Voznika 30c025f3ef Add argument checks to seccomp
This is required to increase protection when running in GKE.

PiperOrigin-RevId: 210635123
Change-Id: Iaaa8be49e73f7a3a90805313885e75894416f0b5
2018-08-28 17:10:03 -07:00
Michael Pratt ea113a4380 Drop support for Go 1.10
PiperOrigin-RevId: 210589588
Change-Id: Iba898bc3eb8f13e17c668ceea6dc820fc8180a70
2018-08-28 12:56:28 -07:00
Lantao Liu d8f0db9bcf runsc: unmount volume mounts when destroy container.
PiperOrigin-RevId: 210579178
Change-Id: Iae20639c5186b1a976cbff6d05bda134cd00d0da
2018-08-28 11:54:07 -07:00
Fabricio Voznika f7366e4e64 Consolidate image tests into a single file
This is to keep it consistent with other test, and
it's easier to maintain them in single file.
Also increase python test timeout to deflake it.

PiperOrigin-RevId: 210575042
Change-Id: I2ef5bcd5d97c08549f0c5f645c4b694253ef0b4d
2018-08-28 11:31:04 -07:00
Fabricio Voznika ae648bafda Add command-line parameter to trigger panic on signal
This is to troubleshoot problems with a hung process that is
not responding to 'runsc debug --stack' command.

PiperOrigin-RevId: 210483513
Change-Id: I4377b210b4e51bc8a281ad34fd94f3df13d9187d
2018-08-27 20:36:10 -07:00
Kevin Krakauer a4529c1b5b runsc: Fix readonly filesystem causing failure to create containers.
For readonly filesystems specified via relative path, we were forgetting to
mount relative to the container's bundle directory.

PiperOrigin-RevId: 210483388
Change-Id: I84809fce4b1f2056d0e225547cb611add5f74177
2018-08-27 20:34:27 -07:00
Nicolas Lacasse 0b3bfe2ea3 fs: Fix remote-revalidate cache policy.
When revalidating a Dirent, if the inode id is the same, then we don't need to
throw away the entire Dirent. We can just update the unstable attributes in
place.

If the inode id has changed, then the remote file has been deleted or moved,
and we have no choice but to throw away the dirent we have a look up another.
In this case, we may still end up losing a mounted dirent that is a child of
the revalidated dirent. However, that seems appropriate here because the entire
mount point has been pulled out from underneath us.

Because gVisor's overlay is at the Inode level rather than the Dirent level, we
must pass the parent Inode and name along with the Inode that is being
revalidated.

PiperOrigin-RevId: 210431270
Change-Id: I705caef9c68900234972d5aac4ae3a78c61c7d42
2018-08-27 14:26:29 -07:00