Commit Graph

482 Commits

Author SHA1 Message Date
Fabricio Voznika 5f0002fc83 Use container's capabilities in exec
When no capabilities are specified in exec, use the
container's capabilities to match runc's behavior.

PiperOrigin-RevId: 211735186
Change-Id: Icd372ed64410c81144eae94f432dffc9fe3a86ce
2018-09-05 18:32:50 -07:00
Fabricio Voznika 41b56696c4 Imported FD in exec was leaking
Imported file needs to be closed after it's
been imported.

PiperOrigin-RevId: 211732472
Change-Id: Ia9249210558b77be076bcce465b832a22eed301f
2018-09-05 18:07:11 -07:00
Bert Muthalaly 5685d6b5ad Update {LinkEndpoint,NetworkEndpoint}#WritePacket to take a VectorisedView
Makes it possible to avoid copying or allocating in cases where DeliverNetworkPacket (rx)
needs to turn around and call WritePacket (tx) with its VectorisedView.

Also removes the restriction on having VectorisedViews with multiple views in the write path.

PiperOrigin-RevId: 211728717
Change-Id: Ie03a65ecb4e28bd15ebdb9c69f05eced18fdfcff
2018-09-05 17:34:25 -07:00
Tamir Duberstein fe8ca76c22 Implement Subnet removal
This was used to implement https://fuchsia-review.googlesource.com/c/garnet/+/177771.

PiperOrigin-RevId: 211725098
Change-Id: Ib0acc7c13430b7341e8e0ec6eb5fc35f5cee5083
2018-09-05 17:06:29 -07:00
Bert Muthalaly b3b66dbd1f Enable constructing a Prependable from a View without allocating.
PiperOrigin-RevId: 211722525
Change-Id: Ie73753fd09d67d6a2ce70cfe2d4ecf7275f09ce0
2018-09-05 16:47:51 -07:00
Fabricio Voznika 12aef686af Enabled bind mounts in sub-containers
With multi-gofers, bind mounts in sub-containers should
just work. Removed restrictions and added test. There are
also a few cleanups along the way, e.g. retry unmounting
in case cleanup races with gofer teardown.

PiperOrigin-RevId: 211699569
Change-Id: Ic0a69c29d7c31cd7e038909cc686c6ac98703374
2018-09-05 14:30:09 -07:00
Fabricio Voznika 0c7cfca0da Running container should have a valid sandbox
PiperOrigin-RevId: 211693868
Change-Id: Iea340dd78bf26ae6409c310b63c17cc611c2055f
2018-09-05 14:02:45 -07:00
Fabricio Voznika 4b57fd920d Add MADVISE to fsgofer seccomp profile
PiperOrigin-RevId: 211686037
Change-Id: I0e776ca760b65ba100e495f471b6e811dbd6590a
2018-09-05 13:18:06 -07:00
Fabricio Voznika 1d22d87fdc Move multi-container test to a single file
PiperOrigin-RevId: 211685288
Change-Id: I7872f2a83fcaaa54f385e6e567af6e72320c5aa0
2018-09-05 13:13:26 -07:00
Nicolas Lacasse f96b33c73c runsc: Promote getExecutablePathInternal to getExecutablePath.
Remove GetExecutablePath (the non-internal version).  This makes path handling
more consistent between exec, root, and child containers.

The new getExecutablePath now uses MountNamespace.FindInode, which is more
robust than Walking the Dirent tree ourselves.

This also removes the last use of lstat(2) in the sentry, so that can be
removed from the filters.

PiperOrigin-RevId: 211683110
Change-Id: Ic8ec960fc1c267aa7d310b8efe6e900c88a9207a
2018-09-05 13:01:21 -07:00
Tamir Duberstein bc5e18c9d1 Implement TCP keepalives
PiperOrigin-RevId: 211670620
Change-Id: Ia8a3d8ae53a7fece1dee08ee9c74964bd7f71bb7
2018-09-05 11:48:23 -07:00
Brian Geffon 2b8dae0bc5 Open(2) isn't honoring O_NOFOLLOW
PiperOrigin-RevId: 211644897
Change-Id: I882ed827a477d6c03576463ca5bf2d6351892b90
2018-09-05 09:21:28 -07:00
Nicolas Lacasse 0a9a40abcd runsc: Run sandbox as user nobody.
When starting a sandbox without direct file or network access, we create an
empty user namespace and run the sandbox in there.  However, the root user in
that namespace is still mapped to the root user in the parent namespace.

This CL maps the "nobody" user from the parent namespace into the child
namespace, and runs the sandbox process as user "nobody" inside the new
namespace.

PiperOrigin-RevId: 211572223
Change-Id: I1b1f9b1a86c0b4e7e5ca7bc93be7d4887678bab6
2018-09-04 20:33:05 -07:00
Nicolas Lacasse ad8648c634 runsc: Pass log and config files to sandbox process by FD.
This is a prereq for running the sandbox process as user "nobody", when it may
not have permissions to open these files.

Instead, we must open then before starting the sandbox process, and pass them
by FD.

The specutils.ReadSpecFromFile method was fixed to always seek to the beginning
of the file before reading. This allows Files from the same FD to be read
multiple times, as we do in the boot command when the apply-caps flag is set.

Tested with --network=host.

PiperOrigin-RevId: 211570647
Change-Id: I685be0a290aa7f70731ebdce82ebc0ebcc9d475c
2018-09-04 20:10:01 -07:00
Bhasker Hariharan 2cff07381a Automated rollback of changelist 211156845
PiperOrigin-RevId: 211525182
Change-Id: I462c20328955c77ecc7bfd8ee803ac91f15858e6
2018-09-04 14:31:52 -07:00
Lantao Liu 9ae4e28f75 runsc: fix container rootfs path.
PiperOrigin-RevId: 211515350
Change-Id: Ia495af57447c799909aa97bb873a50b87bee2625
2018-09-04 13:37:40 -07:00
Michael Pratt 3944cb41cb /proc/PID/mounts is not tab-delimited
PiperOrigin-RevId: 211513847
Change-Id: Ib484dd2d921c3e5d70d0e410cd973d3bff4f6b73
2018-09-04 13:29:49 -07:00
Michael Pratt ab7174611c Remove epoll_wait from filters
Go 1.11 replaced it with epoll_pwait.

PiperOrigin-RevId: 211510006
Change-Id: I48a6cae95ed3d57a4633895358ad05ad8bf2f633
2018-09-04 13:10:09 -07:00
Tamir Duberstein 3794cb6bff Expose TCP RTT
PiperOrigin-RevId: 211504634
Change-Id: I9a7bcbbdd40e5036894930f709278725ef477293
2018-09-04 12:39:47 -07:00
Adin Scannell c09f9acd7c Distinguish Element and Linker for ilist.
Furthermore, allow for the specification of an ElementMapper. This allows a
single "Element" type to exist on multiple inline lists, and work without
having to embed the entry type.

This is a requisite change for supporting a per-Inode list of Dirents.

PiperOrigin-RevId: 211467497
Change-Id: If2768999b43e03fdaecf8ed15f435fe37518d163
2018-09-04 09:19:11 -07:00
Fabricio Voznika 66c03b3dd7 Mounting over '/tmp' may fail
PiperOrigin-RevId: 211160120
Change-Id: Ie5f280bdac17afd01cb16562ffff6222b3184c34
2018-08-31 16:12:08 -07:00
Googler f0d8817654 Automated rollback of changelist 211103930
PiperOrigin-RevId: 211156845
Change-Id: Ie28011d7eb5f45f3a0158dbee2a68c5edf22f6e0
2018-08-31 15:48:50 -07:00
Jamie Liu f8ccfbbed4 Document more task-goroutine-owned fields in kernel.Task.
Task.creds can only be changed by the task's own set*id and execve
syscalls, and Task namespaces can only be changed by the task's own
unshare/setns syscalls.

PiperOrigin-RevId: 211156279
Change-Id: I94d57105d34e8739d964400995a8a5d76306b2a0
2018-08-31 15:44:40 -07:00
Fabricio Voznika 7713e2cb75 Remove not used deps
PiperOrigin-RevId: 211147521
Change-Id: I9b8b67df50a3ba084c07a48c72a874d7e2007f23
2018-08-31 14:47:46 -07:00
Jamie Liu b935311e23 Do not use fs.FileOwnerFromContext in fs/proc.file.UnstableAttr().
From //pkg/sentry/context/context.go:

// - It is *not safe* to retain a Context passed to a function beyond the scope
// of that function call.

Passing a stored kernel.Task as a context.Context to
fs.FileOwnerFromContext violates this requirement.

PiperOrigin-RevId: 211143021
Change-Id: I4c5b02bd941407be4c9cfdbcbdfe5a26acaec037
2018-08-31 14:17:56 -07:00
Jamie Liu 098046ba19 Disintegrate kernel.TaskResources.
This allows us to call kernel.FDMap.DecRef without holding mutexes
cleanly.

PiperOrigin-RevId: 211139657
Change-Id: Ie59d5210fb9282e1950e2e40323df7264a01bcec
2018-08-31 13:58:04 -07:00
Jamie Liu b1c1afa3cc Delete the long-obsolete kernel.TaskMaybe interface.
PiperOrigin-RevId: 211131855
Change-Id: Ia7799561ccd65d16269e0ae6f408ab53749bca37
2018-08-31 13:07:34 -07:00
Fabricio Voznika 7e18f158b2 Automated rollback of changelist 210995199
PiperOrigin-RevId: 211116429
Change-Id: I446d149c822177dc9fc3c64ce5e455f7f029aa82
2018-08-31 11:30:47 -07:00
Lantao Liu be9f454eb6 runsc: Set volume mount rslave.
PiperOrigin-RevId: 211111376
Change-Id: I27b8cb4e070d476fa4781ed6ecfa0cf1dcaf85f5
2018-08-31 11:03:22 -07:00
Tamir Duberstein 625edb9f28 ipv6: ICMP support
This CL does NDP link-address discovery for IPv6.

It includes several small changes necessary to get linux to talk to
this implementation. In particular, a hop limit of 255 is necessary
for ICMPv6.

PiperOrigin-RevId: 211103930
Change-Id: If25370ab84c6b1decfb15de917f3b0020f2c4e0e
2018-08-31 10:23:32 -07:00
Michael Pratt 08bfb5643c Add other missing dep
runsc and runsc-race need the same deps.

PiperOrigin-RevId: 211103766
Change-Id: Ib0c97078a469656c1e5b019648589a1d07915625
2018-08-31 10:22:09 -07:00
Fabricio Voznika e669697241 Fix RunAsRoot arguments forwarding
It was including the path to the executable twice in the
arguments.

PiperOrigin-RevId: 211098311
Change-Id: I5357c51c63f38dfab551b17bb0e04011a0575010
2018-08-31 09:45:32 -07:00
Tamir Duberstein 3f04bd68b2 Add missing import
GoCompile: missing strict dependencies:
	/tmpfs/tmp/bazel/sandbox/linux-sandbox/1744/execroot/__main__/runsc/main.go:
	import of "gvisor.googlesource.com/gvisor/runsc/specutils"

This was broken in 210995199.

PiperOrigin-RevId: 211086595
Change-Id: I166b9a2ed8e4d6e624def944b720190940d7537c
2018-08-31 08:07:52 -07:00
Fabricio Voznika 3e493adf7a Add seccomp filter to fsgofer
PiperOrigin-RevId: 211011542
Change-Id: Ib5a83a00f8eb6401603c6fb5b59afc93bac52558
2018-08-30 17:30:19 -07:00
Nicolas Lacasse 5ade9350ad runsc: Pass log and config files to sandbox process by FD.
This is a prereq for running the sandbox process as user "nobody", when it may
not have permissions to open these files.

Instead, we must open then before starting the sandbox process, and pass them
by FD.

PiperOrigin-RevId: 210995199
Change-Id: I715875a9553290b4a49394a8fcd93be78b1933dd
2018-08-30 15:47:18 -07:00
Nicolas Lacasse 8bfb5fa919 fs: Add empty dir at /sys/class/power_supply.
PiperOrigin-RevId: 210953512
Change-Id: I07d2d7fb0d268aa8eca26d81ef28b5b5c42289ee
2018-08-30 12:01:27 -07:00
Ian Gudger 313d4af52d ping: update comment about UDP
PiperOrigin-RevId: 210788012
Change-Id: I5ebdcf3d02bfab3484a1374fbccba870c9d68954
2018-08-29 14:15:58 -07:00
Nicolas Lacasse 956fe64ad6 fs: Fix renameMu lock recursion.
dirent.walk() takes renameMu, but is often called with renameMu already held,
which can lead to a deadlock.

Fix this by requiring renameMu to be held for reading when dirent.walk() is
called. This causes walks and existence checks to block while a rename
operation takes place, but that is what we were already trying to enforce by
taking renameMu in walk() anyways.

PiperOrigin-RevId: 210760780
Change-Id: Id61018e6e4adbeac53b9c1b3aa24ab77f75d8a54
2018-08-29 11:47:01 -07:00
Nicolas Lacasse 1893247616 fs: Drop reference to over-written file before renaming over it.
dirent.go:Rename() walks to the file being replaced and defers
replaced.DecRef(). After the rename, the reference is dropped, triggering a
writeout and SettAttr call to the gofer. Because of lazyOpenForWrite, the gofer
opens the replaced file BY ITS OLD NAME and calls ftruncate on it.

This CL changes Remove to drop the reference on replaced (and thus trigger
writeout) before the actual rename call.

PiperOrigin-RevId: 210756097
Change-Id: I01ea09a5ee6c2e2d464560362f09943641638e0f
2018-08-29 11:22:27 -07:00
Ian Gudger 52e6714146 fasync: don't keep mutex after return
PiperOrigin-RevId: 210637533
Change-Id: I3536c3f9efb54732a0d8ada8bc299142b2c1682f
2018-08-28 17:26:26 -07:00
Fabricio Voznika 30c025f3ef Add argument checks to seccomp
This is required to increase protection when running in GKE.

PiperOrigin-RevId: 210635123
Change-Id: Iaaa8be49e73f7a3a90805313885e75894416f0b5
2018-08-28 17:10:03 -07:00
Nicolas Lacasse 3b11769c77 fs: Don't bother saving negative dirents.
PiperOrigin-RevId: 210616454
Change-Id: I3f536e2b4d603e540cdd9a67c61b8ec3351f4ac3
2018-08-28 15:18:42 -07:00
Nicolas Lacasse 515d9bf43b fs: Add tests for dirent ref counting with an overlay.
PiperOrigin-RevId: 210614669
Change-Id: I408365ff6d6c7765ed7b789446d30e7079cbfc67
2018-08-28 15:09:17 -07:00
Zhaozhong Ni d724863a31 sentry: optimize dirent weakref map save / restore.
Weak references save / restore involves multiple interface indirection
and cause material latency overhead when there are lots of dirents, each
containing a weak reference map. The nil entries in the map should also
be purged.

PiperOrigin-RevId: 210593727
Change-Id: Ied6f4c3c0726fcc53a24b983d9b3a79121b6b758
2018-08-28 13:22:07 -07:00
Michael Pratt ea113a4380 Drop support for Go 1.10
PiperOrigin-RevId: 210589588
Change-Id: Iba898bc3eb8f13e17c668ceea6dc820fc8180a70
2018-08-28 12:56:28 -07:00
Lantao Liu d8f0db9bcf runsc: unmount volume mounts when destroy container.
PiperOrigin-RevId: 210579178
Change-Id: Iae20639c5186b1a976cbff6d05bda134cd00d0da
2018-08-28 11:54:07 -07:00
Fabricio Voznika f7366e4e64 Consolidate image tests into a single file
This is to keep it consistent with other test, and
it's easier to maintain them in single file.
Also increase python test timeout to deflake it.

PiperOrigin-RevId: 210575042
Change-Id: I2ef5bcd5d97c08549f0c5f645c4b694253ef0b4d
2018-08-28 11:31:04 -07:00
Michael Pratt 25a8e13a78 Bump to Go 1.11
The procid offset is unchanged.

PiperOrigin-RevId: 210551969
Change-Id: I33ba1ce56c2f5631b712417d870aa65ef24e6022
2018-08-28 09:22:41 -07:00
Zhaozhong Ni d08ccdaaad sentry: avoid double counting map objects in save / restore stats.
PiperOrigin-RevId: 210551929
Change-Id: Idd05935bffc63b39166cc3751139aff61b689faa
2018-08-28 09:21:16 -07:00
Fabricio Voznika ae648bafda Add command-line parameter to trigger panic on signal
This is to troubleshoot problems with a hung process that is
not responding to 'runsc debug --stack' command.

PiperOrigin-RevId: 210483513
Change-Id: I4377b210b4e51bc8a281ad34fd94f3df13d9187d
2018-08-27 20:36:10 -07:00