Commit Graph

74 Commits

Author SHA1 Message Date
gVisor bot 013d79d8e4 Merge pull request #4420 from workato:dev-options
PiperOrigin-RevId: 339363816
2020-10-27 17:22:26 -07:00
Konstantin Baranov 2b72da8bf9 Allow overriding mount options for /dev and /dev/pts
This is useful to optionally set /dev ro,noexec.

Treat /dev and /dev/pts the same as /proc and /sys.
Make sure the Type is right though. Many config.json snippets
on the Internet suggest /dev is tmpfs, not devtmpfs.
2020-10-26 18:02:52 -07:00
Fabricio Voznika 3ed8ace871 Fix nogo errors in specutils
PiperOrigin-RevId: 338780793
2020-10-23 18:35:45 -07:00
Howard Zhang ae1141778e fix seccomp test for ARM64
As open syscall is not support on ARM64, change syscall
from 'open' to 'openat' in no_match_name_allow

Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-09-25 14:49:13 +08:00
Ian Lewis dcd532e2e4 Add support for OCI seccomp filters in the sandbox.
OCI configuration includes support for specifying seccomp filters. In runc,
these filter configurations are converted into seccomp BPF programs and loaded
into the kernel via libseccomp. runsc needs to be a static binary so, for
runsc, we cannot rely on a C library and need to implement the functionality
in Go.

The generator added here implements basic support for taking OCI seccomp
configuration and converting it into a seccomp BPF program with the same
behavior as a program generated by libseccomp.

- New conditional operations were added to pkg/seccomp to support operations
  available in OCI.
- AllowAny and AllowValue were renamed to MatchAny and EqualTo to better reflect
  that syscalls matching the conditionals result in the provided action not
  simply SCMP_RET_ALLOW.
- BuildProgram in pkg/seccomp no longer panics if provided an empty list of
  rules. It now builds a program with the architecture sanity check only.
- ProgramBuilder now allows adding labels that are unused. However, backwards
  jumps are still not permitted.

Fixes #510

PiperOrigin-RevId: 331938697
2020-09-15 23:19:17 -07:00
Fabricio Voznika 71589b7f7e Let flags be overriden from OCI annotations
This allows runsc flags to be set per sandbox instance. For
example, K8s pod annotations can be used to enable
--debug for a single pod, making troubleshoot much easier.
Similarly, features like --vfs2 can be enabled for
experimentation without affecting other pods in the node.

Closes #3494

PiperOrigin-RevId: 329542815
2020-09-01 11:12:19 -07:00
gVisor bot c81ac8ec3b Merge pull request #2672 from amscanne:shim-integrated
PiperOrigin-RevId: 321053634
2020-07-13 16:10:58 -07:00
Fabricio Voznika bae1475603 Print spec as json when --debug is enabled
The previous format skipped many important structs that
are pointers, especially for cgroups. Change to print
as json, removing parts of the spec that are not relevant.

Also removed debug message from gofer that can be very
noisy when directories are large.

PiperOrigin-RevId: 316713267
2020-06-16 10:52:17 -07:00
Fabricio Voznika ca5912d13c More runsc changes for VFS2
- Add /tmp handling
- Apply mount options
- Enable more container_test tests
- Forward signals to child process when test respaws process
  to run as root inside namespace.

Updates #1487

PiperOrigin-RevId: 314263281
2020-06-01 21:32:09 -07:00
Fabricio Voznika f7418e2159 Move Cleanup to its own package
PiperOrigin-RevId: 313663382
2020-05-28 14:49:06 -07:00
Fabricio Voznika a8c1b32660 Automated rollback of changelist 309082540
PiperOrigin-RevId: 313636920
2020-05-28 12:25:57 -07:00
moricho fc53d64367 refactor and add test for bindmount
Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-26 17:24:34 +09:00
Ian Lewis daf3322498 Add logging message for noNewPrivileges OCI option.
noNewPrivileges is ignored if set to false since gVisor assumes that
PR_SET_NO_NEW_PRIVS is always enabled.

PiperOrigin-RevId: 305991947
2020-04-10 20:32:23 -07:00
Ian Lewis 56054fc1fb Add friendlier messages for frequently encountered errors.
Issue #2270
Issue #1765

PiperOrigin-RevId: 305385436
2020-04-07 18:51:01 -07:00
Fabricio Voznika f2e4b5ab93 Kill sandbox process when parent process terminates
When the sandbox runs in attached more, e.g. runsc do, runsc run, the
sandbox lifetime is controlled by the parent process. This wasn't working
in all cases because PR_GET_PDEATHSIG doesn't propagate through execve
when the process changes uid/gid. So it was getting dropped when the
sandbox execve's to change to user nobody.

PiperOrigin-RevId: 300601247
2020-03-12 12:32:26 -07:00
Adin Scannell d29e59af9f Standardize on tools directory.
PiperOrigin-RevId: 291745021
2020-01-27 12:21:00 -08:00
gVisor bot 6122b413f1 Merge pull request #1046 from tomlanyon:crio
PiperOrigin-RevId: 276172466
2019-10-22 17:05:04 -07:00
Tom Lanyon 7e8b5f4a3a Add runsc OCI annotations to support CRI-O.
Obligatory https://xkcd.com/927

Fixes #626
2019-10-20 21:11:01 +11:00
Ian Lewis 065339193e Update TODO for OCI seccomp support.
PiperOrigin-RevId: 274042343
2019-10-10 14:43:03 -07:00
gVisor bot 90e908f419 Merge pull request #917 from KentaTada:fix-clone-flags
PiperOrigin-RevId: 272262368
2019-10-01 12:06:30 -07:00
Fabricio Voznika 0b02c3d5e5 Prevent CAP_NET_RAW from appearing in exec
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out from when copying container's
capabilities.

PiperOrigin-RevId: 272260451
2019-10-01 11:49:49 -07:00
Kenta Tada 69f3c79b24 runsc: add the clone flag of cgroup namespace
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
2019-09-26 12:02:01 +09:00
Fabricio Voznika 010b093258 Bring back to life features lost in recent refactor
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker

PiperOrigin-RevId: 269337274
2019-09-16 08:17:00 -07:00
Ian Lewis 0bfffbcb01 Ignore the root container when calculating oom_score_adj for the sandbox.
This is done because the root container for CRI is the infrastructure (pause)
container and always gets a low oom_score_adj. We do this to ensure that only
the oom_score_adj of user containers is used to calculated the sandbox
oom_score_adj.

Implemented in runsc rather than the containerd shim as it's a bit cleaner to
implement here (in the shim it would require overwriting the oomScoreAdj and
re-writing out the config.json again). This processing is Kubernetes(CRI)
specific but we are currently only supporting CRI for multi-container support
anyway.

PiperOrigin-RevId: 267507706
2019-09-05 19:21:25 -07:00
Fabricio Voznika b461be88a8 Stops container if gofer is killed
Each gofer now has a goroutine that polls on the FDs used
to communicate with the sandbox. The respective gofer is
destroyed if any of the FDs is closed.

Closes #601

PiperOrigin-RevId: 261383725
2019-08-02 13:47:55 -07:00
Michael Pratt 5b41ba5d0e Fix various spelling issues in the documentation
Addresses obvious typos, in the documentation only.

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
2019-06-27 14:25:50 -07:00
Adin Scannell add40fd6ad Update canonical repository.
This can be merged after:
https://github.com/google/gvisor-website/pull/77
  or
https://github.com/google/gvisor-website/pull/78

PiperOrigin-RevId: 253132620
2019-06-13 16:50:15 -07:00
Fabricio Voznika 356d1be140 Allow 'runsc do' to run without root
'--rootless' flag lets a non-root user execute 'runsc do'.
The drawback is that the sandbox and gofer processes will
run as root inside a user namespace that is mapped to the
caller's user, intead of nobody. And network is defaulted
to '--network=host' inside the root network namespace. On
the bright side, it's very convenient for testing:

runsc --rootless do ls
runsc --rootless do curl www.google.com

PiperOrigin-RevId: 252840970
2019-06-12 09:41:50 -07:00
Fabricio Voznika fc746efa9a Add support to mount pod shared tmpfs mounts
Parse annotations containing 'gvisor.dev/spec/mount' that gives
hints about how mounts are shared between containers inside a
pod. This information can be used to better inform how to mount
these volumes inside gVisor. For example, a volume that is shared
between containers inside a pod can be bind mounted inside the
sandbox, instead of being two independent mounts.

For now, this information is used to allow the same tmpfs mounts
to be shared between containers which wasn't possible before.

PiperOrigin-RevId: 252704037
2019-06-11 14:54:31 -07:00
Liu Hua fc9f7e3590 tiny fix: avoid panicing when OpenSpec failed
Signed-off-by: Liu Hua <sdu.liu@huawei.com>
Change-Id: I11a4620394a10a7d92036b0341e0c21ad50bd122
PiperOrigin-RevId: 248621810
2019-05-16 16:20:42 -07:00
Michael Pratt 4d52a55201 Change copyright notice to "The gVisor Authors"
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.

1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.

Fixes #209

PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29 14:26:23 -07:00
Nicolas Lacasse f4ce43e1f4 Allow and document bug ids in gVisor codebase.
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-29 14:04:14 -07:00
Kevin Krakauer 43dff57b87 Make raw sockets a toggleable feature disabled by default.
PiperOrigin-RevId: 245511019
Change-Id: Ia9562a301b46458988a6a1f0bbd5f07cbfcb0615
2019-04-26 16:51:46 -07:00
Andrei Vagin 88409e983c gvisor: Add support for the MS_NOEXEC mount option
https://github.com/google/gvisor/issues/145

PiperOrigin-RevId: 242044115
Change-Id: I8f140fe05e32ecd438b6be218e224e4b7fe05878
2019-04-04 17:43:53 -07:00
Adin Scannell 7543e9ec20 Add release hook and version flag
PiperOrigin-RevId: 241421671
Change-Id: Ic0cebfe3efd458dc42c49f7f812c13318705199a
2019-04-01 16:18:43 -07:00
Fabricio Voznika c7877b0a14 Fail in case mount option is unknown
PiperOrigin-RevId: 239425816
Change-Id: I3b1479c61b4222c3931a416c4efc909157044330
2019-03-20 10:36:20 -07:00
Fabricio Voznika e420cc3e5d Add support for mount propagation
Properly handle propagation options for root and mounts. Now usage of
mount options shared, rshared, and noexec cause error to start. shared/
rshared breaks sandbox=>host isolation. slave however can be supported
because changes propagate from host to sandbox.

Root FS setup moved inside the gofer. Apart from simplifying the code,
it keeps all mounts inside the namespace. And they are torn down when
the namespace is destroyed (DestroyFS is no longer needed).

PiperOrigin-RevId: 239037661
Change-Id: I8b5ee4d50da33c042ea34fa68e56514ebe20e6e0
2019-03-18 12:30:43 -07:00
Michael Pratt 2a0c69b19f Remove license comments
Nothing reads them and they can simply get stale.

Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD

PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-31 11:12:53 -08:00
Andrei Vagin 5f08f8fd81 Don't bind-mount runsc into a sandbox mntns
PiperOrigin-RevId: 230437407
Change-Id: Id9d8ceeb018aad2fe317407c78c6ee0f4b47aa2b
2019-01-22 16:46:42 -08:00
Fabricio Voznika 0d7023d581 Restore to original cgroup after sandbox and gofer processes are created
The original code assumed that it was safe to join and not restore cgroup,
but Container.Run will not exit after calling start, making cgroup cleanup
fail because there were still processes inside the cgroup.

PiperOrigin-RevId: 228529199
Change-Id: I12a48d9adab4bbb02f20d71ec99598c336cbfe51
2019-01-09 09:18:15 -08:00
Brian Geffon d3bc79bc84 Open source system call tests.
PiperOrigin-RevId: 224886231
Change-Id: I0fccb4d994601739d8b16b1d4e6b31f40297fb22
2018-12-10 14:42:34 -08:00
Nicolas Lacasse 845836c578 Internal change.
PiperOrigin-RevId: 221848471
Change-Id: I882fbe5ce7737048b2e1f668848e9c14ed355665
2018-11-20 14:03:11 -08:00
Nicolas Lacasse 7b938ef78c Internal change.
PiperOrigin-RevId: 221462069
Change-Id: Id469ed21fe12e582c78340189b932989afa13c67
2018-11-14 09:58:43 -08:00
Nicolas Lacasse 7f558eda44 Internal change.
PiperOrigin-RevId: 221343421
Change-Id: I418b5204c5ed4fe1e0af25ef36ee66b9b571928e
2018-11-13 15:17:19 -08:00
Adin Scannell 75cd70ecc9 Track paths and provide a rename hook.
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.

PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-23 00:20:15 -07:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Fabricio Voznika f3ffa4db52 Resolve mount paths while setting up root fs mount
It's hard to resolve symlinks inside the sandbox because rootfs and mounts
may be read-only, forcing us to create mount points inside lower layer of an
overlay, **before** the volumes are mounted.

Since the destination must already be resolved outside the sandbox when creating
mounts, take this opportunity to rewrite the spec with paths resolved.
"runsc boot" will use the "resolved" spec to load mounts. In addition, symlink
traversals were disabled while mounting containers inside the sandbox.

It haven't been able to write a good test for it. So I'm relying on manual tests
for now.

PiperOrigin-RevId: 217749904
Change-Id: I7ac434d5befd230db1488446cda03300cc0751a9
2018-10-18 12:42:24 -07:00
Fabricio Voznika e68d86e1bd Make debug log file name configurable
This is a breaking change if you're using --debug-log-dir.
The fix is to replace it with --debug-log and add a '/' at
the end:
  --debug-log-dir=/tmp/runsc ==> --debug-log=/tmp/runsc/

PiperOrigin-RevId: 216761212
Change-Id: I244270a0a522298c48115719fa08dad55e34ade1
2018-10-11 14:29:37 -07:00
Fabricio Voznika 29cd05a7c6 Add sandbox to cgroup
Sandbox creation uses the limits and reservations configured in the
OCI spec and set cgroup options accordinly. Then it puts both the
sandbox and gofer processes inside the cgroup.

It also allows the cgroup to be pre-configured by the caller. If the
cgroup already exists, sandbox and gofer processes will join the
cgroup but it will not modify the cgroup with spec limits.

PiperOrigin-RevId: 216538209
Change-Id: If2c65ffedf55820baab743a0edcfb091b89c1019
2018-10-10 09:00:42 -07:00
Fabricio Voznika cf226d48ce Switch to root in userns when CAP_SYS_CHROOT is also missing
Some tests check current capabilities and re-run the tests as root inside
userns if required capabibilities are missing. It was checking for
CAP_SYS_ADMIN only, CAP_SYS_CHROOT is also required now.

PiperOrigin-RevId: 214949226
Change-Id: Ic81363969fa76c04da408fae8ea7520653266312
2018-09-28 09:44:13 -07:00