Commit Graph

194 Commits

Author SHA1 Message Date
Ian Lewis 8ea99d58ff Set the HOME environment variable for sub-containers.
Fixes #701

PiperOrigin-RevId: 316025635
2020-06-11 19:31:24 -07:00
Fabricio Voznika 4e96b94915 Combine executable lookup code
Run vs. exec, VFS1 vs. VFS2 were executable lookup were
slightly different from each other. Combine them all
into the same logic.

PiperOrigin-RevId: 315426443
2020-06-08 23:08:23 -07:00
Fabricio Voznika ca5912d13c More runsc changes for VFS2
- Add /tmp handling
- Apply mount options
- Enable more container_test tests
- Forward signals to child process when test respaws process
  to run as root inside namespace.

Updates #1487

PiperOrigin-RevId: 314263281
2020-06-01 21:32:09 -07:00
Fabricio Voznika f7418e2159 Move Cleanup to its own package
PiperOrigin-RevId: 313663382
2020-05-28 14:49:06 -07:00
Fabricio Voznika a8c1b32660 Automated rollback of changelist 309082540
PiperOrigin-RevId: 313636920
2020-05-28 12:25:57 -07:00
Fabricio Voznika 32ab382c80 Improve unsupported syscall message
PiperOrigin-RevId: 312104899
2020-05-18 10:23:22 -07:00
Jamie Liu d846077628 Enable overlayfs_stale_read by default for runsc.
Linux 4.18 and later make reads and writes coherent between pre-copy-up and
post-copy-up FDs representing the same file on an overlay filesystem. However,
memory mappings remain incoherent:

- Documentation/filesystems/overlayfs.rst, "Non-standard behavior": "If a file
  residing on a lower layer is opened for read-only and then memory mapped with
  MAP_SHARED, then subsequent changes to the file are not reflected in the
  memory mapping."

- fs/overlay/file.c:ovl_mmap() passes through to the underlying FD without any
  management of coherence in the overlay.

- Experimentally on Linux 5.2:

```
$ cat mmap_cat_page.c
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char **argv) {
  if (argc < 2) {
    errx(1, "syntax: %s [FILE]", argv[0]);
  }
  const int fd = open(argv[1], O_RDONLY);
  if (fd < 0) {
    err(1, "open(%s)", argv[1]);
  }
  const size_t page_size = sysconf(_SC_PAGE_SIZE);
  void* page = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
  if (page == MAP_FAILED) {
    err(1, "mmap");
  }
  for (;;) {
    write(1, page, strnlen(page, page_size));
    if (getc(stdin) == EOF) {
      break;
    }
  }
  return 0;
}

$ gcc -O2 -o mmap_cat_page mmap_cat_page.c
$ mkdir lowerdir upperdir workdir overlaydir
$ echo old > lowerdir/file
$ sudo mount -t overlay -o "lowerdir=lowerdir,upperdir=upperdir,workdir=workdir" none overlaydir
$ ./mmap_cat_page overlaydir/file
old
^Z
[1]+  Stopped                 ./mmap_cat_page overlaydir/file
$ echo new > overlaydir/file
$ cat overlaydir/file
new
$ fg
./mmap_cat_page overlaydir/file

old
```

Therefore, while the VFS1 gofer client's behavior of reopening read FDs is only
necessary pre-4.18, replacing existing memory mappings (in both sentry and
application address spaces) with mappings of the new FD is required regardless
of kernel version, and this latter behavior is common to both VFS1 and VFS2.
Re-document accordingly, and change the runsc flag to enabled by default.

New test:
- Before this CL: https://source.cloud.google.com/results/invocations/5b222d2c-e918-4bae-afc4-407f5bac509b
- After this CL: https://source.cloud.google.com/results/invocations/f28c747e-d89c-4d8c-a461-602b33e71aab

PiperOrigin-RevId: 311361267
2020-05-13 10:53:37 -07:00
Fabricio Voznika e2b0e0e272 Enable TestRunNonRoot on VFS2
Also added back the default test dimension back which was
dropped in a previous refactor.

PiperOrigin-RevId: 309797327
2020-05-04 12:29:03 -07:00
Fabricio Voznika cbc5bef2a6 Add TTY support on VFS2 to runsc
Updates #1623, #1487

PiperOrigin-RevId: 309777922
2020-05-04 10:59:20 -07:00
gVisor bot d5c34ba2ff Merge pull request #2487 from moricho:fix/bindmount
PiperOrigin-RevId: 309082540
2020-04-29 13:13:51 -07:00
gVisor bot ceb3c0e062 Merge pull request #2558 from prattmic:forward_signal
PiperOrigin-RevId: 308829800
2020-04-28 08:43:49 -07:00
Michael Pratt b15d49a137 container: use sighandling package
Use the sighandling package for Container.ForwardSignals, for
consistency with other signal forwarding.

Fixes #2546
2020-04-27 11:52:43 -04:00
kevin.xu 9a4ae0322e
Update container.go
typo, should be `start` in comments
2020-04-27 21:53:04 +08:00
moricho fc53d64367 refactor and add test for bindmount
Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-26 17:24:34 +09:00
Zach Koopmans 17ac90a203 Add container tests passing with VFS2
Several tests are passing after getting TestAppExitStatus (run /bin/true)
changes. Make versions that run via VFS2 so that we know what is and isn't
working.

In addition, fix bug in VFSFile ReadFull. For the TestExePath test in
container_test.go, the case "unmasked" will return 0 bytes read with no
EOF err, causing the ReadFull call to spin.

PiperOrigin-RevId: 308428126
2020-04-25 11:27:23 -07:00
Adin Scannell 1481499fe2 Simplify Docker test infrastructure.
This change adds a layer of abstraction around the internal Docker APIs,
and eliminates all direct dependencies on Dockerfiles in the infrastructure.

A subsequent change will automated the generation of local images (with
efficient caching). Note that this change drops the use of bazel container
rules, as that experiment does not seem to be viable.

PiperOrigin-RevId: 308095430
2020-04-23 11:33:30 -07:00
Fabricio Voznika a80cd43023 Add test name to boot and gofer log files
This is to make easier to find corresponding logs in
case test fails.

PiperOrigin-RevId: 307104283
2020-04-17 13:28:54 -07:00
Zach Koopmans 12bde95635 Get /bin/true to run on VFS2
Included:
- loader_test.go RunTest and TestStartSignal VFS2
- container_test.go TestAppExitStatus on VFS2
- experimental flag added to runsc to turn on VFS2

Note: shared mounts are not yet supported.
PiperOrigin-RevId: 307070753
2020-04-17 10:39:19 -07:00
Adin Scannell 928a7c60b8 Fix all printf formatting errors.
Updates #2243
2020-04-08 10:14:34 -07:00
Ian Lewis 5802051b3d Update TODO to #238
Move TODO to #238 so that proper synchronization of operations is handled
when we create the urpc client.

Issue #238
Fixes #512

PiperOrigin-RevId: 305383924
2020-04-07 18:39:33 -07:00
Fabricio Voznika f2e4b5ab93 Kill sandbox process when parent process terminates
When the sandbox runs in attached more, e.g. runsc do, runsc run, the
sandbox lifetime is controlled by the parent process. This wasn't working
in all cases because PR_GET_PDEATHSIG doesn't propagate through execve
when the process changes uid/gid. So it was getting dropped when the
sandbox execve's to change to user nobody.

PiperOrigin-RevId: 300601247
2020-03-12 12:32:26 -07:00
Andrei Vagin 6ec669631f tests: Don't print log messages on stdout
A parser of test results doesn't expect to see any extra messages.

PiperOrigin-RevId: 299174138
2020-03-05 13:08:04 -08:00
Andrei Vagin 80b40bbb06 tests: Don't print log messages on stdout
A parser of test results doesn't expect to see any extra messages.

PiperOrigin-RevId: 298966577
2020-03-04 16:16:35 -08:00
Fabricio Voznika 88f7369922 Log oom_score_adj value on error
Updates #1873

PiperOrigin-RevId: 297695241
2020-02-27 14:59:38 -08:00
Fabricio Voznika 4d7db46123 Add log during process wait in tests
TestMultiContainerKillAll timed out under --race. Without logging,
we cannot tell if the process list is still increasing, but slowly,
or is stuck.

PiperOrigin-RevId: 297158834
2020-02-25 11:14:47 -08:00
Adin Scannell 3e8b38d08b Add flag package to limit visibility.
PiperOrigin-RevId: 294297004
2020-02-10 13:57:01 -08:00
Ting-Yu Wang 386a1a1564 Fix TestPauseResume in container test failed with connection refused.
Sometimes we get this error under TSAN:
"""
error getting process data from container: connecting to control server at PID
XXXX: connection refused
"""

The theory is that the top "sleep 20" was too short for TSAN, and the container
already exited, so we get connected refused. This commit changes the test to
let container signaling it's running by touching a file repeatedly forever
during the test.

PiperOrigin-RevId: 293710957
2020-02-06 17:07:07 -08:00
Adin Scannell 1b6a12a768 Add notes to relevant tests.
These were out-of-band notes that can help provide additional context
and simplify automated imports.

PiperOrigin-RevId: 293525915
2020-02-05 22:46:35 -08:00
Kevin Krakauer 3f5642c5af Increase container_test size.
container_test was flaking because a small percentage of runs timed out. Tested
this fix with --runs_per_test=100.

PiperOrigin-RevId: 293240102
2020-02-04 15:38:53 -08:00
Adin Scannell d29e59af9f Standardize on tools directory.
PiperOrigin-RevId: 291745021
2020-01-27 12:21:00 -08:00
Ian Gudger 27500d529f New sync package.
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.

This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.

Updates #1472

PiperOrigin-RevId: 289033387
2020-01-09 22:02:24 -08:00
Fabricio Voznika 0d475cdb01 Increase waitForProcessList timeout
It can take more than 10 seconds when running under --race.

PiperOrigin-RevId: 286296060
2019-12-18 17:10:44 -08:00
Andrei Vagin f8c5ad061b runsc/debug: add an option to list all processes
runsc debug --ps list all processes with all threads. This option is added to
the debug command but not to the ps command, because it is going to be used for
debug purposes and we want to add any useful information without thinking about
backward compatibility.

This will help to investigate syzkaller issues.

PiperOrigin-RevId: 285013668
2019-12-11 11:05:41 -08:00
Nicolas Lacasse 663fe840f7 Implement TTY field in control.Processes().
Threadgroups already know their TTY (if they have one), which now contains the
TTY Index, and is returned in the Processes() call.

PiperOrigin-RevId: 284263850
2019-12-06 14:34:13 -08:00
Fabricio Voznika ea7a100202 Make annotations OCI compliant
Changed annotation to follow the standard defined here:
https://github.com/opencontainers/image-spec/blob/master/annotations.md

PiperOrigin-RevId: 284254847
2019-12-06 13:51:38 -08:00
Fabricio Voznika ca90dad0e2 Fix container locking
Sandbox root dir was not being saved with the Container state,
so it would point to the wrong directory location when attempting
to lock the sandbox. This led to race conditions saving and
loading container state. Fixing it, led to multiple deadlocks.

I've moved the saving and locking logic to a separate struct and
moved the lock file inside the RootDir (instead of container
root dir), which allows the lock to be taken inside Destroy,
and removes the need to lock the sandbox.

PiperOrigin-RevId: 277599612
2019-10-30 15:39:04 -07:00
Fabricio Voznika e8ba10c008 Fix early deletion of rootDir
container.startContainers() cannot be called twice in a test
(e.g. TestMultiContainerLoadSandbox) because the cleanup
function deletes the rootDir, together with information from
all other containers that may exist.

PiperOrigin-RevId: 276591806
2019-10-24 16:36:54 -07:00
Tom Lanyon 7e8b5f4a3a Add runsc OCI annotations to support CRI-O.
Obligatory https://xkcd.com/927

Fixes #626
2019-10-20 21:11:01 +11:00
Fabricio Voznika 9fb562234e Fix problem with open FD when copy up is triggered in overlayfs
Linux kernel before 4.19 doesn't implement a feature that updates
open FD after a file is open for write (and is copied to the upper
layer). Already open FD will continue to read the old file content
until they are reopened. This is especially problematic for gVisor
because it caches open files.

Flag was added to force readonly files to be reopenned when the
same file is open for write. This is only needed if using kernels
prior to 4.19.

Closes #1006

It's difficult to really test this because we never run on tests
on older kernels. I'm adding a test in GKE which uses kernels
with the overlayfs problem for 1.14 and lower.

PiperOrigin-RevId: 275115289
2019-10-16 15:06:24 -07:00
Fabricio Voznika b9cdbc26bc Ignore mount options that are not supported in shared mounts
Options that do not change mount behavior inside the Sentry are
irrelevant and should not be used when looking for possible
incompatibilities between master and slave mounts.

PiperOrigin-RevId: 273593486
2019-10-08 13:36:16 -07:00
Fabricio Voznika 0b02c3d5e5 Prevent CAP_NET_RAW from appearing in exec
'docker exec' was getting CAP_NET_RAW even when --net-raw=false
because it was not filtered out from when copying container's
capabilities.

PiperOrigin-RevId: 272260451
2019-10-01 11:49:49 -07:00
Fabricio Voznika 010b093258 Bring back to life features lost in recent refactor
- Sandbox logs are generated when running tests
- Kokoro uploads the sandbox logs
- Supports multiple parallel runs
- Revive script to install locally built runsc with docker

PiperOrigin-RevId: 269337274
2019-09-16 08:17:00 -07:00
Ian Lewis 0bfffbcb01 Ignore the root container when calculating oom_score_adj for the sandbox.
This is done because the root container for CRI is the infrastructure (pause)
container and always gets a low oom_score_adj. We do this to ensure that only
the oom_score_adj of user containers is used to calculated the sandbox
oom_score_adj.

Implemented in runsc rather than the containerd shim as it's a bit cleaner to
implement here (in the shim it would require overwriting the oomScoreAdj and
re-writing out the config.json again). This processing is Kubernetes(CRI)
specific but we are currently only supporting CRI for multi-container support
anyway.

PiperOrigin-RevId: 267507706
2019-09-05 19:21:25 -07:00
Fabricio Voznika 0f5cdc1e00 Resolve flakes with TestMultiContainerDestroy
Some processes are reparented to the root container depending
on the kill order and the root container would not reap in time.
So some zombie processes were still present when the test checked.

Fix it by running the second container inside a PID namespace.

PiperOrigin-RevId: 267278591
2019-09-04 18:56:49 -07:00
Adin Scannell 67a2ab1438 Impose order on test scripts.
The simple test script has gotten out of control. Shard this script into
different pieces and attempt to impose order on overall test structure. This
change helps lay some of the foundations for future improvements.

 * The runsc/test directories are moved into just test/.
 * The runsc/test/testutil package is split into logical pieces.
 * The scripts/ directory contains new top-level targets.
 * Each test is now responsible for building targets it requires.
 * The install functionality is moved into `runsc` itself for simplicity.
 * The existing kokoro run_tests.sh file now just calls all (can be split).

After this change is merged,  I will create multiple distinct workflows for
Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today,
which should dramatically reduce the time-to-run for the Kokoro tests, and
provides a better foundation for further improvements to the infrastructure.

PiperOrigin-RevId: 267081397
2019-09-03 22:02:43 -07:00
Fabricio Voznika c39564332b Mount volumes as super user
This used to be the case, but regressed after a recent change.
Also made a few fixes around it and clean up the code a bit.

Closes #720

PiperOrigin-RevId: 265717496
2019-08-27 10:47:16 -07:00
Fabricio Voznika 79cc4397fd Set gofer's OOM score adjustment
Updates #512

PiperOrigin-RevId: 262195448
2019-08-07 12:55:06 -07:00
Fabricio Voznika e70eafc9e5 Make loading container in a sandbox more robust
PiperOrigin-RevId: 262071646
2019-08-06 23:26:46 -07:00
Fabricio Voznika b461be88a8 Stops container if gofer is killed
Each gofer now has a goroutine that polls on the FDs used
to communicate with the sandbox. The respective gofer is
destroyed if any of the FDs is closed.

Closes #601

PiperOrigin-RevId: 261383725
2019-08-02 13:47:55 -07:00
Ian Lewis 3eff0531ad Set sandbox oom_score_adj
Set /proc/self/oom_score_adj based on oomScoreAdj specified in the OCI bundle.
When new containers are added to the sandbox oom_score_adj for the sandbox and
all other gofers are adjusted so that oom_score_adj is equal to the lowest
oom_score_adj of all containers in the sandbox.

Fixes #512

PiperOrigin-RevId: 261242725
2019-08-01 18:49:21 -07:00