229 Commits

Author SHA1 Message Date
Ting-Yu Wang 120c8e3468 Replace TaskFromContext(ctx).Kernel() with KernelFromContext(ctx)
A panic was seen on code paths such as control.ExecAsync, where
ctx does not have a Task.

Reported-by: syzbot+55ce727161cf94a7b7d6@syzkaller.appspotmail.com
PiperOrigin-RevId: 355960596
2021-02-05 17:28:01 -08:00
Kevin Krakauer 5f7bf31526 Stub out basic `runsc events --stat` CPU functionality
Because we lack gVisor-internal cgroups, we take the CPU usage of the entire pod
and divide it proportionally according to sentry-internal usage stats.

This fixes `kubectl top pods`, which gets a pod's CPU usage by summing the usage
of its containers.

Addresses #172.

PiperOrigin-RevId: 355229833
2021-02-02 12:47:23 -08:00
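The proportional split described in 5f7bf31526 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `splitCPUUsage`, the use of nanoseconds, and the map-based input are all assumptions.

```go
package main

import "fmt"

// splitCPUUsage divides a pod-wide CPU usage figure among containers in
// proportion to each container's sentry-internal usage. The name, the units
// (nanoseconds), and the map layout are hypothetical.
func splitCPUUsage(podUsage uint64, internal map[string]uint64) map[string]uint64 {
	var total uint64
	for _, u := range internal {
		total += u
	}
	out := make(map[string]uint64, len(internal))
	for id, u := range internal {
		if total == 0 {
			// No internal usage recorded: attribute nothing rather than
			// divide by zero.
			out[id] = 0
			continue
		}
		// Scale in floating point so large inputs do not overflow uint64.
		out[id] = uint64(float64(podUsage) * (float64(u) / float64(total)))
	}
	return out
}

func main() {
	usage := splitCPUUsage(1000, map[string]uint64{"a": 30, "b": 10})
	fmt.Println(usage["a"], usage["b"])
}
```

Summing the per-container values approximately recovers the pod total, which is what a consumer that adds up container usage would expect.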
Fabricio Voznika aae4803808 Enable container checkpoint/restore tests with VFS2
Updates #1663

PiperOrigin-RevId: 355077816
2021-02-01 19:29:29 -08:00
Fabricio Voznika f14f3ba3ef Fix TestDuplicateEnvVariable flakiness
Updates #5226

PiperOrigin-RevId: 353262133
2021-01-22 09:57:44 -08:00
Adin Scannell 4e03e87547 Fix simple mistakes identified by goreportcard.
These are primarily simplification and lint mistakes. However, minor
fixes are also included and tests added where appropriate.

PiperOrigin-RevId: 351425971
2021-01-12 12:38:22 -08:00
Fabricio Voznika 7e462a1c7f OCI spec may contain duplicate environment variables
Closes #5226

PiperOrigin-RevId: 351259576
2021-01-11 16:25:50 -08:00
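One way to tolerate duplicate environment variables, as 7e462a1c7f requires, is to collapse them before use. The sketch below assumes that the last definition of a key wins and that the key keeps the position of its first occurrence; the commit message does not state the policy, and `dedupEnv` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// dedupEnv collapses duplicate KEY=VALUE entries. Assumption: the last
// definition of a key wins, and the key keeps its first position.
func dedupEnv(env []string) []string {
	index := make(map[string]int) // key -> position in out
	var out []string
	for _, kv := range env {
		key := strings.SplitN(kv, "=", 2)[0]
		if i, ok := index[key]; ok {
			out[i] = kv // overwrite the earlier definition
			continue
		}
		index[key] = len(out)
		out = append(out, kv)
	}
	return out
}

func main() {
	fmt.Println(dedupEnv([]string{"A=1", "B=2", "A=3"}))
}
```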
Fabricio Voznika 8ea19b5818 Add sandbox ID to state file name
This allows finding all containers inside a sandbox more efficiently.
This operation is required every time a container starts and stops,
and previously required loading *all* container state files to check
whether the container belonged to the sandbox.

Apart from being inefficient, it has caused problems when state files
are stale or corrupt, making it impossible to create any container.

Also adjust commands `list` and `debug` to skip over files that fail
to load.

Resolves #5052

PiperOrigin-RevId: 348050637
2020-12-17 10:52:44 -08:00
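The benefit described in 8ea19b5818 is that membership can be decided from the file name alone, without loading any state file. The sketch below assumes a hypothetical naming scheme, `<sandboxID>.<containerID>.state`; the real runsc file name format may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// stateExt and the "<sandboxID>.<containerID>.state" layout are assumptions
// made for this illustration only.
const stateExt = ".state"

// containerIDs returns the IDs of the containers that belong to sandboxID,
// judging only by file name. No state file is opened or parsed, so a stale
// or corrupt file cannot make the lookup fail.
func containerIDs(fileNames []string, sandboxID string) []string {
	prefix := sandboxID + "."
	var ids []string
	for _, name := range fileNames {
		if len(name) <= len(prefix)+len(stateExt) {
			continue // too short to hold a container ID
		}
		if !strings.HasPrefix(name, prefix) || !strings.HasSuffix(name, stateExt) {
			continue
		}
		ids = append(ids, strings.TrimSuffix(strings.TrimPrefix(name, prefix), stateExt))
	}
	return ids
}

func main() {
	names := []string{"sb1.c1.state", "sb2.c9.state", "sb1.c2.state", "sb1.state"}
	fmt.Println(containerIDs(names, "sb1"))
}
```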
Adin Scannell 80552b936d Support partitions for other tests.
PiperOrigin-RevId: 345399936
2020-12-03 01:00:21 -08:00
Fabricio Voznika 7158095d68 Fix race condition in multi-container wait test
Container is not thread-safe; locking must be done in the caller.
The test was calling Container.Wait() from multiple threads with
no synchronization.

Also removed Container.WaitPID from test because the process might
have already exited when wait is called.

PiperOrigin-RevId: 343176280
2020-11-18 16:06:31 -08:00
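The caller-side locking that 7158095d68 calls for can be illustrated with a stand-in type. The `container` struct below is hypothetical and much simpler than runsc's Container; it only shows the pattern of serializing access from the caller.

```go
package main

import (
	"fmt"
	"sync"
)

// container is a stand-in for a type that is not safe for concurrent use.
type container struct {
	waits int
}

// wait mutates state without any internal locking.
func (c *container) wait() int {
	c.waits++
	return c.waits
}

// waitAll calls wait from n goroutines and serializes the calls with a
// mutex owned by the caller, since the type itself does no locking.
func waitAll(c *container, n int) {
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			c.wait()
		}()
	}
	wg.Wait()
}

func main() {
	c := &container{}
	waitAll(c, 8)
	fmt.Println(c.waits)
}
```

Without the mutex, running this under the Go race detector would be expected to report a data race on `waits`.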
Fabricio Voznika e2d9a68eef Add support for TTY in multi-container
Fixes #2714

PiperOrigin-RevId: 342950412
2020-11-17 14:51:24 -08:00
Fabricio Voznika 0e8fdfd388 Re-add start/stop container tests
Due to a typo, doDestroyNotStartedTest was being tested
2x instead of doDestroyStartingTest.

PiperOrigin-RevId: 340969797
2020-11-05 19:06:43 -08:00
Fabricio Voznika 62b0e845b7 Return failure when `runsc events` queries a stopped container
This was causing gvisor-containerd-shim to crash because the command
succeeded, but there was no stat present.

PiperOrigin-RevId: 340964921
2020-11-05 18:18:21 -08:00
Fabricio Voznika c47f8afe23 Fix failure setting OOM score adjustment
When OOM score adjustment needs to be set, all the containers need to be
loaded to find all containers that belong to the sandbox. However, each
load signals the container to ensure it is still alive. OOM score
adjustment is set during creation and deletion of every container, generating
a flood of signals to all containers. The fix removes the signal check
when it's not needed.

There is also a race fetching OOM score adjustment value from the parent when
the sandbox exits at the same time (the time it took to signal containers above
made this window quite large). The fix is to store the original value
in the sandbox state file and use it when the value needs to be restored.

Also add more logging and make the existing log messages more consistent to
help with debugging.

PiperOrigin-RevId: 340940799
2020-11-05 15:36:20 -08:00
gVisor bot 1a5eb49a43 Merge pull request #3957 from workato:auto-cgroup
PiperOrigin-RevId: 338372736
2020-10-21 17:24:06 -07:00
Konstantin Baranov d579ed8505 Do not even try forcing cgroups in tests 2020-10-20 20:03:04 -07:00
Fabricio Voznika 4b4d12d5bb Fixes to cgroups
There were a few problems with cgroups:
- cleanup loop was breaking too early
- parse of /proc/[pid]/cgroup was skipping "name=systemd"
  because "name=" was not being removed from name.
- When no limits are specified, fillFromAncestor was not being
  called, causing a failure to set cpuset.mems

Updates #4536

PiperOrigin-RevId: 337947356
2020-10-19 15:32:50 -07:00
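The "name=" parsing problem fixed in 4b4d12d5bb can be illustrated with a small parser for the `hierarchy-ID:controller-list:path` line format. This is a sketch rather than the runsc implementation, and `parseCgroups` is a hypothetical name.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseCgroups parses text in the /proc/[pid]/cgroup format
// ("hierarchy-ID:controller-list:path") into a controller -> path map.
func parseCgroups(content string) (map[string]string, error) {
	paths := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := scanner.Text()
		if line == "" {
			continue
		}
		parts := strings.SplitN(line, ":", 3)
		if len(parts) != 3 {
			return nil, fmt.Errorf("invalid cgroup line: %q", line)
		}
		for _, ctrl := range strings.Split(parts[1], ",") {
			// Named hierarchies appear as "name=systemd". Strip the prefix
			// so that a lookup by plain name ("systemd") succeeds.
			ctrl = strings.TrimPrefix(ctrl, "name=")
			if ctrl == "" {
				continue // entry with an empty controller list
			}
			paths[ctrl] = parts[2]
		}
	}
	return paths, scanner.Err()
}

func main() {
	paths, err := parseCgroups("1:name=systemd:/user.slice\n4:cpu,cpuacct:/foo\n")
	fmt.Println(paths, err)
}
```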
Konstantin Baranov a2a27eedf4 Ignore errors in rootless and test modes 2020-10-06 15:34:02 -07:00
Fabricio Voznika 9e64b9f3a5 Fix gofer monitor prematurely destroying container
When all container tasks finish, they release the mount which in turn
will close the 9P session to the gofer. The gofer exits when the connection
closes, triggering the gofer monitor. The gofer monitor will _think_ that
the gofer died prematurely and destroy the container. Then when the caller
attempts to wait for the container, e.g. to get the exit code, wait fails
saying the container doesn't exist.

Gofer monitor now just SIGKILLs the container and lets the normal teardown
process happen, which will eventually destroy the container at the right
time. Also, fixed an issue with exec racing with container's init process
exiting.

Closes #1487

PiperOrigin-RevId: 335537350
2020-10-05 17:40:23 -07:00
Fabricio Voznika 9e9fec3a09 Enable more VFS2 tests
Updates #1487

PiperOrigin-RevId: 335516732
2020-10-05 15:54:36 -07:00
Konstantin Baranov 6321eccddc Treat absent "linux" section as empty "cgroupsPath" too 2020-10-02 14:37:55 -07:00
Howard Zhang d47209b86d fix TestUserLog for multi-arch
based on arch, apply different syscall number for
sched_rr_get_interval

Signed-off-by: Howard Zhang <howard.zhang@arm.com>
2020-09-25 14:48:37 +08:00
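Selecting the syscall number by architecture, as d47209b86d does for TestUserLog, could look roughly like this. The lookup table and the `syscallNumber` helper are illustrative assumptions; the numbers are the Linux values for sched_rr_get_interval on amd64 and arm64.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedRRGetIntervalNR maps a GOARCH value to the Linux syscall number of
// sched_rr_get_interval. Only the two architectures discussed are listed.
var schedRRGetIntervalNR = map[string]uintptr{
	"amd64": 148,
	"arm64": 127,
}

// syscallNumber returns the sched_rr_get_interval number for arch, or an
// error when the architecture is not in the table.
func syscallNumber(arch string) (uintptr, error) {
	nr, ok := schedRRGetIntervalNR[arch]
	if !ok {
		return 0, fmt.Errorf("unsupported architecture %q", arch)
	}
	return nr, nil
}

func main() {
	nr, err := syscallNumber(runtime.GOARCH)
	fmt.Println(nr, err)
}
```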
Fabricio Voznika da07e38f7c Remove option to panic gofer
Gofer panics are suppressed by p9 server and an error
is returned to the caller, making it effectively the
same as returning EROFS.

PiperOrigin-RevId: 332282959
2020-09-17 12:01:45 -07:00
Fabricio Voznika a11061d78a Add VFS2 overlay support in runsc
All tests under runsc are passing with overlay enabled.

Updates #1487, #1199

PiperOrigin-RevId: 332181267
2020-09-17 01:09:42 -07:00
Fabricio Voznika 326a1dbb73 Refactor removed default test dimension
ptrace was always selected as a dimension before, but not
anymore. Some tests were specifying "overlay" expecting that
to be in addition to the default.

PiperOrigin-RevId: 332004111
2020-09-16 07:47:28 -07:00
Konstantin Baranov b8dc9a889f Use container ID as cgroup name if not provided
Useful when you want to run multiple containers with the same config.
And runc does that too.
2020-09-15 20:50:07 -07:00
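The fallback from b8dc9a889f, combined with the handling of an absent "linux" section from 6321eccddc, can be sketched as below. The `spec` and `linuxSpec` types are simplified, hypothetical stand-ins for the OCI runtime spec structures, and `cgroupName` is not a runsc function.

```go
package main

import "fmt"

// linuxSpec and spec are minimal stand-ins for the OCI spec types.
type linuxSpec struct {
	CgroupsPath string
}

type spec struct {
	Linux *linuxSpec
}

// cgroupName returns the configured cgroups path, falling back to the
// container ID when the "linux" section is absent or the path is empty.
func cgroupName(s *spec, containerID string) string {
	if s == nil || s.Linux == nil || s.Linux.CgroupsPath == "" {
		return containerID
	}
	return s.Linux.CgroupsPath
}

func main() {
	fmt.Println(cgroupName(&spec{}, "abc123"))
	fmt.Println(cgroupName(&spec{Linux: &linuxSpec{CgroupsPath: "/custom"}}, "abc123"))
}
```

With this fallback, two containers started from the same config end up with distinct cgroup names.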
Fabricio Voznika c8f1ce288d Honor readonly flag for root mount
Updates #1487

PiperOrigin-RevId: 330580699
2020-09-08 14:00:43 -07:00
Fabricio Voznika 2202812e07 Simplify FD handling for container start/exec
VFS1 and VFS2 host FDs have different dupping behavior,
making it error-prone to code for both. Change the contract
so that FDs are released as they are used, so the caller
can simply defer a block that closes all remaining files.
This also addresses handling of partial failures.

With this fix, more VFS2 tests can be enabled.

Updates #1487

PiperOrigin-RevId: 330112266
2020-09-04 11:42:02 -07:00
Ayush Ranjan 2eaf54dd59 Refactor tty codebase to use master-replica terminology.
Updates #2972

PiperOrigin-RevId: 329584905
2020-09-01 14:43:41 -07:00
Fabricio Voznika be76c7ce6e Move boot.Config to its own package
Updates #3494

PiperOrigin-RevId: 327548511
2020-08-19 18:37:42 -07:00
Adin Scannell d0fd97541a Clean-up bazel wrapper.
The bazel server was being started as the wrong user, leading to issues
where the container would suddenly exit during a build.

We can also simplify the waiting logic by starting the container in two
separate steps: those that must complete first, then the asynchronous bit.

PiperOrigin-RevId: 323391161
2020-07-27 10:40:29 -07:00
gVisor bot bdbab2702a Merge pull request #3022 from prattmic:runsc_do_pdeathsig
PiperOrigin-RevId: 321449877
2020-07-15 15:21:32 -07:00
Michael Pratt 1481673178 Apply pdeathsig to gofer for runsc run/do
Much like the boot process, apply pdeathsig to the gofer for cases where
the sandbox lifecycle is attached to the parent (runsc run/do).

This isn't strictly necessary, as the gofer normally exits once the
sentry disappears, but this makes that extra reliable.
2020-07-15 15:15:11 -04:00
Fabricio Voznika 1bfb556ccd Prepare boot.Loader to support multi-container TTY
- Combine process creation code that is shared between
  root and subcontainer processes
- Move root container information into a struct for
  clarity

Updates #2714

PiperOrigin-RevId: 321204798
2020-07-14 12:02:03 -07:00
gVisor bot c81ac8ec3b Merge pull request #2672 from amscanne:shim-integrated
PiperOrigin-RevId: 321053634
2020-07-13 16:10:58 -07:00
Fabricio Voznika c4815af947 Add shared mount hints to VFS2
Container restart test is disabled for VFS2 for now.

Updates #1487

PiperOrigin-RevId: 320296401
2020-07-08 17:12:29 -07:00
Ian Lewis 8ea99d58ff Set the HOME environment variable for sub-containers.
Fixes #701

PiperOrigin-RevId: 316025635
2020-06-11 19:31:24 -07:00
Fabricio Voznika 4e96b94915 Combine executable lookup code
The executable lookup paths for run vs. exec and VFS1 vs. VFS2
were slightly different from each other. Combine them all
into the same logic.

PiperOrigin-RevId: 315426443
2020-06-08 23:08:23 -07:00
Fabricio Voznika ca5912d13c More runsc changes for VFS2
- Add /tmp handling
- Apply mount options
- Enable more container_test tests
- Forward signals to child process when test respawns process
  to run as root inside namespace.

Updates #1487

PiperOrigin-RevId: 314263281
2020-06-01 21:32:09 -07:00
Fabricio Voznika f7418e2159 Move Cleanup to its own package
PiperOrigin-RevId: 313663382
2020-05-28 14:49:06 -07:00
Fabricio Voznika a8c1b32660 Automated rollback of changelist 309082540
PiperOrigin-RevId: 313636920
2020-05-28 12:25:57 -07:00
Fabricio Voznika 32ab382c80 Improve unsupported syscall message
PiperOrigin-RevId: 312104899
2020-05-18 10:23:22 -07:00
Jamie Liu d846077628 Enable overlayfs_stale_read by default for runsc.
Linux 4.18 and later make reads and writes coherent between pre-copy-up and
post-copy-up FDs representing the same file on an overlay filesystem. However,
memory mappings remain incoherent:

- Documentation/filesystems/overlayfs.rst, "Non-standard behavior": "If a file
  residing on a lower layer is opened for read-only and then memory mapped with
  MAP_SHARED, then subsequent changes to the file are not reflected in the
  memory mapping."

- fs/overlay/file.c:ovl_mmap() passes through to the underlying FD without any
  management of coherence in the overlay.

- Experimentally on Linux 5.2:

```
$ cat mmap_cat_page.c
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char **argv) {
  if (argc < 2) {
    errx(1, "syntax: %s [FILE]", argv[0]);
  }
  const int fd = open(argv[1], O_RDONLY);
  if (fd < 0) {
    err(1, "open(%s)", argv[1]);
  }
  const size_t page_size = sysconf(_SC_PAGE_SIZE);
  void* page = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
  if (page == MAP_FAILED) {
    err(1, "mmap");
  }
  for (;;) {
    write(1, page, strnlen(page, page_size));
    if (getc(stdin) == EOF) {
      break;
    }
  }
  return 0;
}

$ gcc -O2 -o mmap_cat_page mmap_cat_page.c
$ mkdir lowerdir upperdir workdir overlaydir
$ echo old > lowerdir/file
$ sudo mount -t overlay -o "lowerdir=lowerdir,upperdir=upperdir,workdir=workdir" none overlaydir
$ ./mmap_cat_page overlaydir/file
old
^Z
[1]+  Stopped                 ./mmap_cat_page overlaydir/file
$ echo new > overlaydir/file
$ cat overlaydir/file
new
$ fg
./mmap_cat_page overlaydir/file

old
```

Therefore, while the VFS1 gofer client's behavior of reopening read FDs is only
necessary pre-4.18, replacing existing memory mappings (in both sentry and
application address spaces) with mappings of the new FD is required regardless
of kernel version, and this latter behavior is common to both VFS1 and VFS2.
Re-document accordingly, and change the runsc flag to enabled by default.

New test:
- Before this CL: https://source.cloud.google.com/results/invocations/5b222d2c-e918-4bae-afc4-407f5bac509b
- After this CL: https://source.cloud.google.com/results/invocations/f28c747e-d89c-4d8c-a461-602b33e71aab

PiperOrigin-RevId: 311361267
2020-05-13 10:53:37 -07:00
Fabricio Voznika e2b0e0e272 Enable TestRunNonRoot on VFS2
Also added back the default test dimension, which was
dropped in a previous refactor.

PiperOrigin-RevId: 309797327
2020-05-04 12:29:03 -07:00
Fabricio Voznika cbc5bef2a6 Add TTY support on VFS2 to runsc
Updates #1623, #1487

PiperOrigin-RevId: 309777922
2020-05-04 10:59:20 -07:00
gVisor bot d5c34ba2ff Merge pull request #2487 from moricho:fix/bindmount
PiperOrigin-RevId: 309082540
2020-04-29 13:13:51 -07:00
gVisor bot ceb3c0e062 Merge pull request #2558 from prattmic:forward_signal
PiperOrigin-RevId: 308829800
2020-04-28 08:43:49 -07:00
Michael Pratt b15d49a137 container: use sighandling package
Use the sighandling package for Container.ForwardSignals, for
consistency with other signal forwarding.

Fixes #2546
2020-04-27 11:52:43 -04:00
kevin.xu 9a4ae0322e Update container.go
typo, should be `start` in comments
2020-04-27 21:53:04 +08:00
moricho fc53d64367 refactor and add test for bindmount
Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-26 17:24:34 +09:00
Zach Koopmans 17ac90a203 Add container tests passing with VFS2
Several tests are passing after getting TestAppExitStatus (run /bin/true)
changes. Make versions that run via VFS2 so that we know what is and isn't
working.

In addition, fix bug in VFSFile ReadFull. For the TestExePath test in
container_test.go, the case "unmasked" will return 0 bytes read with no
EOF err, causing the ReadFull call to spin.

PiperOrigin-RevId: 308428126
2020-04-25 11:27:23 -07:00
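The spin described in 17ac90a203 arises when a read loop keeps retrying a reader that returns 0 bytes and no error. One way to guard against it is sketched below. This is not the actual VFSFile fix, which the commit message does not detail; `readFull`, `errNoProgress`, and `stuckReader` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errNoProgress reports a read that returned no data and no error.
var errNoProgress = errors.New("read returned 0 bytes without an error")

// readFull reads until buf is full. Unlike a naive loop, it stops when a
// read makes no progress instead of retrying forever.
func readFull(r io.Reader, buf []byte) (int, error) {
	total := 0
	for total < len(buf) {
		n, err := r.Read(buf[total:])
		total += n
		if err != nil {
			return total, err
		}
		if n == 0 {
			return total, errNoProgress
		}
	}
	return total, nil
}

// stuckReader always returns 0 bytes and a nil error.
type stuckReader struct{}

func (stuckReader) Read([]byte) (int, error) { return 0, nil }

func main() {
	n, err := readFull(stuckReader{}, make([]byte, 4))
	fmt.Println(n, err)
}
```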