674 Commits

Author SHA1 Message Date
Ting-Yu Wang 41da7a568b Fix copylocks error about copying IPTables.
IPTables.connections contains a sync.RWMutex. Copying it will trigger copylocks
analysis. Tested by manually enabling nogo tests.

sync.RWMutex is added to IPTables for the additional race condition discovered.

PiperOrigin-RevId: 314817019
2020-06-05 11:29:09 -07:00
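
The copylocks concern described in this commit can be illustrated with a minimal Go sketch. The `table` type and its methods are hypothetical stand-ins, not the actual IPTables code; the point is only that a struct holding a `sync.RWMutex` should be shared by pointer so the lock is never copied.

```go
package main

import (
	"fmt"
	"sync"
)

// table is a hypothetical struct that, like the IPTables type named in the
// commit message, contains a sync.RWMutex and therefore must not be copied.
type table struct {
	mu    sync.RWMutex
	conns map[string]int
}

// newTable returns a pointer so callers never hold a copy of the lock.
func newTable() *table {
	return &table{conns: make(map[string]int)}
}

// track uses a pointer receiver; a value receiver would copy mu on every
// call, which is what the copylocks vet analysis reports.
func (t *table) track(key string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.conns[key]++
}

// count takes the read lock, again through a pointer receiver.
func (t *table) count(key string) int {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.conns[key]
}

func main() {
	t := newTable()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			t.track("flow")
		}()
	}
	wg.Wait()
	fmt.Println(t.count("flow"))
}
```

Passing `*table` everywhere keeps a single mutex guarding a single map; `go vet` flags the value-copy variant.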
Fabricio Voznika ca5912d13c More runsc changes for VFS2
- Add /tmp handling
- Apply mount options
- Enable more container_test tests
- Forward signals to child process when test respawns process
  to run as root inside namespace.

Updates #1487

PiperOrigin-RevId: 314263281
2020-06-01 21:32:09 -07:00
Jamie Liu 3a987160aa Handle gofer blocking opens of host named pipes in VFS2.
Using tee instead of read to detect when an O_RDONLY|O_NONBLOCK pipe FD has a
writer circumvents the problem of what to do with the byte read from the pipe,
avoiding much of the complexity of the fdpipe package.

PiperOrigin-RevId: 314216146
2020-06-01 15:33:30 -07:00
Michael Pratt 12f74bd6f6 Include runtime goroutines in panics
SetTraceback("all") does not include all goroutines in panics (you didn't think
it was that simple, did you?). It includes all _user_ goroutines; those started
by the runtime (such as GC workers) are excluded.

Switch to "system" to additionally include runtime goroutines, which are useful
to track down bugs in the runtime itself.

PiperOrigin-RevId: 314204473
2020-06-01 14:32:19 -07:00
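
As a small illustration of the switch this commit describes, the sketch below selects between the "all" and "system" levels of `runtime/debug.SetTraceback`. The `tracebackLevel` helper is hypothetical and exists only to make the choice explicit; it is not taken from runsc.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// tracebackLevel is a hypothetical helper. "all" dumps user goroutines on a
// panic; "system" additionally dumps goroutines created by the runtime.
func tracebackLevel(includeRuntime bool) string {
	if includeRuntime {
		return "system"
	}
	return "all"
}

func main() {
	level := tracebackLevel(true)
	// Applies to the traceback printed for any later unrecovered panic.
	debug.SetTraceback(level)
	fmt.Println("traceback level:", level)
}
```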
Fabricio Voznika 16100d18cb Make gofer mount readonly when overlay is enabled
No writes are expected to the underlying filesystem when
using --overlay.

PiperOrigin-RevId: 314171457
2020-06-01 11:44:32 -07:00
Nicolas Lacasse 93edb36cbb Refactor the ResolveExecutablePath logic.
PiperOrigin-RevId: 313871804
2020-05-29 16:35:21 -07:00
gVisor bot f498e46ef9 Merge pull request #2767 from mikaelmello:add-cwd-option-spec
PiperOrigin-RevId: 313828906
2020-05-29 12:34:45 -07:00
Fabricio Voznika f7418e2159 Move Cleanup to its own package
PiperOrigin-RevId: 313663382
2020-05-28 14:49:06 -07:00
Fabricio Voznika a8c1b32660 Automated rollback of changelist 309082540
PiperOrigin-RevId: 313636920
2020-05-28 12:25:57 -07:00
Mikael Mello 9e8000e9fb Add cwd option to spec cmd
2020-05-24 17:44:03 -03:00
Fabricio Voznika 10abad0040 Add hugetlb and rdma cgroups to runsc
Updates #2713

PiperOrigin-RevId: 312559463
2020-05-20 14:49:13 -07:00
Fabricio Voznika 32ab382c80 Improve unsupported syscall message
PiperOrigin-RevId: 312104899
2020-05-18 10:23:22 -07:00
Adin Scannell 420b791a3d Minor formatting updates for gvisor.dev.
* Aggregate architecture Overview in "What is gVisor?" as it makes more sense
  in one place.

* Drop "user-space kernel" and use "application kernel". The term "user-space
  kernel" is confusing when some platform implementations do not run in
  user-space (instead running in guest ring zero).

* Clear up the relationship between the Platform page in the user guide and the
  Platform page in the architecture guide, and ensure they are cross-linked.

* Restore the call-to-action quick start link in the main page, and drop the
  GitHub link (which also appears in the top-right).

* Improve image formatting by centering all doc and blog images, and move the
  image captions to the alt text.

PiperOrigin-RevId: 311845158
2020-05-15 20:05:18 -07:00
Jamie Liu 64afaf0e9b Fix runsc association of gofers and FDs on VFS2.
Updates #1487

PiperOrigin-RevId: 311443628
2020-05-13 18:18:09 -07:00
Jamie Liu d846077628 Enable overlayfs_stale_read by default for runsc.
Linux 4.18 and later make reads and writes coherent between pre-copy-up and
post-copy-up FDs representing the same file on an overlay filesystem. However,
memory mappings remain incoherent:

- Documentation/filesystems/overlayfs.rst, "Non-standard behavior": "If a file
  residing on a lower layer is opened for read-only and then memory mapped with
  MAP_SHARED, then subsequent changes to the file are not reflected in the
  memory mapping."

- fs/overlay/file.c:ovl_mmap() passes through to the underlying FD without any
  management of coherence in the overlay.

- Experimentally on Linux 5.2:

```
$ cat mmap_cat_page.c
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char **argv) {
  if (argc < 2) {
    errx(1, "syntax: %s [FILE]", argv[0]);
  }
  const int fd = open(argv[1], O_RDONLY);
  if (fd < 0) {
    err(1, "open(%s)", argv[1]);
  }
  const size_t page_size = sysconf(_SC_PAGE_SIZE);
  void* page = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
  if (page == MAP_FAILED) {
    err(1, "mmap");
  }
  for (;;) {
    write(1, page, strnlen(page, page_size));
    if (getc(stdin) == EOF) {
      break;
    }
  }
  return 0;
}

$ gcc -O2 -o mmap_cat_page mmap_cat_page.c
$ mkdir lowerdir upperdir workdir overlaydir
$ echo old > lowerdir/file
$ sudo mount -t overlay -o "lowerdir=lowerdir,upperdir=upperdir,workdir=workdir" none overlaydir
$ ./mmap_cat_page overlaydir/file
old
^Z
[1]+  Stopped                 ./mmap_cat_page overlaydir/file
$ echo new > overlaydir/file
$ cat overlaydir/file
new
$ fg
./mmap_cat_page overlaydir/file

old
```

Therefore, while the VFS1 gofer client's behavior of reopening read FDs is only
necessary pre-4.18, replacing existing memory mappings (in both sentry and
application address spaces) with mappings of the new FD is required regardless
of kernel version, and this latter behavior is common to both VFS1 and VFS2.
Re-document accordingly, and change the runsc flag to enabled by default.

New test:
- Before this CL: https://source.cloud.google.com/results/invocations/5b222d2c-e918-4bae-afc4-407f5bac509b
- After this CL: https://source.cloud.google.com/results/invocations/f28c747e-d89c-4d8c-a461-602b33e71aab

PiperOrigin-RevId: 311361267
2020-05-13 10:53:37 -07:00
Fabricio Voznika 18cb3d24cb Use VFS2 mount names
Updates #1487

PiperOrigin-RevId: 311356385
2020-05-13 10:31:29 -07:00
Fabricio Voznika 305f786e51 Adjust a few log messages
PiperOrigin-RevId: 311234146
2020-05-12 17:26:07 -07:00
Bhasker Hariharan e838e7ab34 Automated rollback of changelist 310417191
PiperOrigin-RevId: 310963404
2020-05-11 12:09:06 -07:00
Nicolas Lacasse c52195d258 Stop avoiding preadv2 and pwritev2, and add them to the filters.
Some code paths needed these syscalls anyways, so they should be included in
the filters. Given that we depend on these syscalls in some cases, there's no
real reason to avoid them any more.

PiperOrigin-RevId: 310829126
2020-05-10 17:52:20 -07:00
Jamie Liu 9115f26851 Allocate device numbers for VFS2 filesystems.
Updates #1197, #1198, #1672

PiperOrigin-RevId: 310432006
2020-05-07 14:01:53 -07:00
Bhasker Hariharan 28b5565fdd Automated rollback of changelist 309339316
PiperOrigin-RevId: 310417191
2020-05-07 12:48:23 -07:00
Dean Deng 16da7e790f Update privateunixsocket TODOs.
Synthetic sockets do not have the race condition issue in VFS2, and we will
get rid of privateunixsocket as well.

Fixes #1200.

PiperOrigin-RevId: 310386474
2020-05-07 10:20:48 -07:00
Adin Scannell 279f1eb7ab Fix runsc syscall documentation generation.
We can register any number of tables with any number of architectures, and
need not limit the definitions to the architecture in question. This allows
runsc to generate documentation for all architectures simultaneously.

Similarly, this simplifies the VFSv2 patching process.

PiperOrigin-RevId: 310224827
2020-05-06 14:13:48 -07:00
Fabricio Voznika e2b0e0e272 Enable TestRunNonRoot on VFS2
Also added back the default test dimension, which was
dropped in a previous refactor.

PiperOrigin-RevId: 309797327
2020-05-04 12:29:03 -07:00
Fabricio Voznika 0a307d0072 Mount VFS2 filesystem using root credentials
PiperOrigin-RevId: 309787938
2020-05-04 11:48:00 -07:00
Fabricio Voznika cbc5bef2a6 Add TTY support on VFS2 to runsc
Updates #1623, #1487

PiperOrigin-RevId: 309777922
2020-05-04 10:59:20 -07:00
Bhasker Hariharan 8962b7840f Enable FIFO QDisc by default in runsc.
Updates #231

PiperOrigin-RevId: 309339316
2020-04-30 18:29:57 -07:00
Bhasker Hariharan ae15d90436 FIFO QDisc implementation
Updates #231

PiperOrigin-RevId: 309323808
2020-04-30 16:41:00 -07:00
gVisor bot d5c34ba2ff Merge pull request #2487 from moricho:fix/bindmount
PiperOrigin-RevId: 309082540
2020-04-29 13:13:51 -07:00
gVisor bot ceb3c0e062 Merge pull request #2558 from prattmic:forward_signal
PiperOrigin-RevId: 308829800
2020-04-28 08:43:49 -07:00
gVisor bot 316394ee89 Merge pull request #2544 from prattmic:runsc_do_cleanup
PiperOrigin-RevId: 308727526
2020-04-27 17:01:33 -07:00
Michael Pratt 147c8ba1f7 runsc: extend do network cleanup
Previously we unconditionally failed to clean up the networking files
(hostname, resolv.conf, hosts), and failed to clean up the netns, etc. on
partial setup failure.

We can drop the iptables commands from cleanup, as the routes
automatically go away when the device is deleted. Those commands were
failing previously.

Forward signals to the container, allowing it to exit normally when a
signal is received, and then for runsc to run the cleanup. This doesn't
cover cleanup when runsc is signalled before the container starts, but it
covers the most common case.

Fixes #2539
Fixes #2540
2020-04-27 16:36:07 -04:00
Michael Pratt b15d49a137 container: use sighandling package
Use the sighandling package for Container.ForwardSignals, for
consistency with other signal forwarding.

Fixes #2546
2020-04-27 11:52:43 -04:00
kevin.xu 9a4ae0322e Update container.go
typo, should be `start` in comments
2020-04-27 21:53:04 +08:00
moricho fc53d64367 refactor and add test for bindmount
Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-26 17:24:34 +09:00
Zach Koopmans 17ac90a203 Add container tests passing with VFS2
Several tests are passing after getting TestAppExitStatus (run /bin/true)
changes. Make versions that run via VFS2 so that we know what is and isn't
working.

In addition, fix bug in VFSFile ReadFull. For the TestExePath test in
container_test.go, the case "unmasked" will return 0 bytes read with no
EOF err, causing the ReadFull call to spin.

PiperOrigin-RevId: 308428126
2020-04-25 11:27:23 -07:00
moricho 0b3166f624 add bind/rbind options for mount
Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-25 22:04:39 +09:00
moricho 93e510e26f fix behavior of `getMountNameAndOptions` when options include either bind or rbind
Signed-off-by: moricho <ikeda.morito@gmail.com>
2020-04-25 22:04:39 +09:00
Zach Koopmans 15a822a193 VFS2: Get HelloWorld image tests to pass with VFS2
This change includes:
- Modifications to loader_test.go to get TestCreateMountNamespace to
pass with VFS2.
- Changes necessary to get TestHelloWorld in image tests to pass with
VFS2. This means runsc can run the hello-world container with docker
on VFS2.

Note: Containers that use sockets will not run with these changes.
See "//test/image/...". Any tests here with sockets currently fail
(which is all of them but HelloWorld).
PiperOrigin-RevId: 308363072
2020-04-24 18:23:37 -07:00
Fabricio Voznika 4af39dd1c5 Propagate PID limit from OCI to sandbox cgroup
Closes #2489

PiperOrigin-RevId: 308362434
2020-04-24 18:17:01 -07:00
Dean Deng 632b104aff Plumb context.Context into kernfs.Inode.Open().
PiperOrigin-RevId: 308304793
2020-04-24 12:37:49 -07:00
Dean Deng 1b88c63b3e Move hostfs mount to Kernel struct.
This is needed to set up host fds passed through a Unix socket. Note that
the host package depends on kernel, so we cannot set up the hostfs mount
directly in Kernel.Init as we do for sockfs and pipefs.

Also, adjust sockfs to make its setup look more like hostfs's and pipefs's.

PiperOrigin-RevId: 308274053
2020-04-24 10:03:43 -07:00
Jamie Liu 5042ea7e2c Add vfs.MkdirOptions.ForSyntheticMountpoint.
PiperOrigin-RevId: 308143529
2020-04-23 15:37:10 -07:00
Adin Scannell 1481499fe2 Simplify Docker test infrastructure.
This change adds a layer of abstraction around the internal Docker APIs,
and eliminates all direct dependencies on Dockerfiles in the infrastructure.

A subsequent change will automate the generation of local images (with
efficient caching). Note that this change drops the use of bazel container
rules, as that experiment does not seem to be viable.

PiperOrigin-RevId: 308095430
2020-04-23 11:33:30 -07:00
Nicolas Lacasse e69a871c7b Move user home detection to its own library.
PiperOrigin-RevId: 307977689
2020-04-22 22:18:21 -07:00
Andrei Vagin 0c586946ea Specify a memory file in platform.New().
PiperOrigin-RevId: 307941984
2020-04-22 17:50:10 -07:00
Adin Scannell 1a597e01be Add a functional vm_test for root_test.
This change renames the tools/images directory to tools/vm for clarity, and
adds a functional vm_test. Sharding is also added to the same test, and some
documentation added around key flags & variables to describe how they work.

Subsequent changes will add vm_tests for other cases, such as the runtime tests.

PiperOrigin-RevId: 307492245
2020-04-20 15:48:27 -07:00
Fabricio Voznika a80cd43023 Add test name to boot and gofer log files
This makes it easier to find the corresponding logs when a
test fails.

PiperOrigin-RevId: 307104283
2020-04-17 13:28:54 -07:00
Zach Koopmans 12bde95635 Get /bin/true to run on VFS2
Included:
- loader_test.go RunTest and TestStartSignal VFS2
- container_test.go TestAppExitStatus on VFS2
- experimental flag added to runsc to turn on VFS2

Note: shared mounts are not yet supported.
PiperOrigin-RevId: 307070753
2020-04-17 10:39:19 -07:00
Fabricio Voznika 5a8ee1beee Preserve log FD after execve
PiperOrigin-RevId: 306908296
2020-04-16 13:17:00 -07:00