gvisor/runsc/boot
Jamie Liu 5c4f4ed9eb Skip /dev submount hack on VFS2.
containerd usually configures both /dev and /dev/shm as tmpfs mounts, e.g.:

```
  "mounts": [
    ...
    {
      "destination": "/dev",
      "type": "tmpfs",
      "source": "/run/containerd/io.containerd.runtime.v2.task/moby/10eedbd6a0e7937ddfcab90f2c25bd9a9968b734c4ae361318142165d445e67e/tmpfs",
      "options": [
        "nosuid",
        "strictatime",
        "mode=755",
        "size=65536k"
      ]
    },
    ...
    {
      "destination": "/dev/shm",
      "type": "tmpfs",
      "source": "/run/containerd/io.containerd.runtime.v2.task/moby/10eedbd6a0e7937ddfcab90f2c25bd9a9968b734c4ae361318142165d445e67e/shm",
      "options": [
        "nosuid",
        "noexec",
        "nodev",
        "mode=1777",
        "size=67108864"
      ]
    },
    ...
```

(This is mostly consistent with how Linux is usually configured, except that
/dev is conventionally devtmpfs, not regular tmpfs. runc/libcontainer
implements behavior, not documented in the OCI runtime spec, that creates
/dev/{ptmx,fd,stdin,stdout,stderr} in non-bind /dev mounts. runsc silently
switches /dev to devtmpfs. In VFS1, this is necessary to get device files like
/dev/null at all, since VFS1 doesn't support real device special files, only
what is hardcoded in devfs. VFS2 does support device special files, but using
devtmpfs is the easiest way to get pre-created files in /dev.)

runsc ignores many /dev submounts in the spec, including /dev/shm. In VFS1,
this appears to be intended to avoid introducing a submount overlay for /dev,
and is mostly fine since the typical mode for the /dev/shm mount is roughly
consistent with the mode of the /dev/shm directory provided by devfs (modulo
the sticky bit). In VFS2, this is vestigial (VFS2 does not use submount
overlays), and devtmpfs' /dev/shm mode is correct for the mount point but not
for the mount itself. So turn off this behavior for VFS2.
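
The decision described above can be sketched roughly as follows. This is an
illustration only, not the code in runsc: `keepMount` and `ignoredDevPaths` are
hypothetical names, and the list contents are an assumption (only /dev/shm is
named in this message).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// ignoredDevPaths is an illustrative list, NOT runsc's actual one.
// Only /dev/shm is confirmed by the commit message; /dev/null is a placeholder.
var ignoredDevPaths = []string{"/dev/shm", "/dev/null"}

// keepMount reports whether a spec mount at dest should be honored.
// On VFS2 every mount is kept; on VFS1, mounts at or under an ignored
// /dev path are dropped.
func keepMount(dest string, vfs2 bool) bool {
	if vfs2 {
		return true
	}
	dest = path.Clean(dest)
	for _, p := range ignoredDevPaths {
		if dest == p || strings.HasPrefix(dest, p+"/") {
			return false
		}
	}
	return true
}

func main() {
	for _, vfs2 := range []bool{false, true} {
		fmt.Printf("vfs2=%v keep /dev/shm: %v\n", vfs2, keepMount("/dev/shm", vfs2))
	}
}
```

With a filter shaped like this, the /dev/shm tmpfs mount from the spec is
dropped on VFS1 and mounted as specified on VFS2.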

After this change:

```
$ docker run --rm -it ubuntu:focal ls -lah /dev/shm
total 0
drwxrwxrwt 2 root root  40 Mar 18 00:16 .
drwxr-xr-x 5 root root 360 Mar 18 00:16 ..

$ docker run --runtime=runsc --rm -it ubuntu:focal ls -lah /dev/shm
total 0
drwxrwxrwx 1 root root 0 Mar 18 00:16 .
dr-xr-xr-x 1 root root 0 Mar 18 00:16 ..

$ docker run --runtime=runsc-vfs2 --rm -it ubuntu:focal ls -lah /dev/shm
total 0
drwxrwxrwt 2 root root  40 Mar 18 00:16 .
drwxr-xr-x 5 root root 320 Mar 18 00:16 ..
```
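
To relate the `mode=` mount options to the `ls` output above, here is a small
sketch that renders directory mode bits the way `ls -l` does. `lsDirMode` is a
hypothetical helper written for this illustration; it handles only the sticky
bit and omits setuid/setgid rendering.

```go
package main

import "fmt"

// lsDirMode renders the permission bits of a directory in ls -l style.
// Only the sticky bit (01000) is handled specially.
func lsDirMode(mode uint32) string {
	const rwx = "rwxrwxrwx"
	out := []byte("d---------")
	for i := 0; i < 9; i++ {
		if mode&(1<<uint(8-i)) != 0 {
			out[i+1] = rwx[i]
		}
	}
	if mode&0o1000 != 0 { // sticky bit set
		if mode&0o001 != 0 {
			out[9] = 't' // sticky and other-execute
		} else {
			out[9] = 'T' // sticky without other-execute
		}
	}
	return string(out)
}

func main() {
	fmt.Println(lsDirMode(0o1777)) // drwxrwxrwt
	fmt.Println(lsDirMode(0o777))  // drwxrwxrwx
	fmt.Println(lsDirMode(0o755))  // drwxr-xr-x
}
```

The first line corresponds to the `mode=1777` option on the /dev/shm mount and
matches the runc and runsc-vfs2 output; the second matches the runsc (VFS1)
output, which lacks the sticky bit.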

Fixes #5687

PiperOrigin-RevId: 363699385
2021-03-18 11:12:43 -07:00
filter [op] Replace syscall package usage with golang.org/x/sys/unix in runsc/. 2021-03-06 22:07:07 -08:00
platforms Standardize on tools directory. 2020-01-27 12:21:00 -08:00
pprof Initial network namespace support. 2020-02-20 15:20:40 -08:00
BUILD Internal changes. 2021-01-05 09:53:42 -08:00
compat.go [op] Replace syscall package usage with golang.org/x/sys/unix in runsc/. 2021-03-06 22:07:07 -08:00
compat_amd64.go [op] Replace syscall package usage with golang.org/x/sys/unix in runsc/. 2021-03-06 22:07:07 -08:00
compat_arm64.go Improve unsupported syscall message 2020-05-18 10:23:22 -07:00
compat_test.go Merge pull request #1233 from xiaobo55x:compatLog 2019-12-06 19:41:39 -08:00
controller.go Skip /dev submount hack on VFS2. 2021-03-18 11:12:43 -07:00
debug.go Update canonical repository. 2019-06-13 16:50:15 -07:00
events.go Stub out basic `runsc events --stat` CPU functionality 2021-02-02 12:47:23 -08:00
fs.go Skip /dev submount hack on VFS2. 2021-03-18 11:12:43 -07:00
fs_test.go Move boot.Config to its own package 2020-08-19 18:37:42 -07:00
limits.go [op] Replace syscall package usage with golang.org/x/sys/unix in runsc/. 2021-03-06 22:07:07 -08:00
loader.go Skip /dev submount hack on VFS2. 2021-03-18 11:12:43 -07:00
loader_test.go Skip /dev submount hack on VFS2. 2021-03-18 11:12:43 -07:00
network.go [op] Replace syscall package usage with golang.org/x/sys/unix in runsc/. 2021-03-06 22:07:07 -08:00
strace.go Make flag propagation automatic 2020-08-26 20:24:41 -07:00
vfs.go Overlay runsc regular file mounts with regular files. 2020-12-04 19:13:24 -08:00