Commit 492229d017: VFS2 gofer client (Jamie Liu)
Updates #1198

Opening host pipes (by spinning in fdpipe) and host sockets is not yet
complete, and will be done in a future CL.

Major differences from VFS1 gofer client (sentry/fs/gofer), with varying levels
of backportability:

- "Cache policies" are replaced by InteropMode, which control the behavior of
  timestamps in addition to caching. Under InteropModeExclusive (analogous to
  cacheAll) and InteropModeWritethrough (analogous to cacheAllWritethrough),
  client timestamps are *not* written back to the server (it is not possible in
  9P or Linux for clients to set ctime, so writing back client-authoritative
  timestamps results in incoherence between atime/mtime and ctime). Under
  InteropModeShared (analogous to cacheRemoteRevalidating), client timestamps
  are not used at all (remote filesystem clocks are authoritative). cacheNone
  is translated to InteropModeShared + new option
  filesystemOptions.specialRegularFiles.

- Under InteropModeShared, "unstable attribute" reloading for permission
  checks, lookup, and revalidation is fused, which is feasible in VFS2 since
  gofer.filesystem controls path resolution. This results in a ~33% reduction
  in RPCs for filesystem operations compared to cacheRemoteRevalidating (the
  fused walk step is sketched after this list). For example, consider
  stat("/foo/bar/baz") where "/foo/bar/baz" fails revalidation, resulting in
  the instantiation of a new dentry:

  VFS1 RPCs:
  getattr("/")                          // fs.MountNamespace.FindLink() => fs.Inode.CheckPermission() => gofer.inodeOperations.check() => gofer.inodeOperations.UnstableAttr()
  walkgetattr("/", "foo") = fid1        // fs.Dirent.walk() => gofer.session.Revalidate() => gofer.cachePolicy.Revalidate()
  clunk(fid1)
  getattr("/foo")                       // CheckPermission
  walkgetattr("/foo", "bar") = fid2     // Revalidate
  clunk(fid2)
  getattr("/foo/bar")                   // CheckPermission
  walkgetattr("/foo/bar", "baz") = fid3 // Revalidate
  clunk(fid3)
  walkgetattr("/foo/bar", "baz") = fid4 // fs.Dirent.walk() => gofer.inodeOperations.Lookup
  getattr("/foo/bar/baz")               // linux.stat() => gofer.inodeOperations.UnstableAttr()

  VFS2 RPCs:
  getattr("/")                          // gofer.filesystem.walkExistingLocked()
  walkgetattr("/", "foo") = fid1        // gofer.filesystem.stepExistingLocked()
  clunk(fid1)
                                        // No getattr: walkgetattr already updated metadata for permission check
  walkgetattr("/foo", "bar") = fid2
  clunk(fid2)
  walkgetattr("/foo/bar", "baz") = fid3
                                        // No clunk: fid3 used for new gofer.dentry
                                        // No getattr: walkgetattr already updated metadata for stat()

- gofer.filesystem.unlinkAt() does not require instantiation of a dentry that
  represents the file to be deleted. Updates #898.

- gofer.regularFileFD.OnClose() skips Tflushf for regular files under
  InteropModeExclusive, as it is nonsensical to request a remote file flush
  without first flushing locally buffered writes to that remote file.

- Symlink targets are cached when InteropModeShared is not in effect.

- p9.QID.Path (which is already required to be unique for each file within a
  server, and is accordingly already synthesized from device/inode numbers in
  all known gofers) is used as-is for inode numbers, rather than being mapped,
  along with attr.RDev, to yet another synthetic inode number in the client.
  (See the last sketch after this list.)

- Relevant parts of fsutil.CachingInodeOperations are inlined directly into
  gofer package code. This avoids having to duplicate part of its functionality
  in fsutil.HostMappable.
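
Illustrative sketch of the cache-policy translation described in the first
bullet above. The InteropMode names and filesystemOptions.specialRegularFiles
come from this change; the cachePolicy strings, field names, and translate
function below are minimal stand-ins, not the actual gVisor code:

  package main

  import "fmt"

  type InteropMode int

  const (
      // InteropModeExclusive: the sandbox is the only mutator of the
      // remote filesystem, so caching is safe (analogous to VFS1 cacheAll).
      InteropModeExclusive InteropMode = iota
      // InteropModeWritethrough: cache, but write changes through to the
      // server immediately (analogous to cacheAllWritethrough).
      InteropModeWritethrough
      // InteropModeShared: remote state is authoritative; client
      // timestamps are unused (analogous to cacheRemoteRevalidating).
      InteropModeShared
  )

  type filesystemOptions struct {
      interop InteropMode
      // specialRegularFiles approximates VFS1 cacheNone.
      specialRegularFiles bool
  }

  func translate(policy string) (filesystemOptions, error) {
      switch policy {
      case "cacheAll":
          return filesystemOptions{interop: InteropModeExclusive}, nil
      case "cacheAllWritethrough":
          return filesystemOptions{interop: InteropModeWritethrough}, nil
      case "cacheRemoteRevalidating":
          return filesystemOptions{interop: InteropModeShared}, nil
      case "cacheNone":
          return filesystemOptions{interop: InteropModeShared, specialRegularFiles: true}, nil
      default:
          return filesystemOptions{}, fmt.Errorf("unknown cache policy %q", policy)
      }
  }

  func main() {
      opts, err := translate("cacheNone")
      fmt.Println(opts, err)
  }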
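
Minimal sketch of the fused walk step from the second bullet, under
InteropModeShared. All types and functions here are illustrative stand-ins
for the gofer package's internals, not its actual API:

  package main

  import "fmt"

  // attrs stands in for the metadata returned by Twalkgetattr.
  type attrs struct {
      mode uint32 // plus uid, gid, timestamps, size, ...
  }

  type dentry struct {
      name     string
      fid      uint64
      children map[string]*dentry
      cached   attrs // refreshed by the most recent walkgetattr
  }

  // walkgetattr and clunk stand in for the corresponding 9P RPCs.
  func walkgetattr(parent *dentry, name string) (uint64, attrs, error) {
      return 1, attrs{mode: 0755}, nil
  }

  func clunk(fid uint64) {}

  // step resolves one path component. A single walkgetattr both
  // revalidates any cached child dentry and refreshes the metadata
  // consulted by the next permission check or by stat(), so no separate
  // getattr RPC is needed.
  func step(parent *dentry, name string) (*dentry, error) {
      fid, a, err := walkgetattr(parent, name)
      if err != nil {
          return nil, err
      }
      if child, ok := parent.children[name]; ok {
          child.cached = a // revalidation and attribute reload are one step
          clunk(fid)       // the existing dentry keeps its own fid
          return child, nil
      }
      // New dentry: retain the fid rather than clunking it, as in the
      // VFS2 trace above.
      child := &dentry{name: name, fid: fid, cached: a}
      parent.children[name] = child
      return child, nil
  }

  func main() {
      root := &dentry{name: "/", children: make(map[string]*dentry)}
      d, err := step(root, "foo")
      fmt.Println(d.name, d.cached.mode, err)
  }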
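
And a sketch of the inode-number change from the p9.QID.Path bullet. QID
mirrors the real p9.QID type; the dentry field and constructor are
illustrative assumptions:

  package main

  import "fmt"

  // QID mirrors the 9P QID; Path is already unique per file within a server.
  type QID struct {
      Type    uint8
      Version uint32
      Path    uint64
  }

  type dentry struct {
      ino uint64
  }

  // VFS2: adopt the server-provided QID path directly as the inode number,
  // rather than remapping (QID.Path, attr.RDev) to a synthetic client-side
  // inode number as the VFS1 client did.
  func newDentry(qid QID) *dentry {
      return &dentry{ino: qid.Path}
  }

  func main() {
      fmt.Println(newDentry(QID{Path: 42}).ino)
  }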

PiperOrigin-RevId: 293190213


# gVisor


## What is gVisor?

gVisor is a user-space kernel, written in Go, that implements a substantial portion of the Linux system surface. It includes an Open Container Initiative (OCI) runtime called runsc that provides an isolation boundary between the application and the host kernel. The runsc runtime integrates with Docker and Kubernetes, making it simple to run sandboxed containers.

## Why does gVisor exist?

Containers are not a sandbox. While containers have revolutionized how we develop, package, and deploy applications, running untrusted or potentially malicious code without additional isolation is not a good idea. The efficiency and performance gains from using a single, shared kernel also mean that container escape is possible with a single vulnerability.

gVisor is a user-space kernel for containers. It limits the host kernel surface accessible to the application while still giving the application access to all the features it expects. Unlike most kernels, gVisor does not assume or require a fixed set of physical resources; instead, it leverages existing host kernel functionality and runs as a normal user-space process. In other words, gVisor implements Linux by way of Linux.

gVisor should not be confused with technologies and tools to harden containers against external threats, provide additional integrity checks, or limit the scope of access for a service. One should always be careful about what data is made available to a container.

## Documentation

User documentation and technical architecture, including quick start guides, can be found at gvisor.dev.

## Installing from source

gVisor currently requires x86_64 Linux to build, though support for other architectures may become available in the future.

### Requirements

Make sure the build dependencies are installed: Bazel for a direct build, or Docker and Make for the containerized build described below.

### Building

Build and install the runsc binary:

```sh
bazel build runsc
sudo cp ./bazel-bin/runsc/linux_amd64_pure_stripped/runsc /usr/local/bin
```

If you don't want to install Bazel on your system, you can build runsc in a Docker container:

```sh
make runsc
sudo cp ./bazel-bin/runsc/linux_amd64_pure_stripped/runsc /usr/local/bin
```

### Testing

The test suite can be run with Bazel:

```sh
bazel test //...
```

or in a Docker container:

```sh
make unit-tests
make tests
```

### Using remote execution

If you have a Remote Build Execution environment, you can use it to speed up build and test cycles.

You must authenticate with the project first:

```sh
gcloud auth application-default login --no-launch-browser
```

Then invoke bazel with the following flags:

```
--config=remote
--project_id=$PROJECT
--remote_instance_name=projects/$PROJECT/instances/default_instance
```
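
For example, combining those flags with the test invocation above (substitute your own Cloud project for `$PROJECT`):

```sh
bazel test --config=remote \
    --project_id=$PROJECT \
    --remote_instance_name=projects/$PROJECT/instances/default_instance \
    //...
```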

You can also add those flags to your local ~/.bazelrc to avoid needing to specify them each time on the command line.

### Using `go get`

This project uses Bazel to build and manage dependencies. For convenience, a synthetic `go` branch is maintained that is compatible with standard Go tooling.

For example, to build runsc directly from this branch:

echo "module runsc" > go.mod
GO111MODULE=on go get gvisor.dev/gvisor/runsc@go
CGO_ENABLED=0 GO111MODULE=on go install gvisor.dev/gvisor/runsc

Note that this branch is supported on a best-effort basis, and direct development on it is not supported. Development should occur on the master branch, which is then reflected into the `go` branch.

## Community & Governance

The governance model is documented in our community repository.

The gvisor-users mailing list and gvisor-dev mailing list are good starting points for questions and discussion.

## Security Policy

See SECURITY.md.

## Contributing

See CONTRIBUTING.md.