163ab5e9ba
Major differences from the current ("v1") sentry VFS: - Path resolution is Filesystem-driven (FilesystemImpl methods call vfs.ResolvingPath methods) rather than VFS-driven (fs package owns a Dirent tree and calls fs.InodeOperations methods to populate it). This drastically improves performance, primarily by reducing overhead from inefficient synchronization and indirection. It also makes it possible to implement remote filesystem protocols that translate FS system calls into single RPCs, rather than having to make (at least) one RPC per path component, significantly reducing the latency of remote filesystems (especially during cold starts and for uncacheable shared filesystems). - Mounts are correctly represented as a separate check based on contextual state (current mount) rather than direct replacement in a fs.Dirent tree. This makes it possible to support (non-recursive) bind mounts and mount namespaces. Included in this CL is fsimpl/memfs, an incomplete in-memory filesystem that exists primarily to demonstrate intended filesystem implementation patterns and for benchmarking: BenchmarkVFS1TmpfsStat/1-6 3000000 497 ns/op BenchmarkVFS1TmpfsStat/2-6 2000000 676 ns/op BenchmarkVFS1TmpfsStat/3-6 2000000 904 ns/op BenchmarkVFS1TmpfsStat/8-6 1000000 1944 ns/op BenchmarkVFS1TmpfsStat/64-6 100000 14067 ns/op BenchmarkVFS1TmpfsStat/100-6 50000 21700 ns/op BenchmarkVFS2MemfsStat/1-6 10000000 197 ns/op BenchmarkVFS2MemfsStat/2-6 5000000 233 ns/op BenchmarkVFS2MemfsStat/3-6 5000000 268 ns/op BenchmarkVFS2MemfsStat/8-6 3000000 477 ns/op BenchmarkVFS2MemfsStat/64-6 500000 2592 ns/op BenchmarkVFS2MemfsStat/100-6 300000 4045 ns/op BenchmarkVFS1TmpfsMountStat/1-6 2000000 679 ns/op BenchmarkVFS1TmpfsMountStat/2-6 2000000 912 ns/op BenchmarkVFS1TmpfsMountStat/3-6 1000000 1113 ns/op BenchmarkVFS1TmpfsMountStat/8-6 1000000 2118 ns/op BenchmarkVFS1TmpfsMountStat/64-6 100000 14251 ns/op BenchmarkVFS1TmpfsMountStat/100-6 100000 22397 ns/op BenchmarkVFS2MemfsMountStat/1-6 5000000 317 ns/op BenchmarkVFS2MemfsMountStat/2-6 5000000 361 ns/op BenchmarkVFS2MemfsMountStat/3-6 5000000 387 ns/op BenchmarkVFS2MemfsMountStat/8-6 3000000 582 ns/op BenchmarkVFS2MemfsMountStat/64-6 500000 2699 ns/op BenchmarkVFS2MemfsMountStat/100-6 300000 4133 ns/op From this we can infer that, on this machine: - Constant cost for tmpfs stat() is ~160ns in VFS2 and ~280ns in VFS1. - Per-path-component cost is ~35ns in VFS2 and ~215ns in VFS1, a difference of about 6x. - The cost of crossing a mount boundary is about 80ns in VFS2 (MemfsMountStat/1 does approximately the same amount of work as MemfsStat/2, except that it also crosses a mount boundary). This is an inescapable cost of the separate mount lookup needed to support bind mounts and mount namespaces. PiperOrigin-RevId: 258853946 |
||
---|---|---|
.github | ||
cloudbuild | ||
g3doc | ||
kokoro | ||
pkg | ||
runsc | ||
test | ||
third_party/gvsync | ||
tools | ||
vdso | ||
.bazelrc | ||
.gitignore | ||
AUTHORS | ||
BUILD | ||
CODE_OF_CONDUCT.md | ||
CONTRIBUTING.md | ||
Dockerfile | ||
LICENSE | ||
Makefile | ||
README.md | ||
WORKSPACE | ||
go.mod | ||
go.sum |
README.md
What is gVisor?
gVisor is a user-space kernel, written in Go, that implements a substantial
portion of the Linux system surface. It includes an
Open Container Initiative (OCI) runtime called runsc
that provides an
isolation boundary between the application and the host kernel. The runsc
runtime integrates with Docker and Kubernetes, making it simple to run sandboxed
containers.
Why does gVisor exist?
Containers are not a sandbox. While containers have revolutionized how we develop, package, and deploy applications, running untrusted or potentially malicious code without additional isolation is not a good idea. The efficiency and performance gains from using a single, shared kernel also mean that container escape is possible with a single vulnerability.
gVisor is a user-space kernel for containers. It limits the host kernel surface accessible to the application while still giving the application access to all the features it expects. Unlike most kernels, gVisor does not assume or require a fixed set of physical resources; instead, it leverages existing host kernel functionality and runs as a normal user-space process. In other words, gVisor implements Linux by way of Linux.
gVisor should not be confused with technologies and tools to harden containers against external threats, provide additional integrity checks, or limit the scope of access for a service. One should always be careful about what data is made available to a container.
Documentation
User documentation and technical architecture, including quick start guides, can be found at gvisor.dev.
Installing from source
gVisor currently requires x86_64 Linux to build, though support for other architectures may become available in the future.
Requirements
Make sure the following dependencies are installed:
- Linux 4.14.77+ (older linux)
- git
- Bazel 0.23.0+
- Python
- Docker version 17.09.0 or greater
- Gold linker (e.g.
binutils-gold
package on Ubuntu)
Building
Build and install the runsc
binary:
bazel build runsc
sudo cp ./bazel-bin/runsc/linux_amd64_pure_stripped/runsc /usr/local/bin
If you don't want to install bazel on your system, you can build runsc in a Docker container:
make runsc
sudo cp ./bazel-bin/runsc/linux_amd64_pure_stripped/runsc /usr/local/bin
Testing
The test suite can be run with Bazel:
bazel test //...
or in a Docker container:
make unit-tests
make tests
Using remote execution
If you have a Remote Build Execution environment, you can use it to speed up build and test cycles.
You must authenticate with the project first:
gcloud auth application-default login --no-launch-browser
Then invoke bazel with the following flags:
--config=remote
--project_id=$PROJECT
--remote_instance_name=projects/$PROJECT/instances/default_instance
You can also add those flags to your local ~/.bazelrc to avoid needing to specify them each time on the command line.
Using go get
This project uses bazel to build and manage dependencies. A synthetic
go
branch is maintained that is compatible with standard go
tooling for
convenience.
For example, to build runsc
directly from this branch:
echo "module runsc" > go.mod
GO111MODULE=on go get gvisor.dev/gvisor/runsc@go
CGO_ENABLED=0 GO111MODULE=on go install gvisor.dev/gvisor/runsc
Note that this branch is supported in a best effort capacity, and direct
development on this branch is not supported. Development should occur on the
master
branch, which is then reflected into the go
branch.
Community & Governance
The governance model is documented in our community repository.
The gvisor-users mailing list and gvisor-dev mailing list are good starting points for questions and discussion.
Security
Sensitive security-related questions, comments and disclosures can be sent to the gvisor-security mailing list. The full security disclosure policy is defined in the community repository.
Contributing
See Contributing.md.