# Platform Guide

A gVisor sandbox consists of multiple processes when running. These processes
collectively comprise a shared environment in which one or more containers can
be run.

Each sandbox has its own isolated instance of:

* The **Sentry**, a user-space kernel that runs the container and intercepts
  and responds to system calls made by the application.

Each container running in the sandbox has its own isolated instance of:

* A **Gofer**, which provides file system access to the container.

![gVisor architecture diagram](Sentry-Gofer.png "gVisor architecture diagram")

## runsc

The entrypoint to running a sandboxed container is the `runsc` executable.
`runsc` implements the [Open Container Initiative (OCI)][oci] runtime
specification. This means that OCI-compatible _filesystem bundles_ can be run by
`runsc`. A filesystem bundle consists of a `config.json` file containing the
container configuration and a root filesystem for the container. Please see the
[OCI runtime spec][runtime-spec] for more information on filesystem bundles.

`runsc` implements multiple commands that perform various functions such as
starting, stopping, listing, and querying the status of containers.

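
To make the bundle layout concrete, here is a minimal sketch that writes a skeleton bundle to disk: a `config.json` next to a `rootfs/` directory. The helper name `make_bundle` is hypothetical, the `config.json` fields are an illustrative subset of the OCI runtime configuration rather than a validated document, and the root filesystem is left empty, so the result shows the shape of a bundle rather than something ready to run.

```python
import json
import tempfile
from pathlib import Path
from typing import Sequence


def make_bundle(bundle_dir: Path, args: Sequence[str]) -> Path:
    """Create a skeleton filesystem bundle: config.json plus rootfs/.

    Hypothetical helper for illustration; returns the path to config.json.
    """
    rootfs = bundle_dir / "rootfs"
    # Left empty here; a real bundle needs the container's files in rootfs/.
    rootfs.mkdir(parents=True, exist_ok=True)

    # Illustrative subset of the OCI runtime config, not checked against
    # the specification's schema.
    config = {
        "ociVersion": "1.0.0",
        "process": {
            "args": list(args),
            "cwd": "/",
            "user": {"uid": 0, "gid": 0},
        },
        "root": {"path": "rootfs", "readonly": True},
    }
    config_path = bundle_dir / "config.json"
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = make_bundle(Path(tmp) / "bundle", ["/bin/sh"])
        print(path.read_text())
```

The OCI runtime spec remains the authority on which fields are required; treat the dictionary above as a starting point to compare against it.
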
## Sentry

The Sentry is the largest component of gVisor. It can be thought of as a
userspace OS kernel. The Sentry implements all the kernel functionality needed
by the untrusted application. It implements all of the supported system calls,
signal delivery, memory management and page faulting logic, the threading model,
and more.

When the untrusted application makes a system call, the currently used platform
redirects the call to the Sentry, which does the necessary work to service it.
It is important to note that the Sentry does not simply pass system calls
through to the host kernel. As a userspace application, the Sentry makes some
host system calls to support its operation, but it does not allow the
application to directly control the system calls it makes.

The Sentry aims to present an environment equivalent to (upstream) Linux v4.4.

File system operations that extend beyond the sandbox (that is, anything other
than internal `/proc` files, pipes, etc.) are sent to the Gofer, described
below.

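
The idea that the Sentry services system calls itself, rather than forwarding them, can be modelled with a small dispatch table. The sketch below is a toy model and not gVisor code: the class `ToySentry`, its method names, and the choice of calls are all hypothetical. Calls are answered from state held inside the object, and anything without a handler gets `-ENOSYS` instead of reaching the host.

```python
import errno
from typing import Callable, Dict


class ToySentry:
    """Toy stand-in for a user-space kernel (hypothetical, for illustration)."""

    def __init__(self, sandbox_pid: int = 1) -> None:
        self._pid = sandbox_pid
        # Calls this toy "kernel" implements itself, keyed by name.
        self._handlers: Dict[str, Callable[[], int]] = {
            "getpid": self._getpid,
            "getppid": self._getppid,
        }

    def _getpid(self) -> int:
        # Answered from sandbox state; the host kernel is never asked.
        return self._pid

    def _getppid(self) -> int:
        # The toy model has no process tree, so report 0.
        return 0

    def handle(self, name: str) -> int:
        """Service one intercepted call, or reject it if unimplemented."""
        handler = self._handlers.get(name)
        if handler is None:
            # No pass-through: unknown calls fail inside the sandbox.
            return -errno.ENOSYS
        return handler()


if __name__ == "__main__":
    sentry = ToySentry(sandbox_pid=7)
    print(sentry.handle("getpid"))
    print(sentry.handle("reboot"))
```

The design point to notice is the absence of a fallback path: the application can only reach behaviour that the table explicitly implements.
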
## Platforms

gVisor requires a platform to implement interception of syscalls, basic context
switching, and memory mapping functionality.

### ptrace

The ptrace platform uses `PTRACE_SYSEMU` to execute user code without allowing
it to execute host system calls. This platform can run anywhere that ptrace
works (even VMs without nested virtualization).

### KVM (experimental)

The KVM platform allows the Sentry to act as both guest OS and VMM, switching
back and forth between the two worlds seamlessly. The KVM platform can run on
bare metal or in a VM with nested virtualization enabled. While there is no
virtualized hardware layer -- the sandbox retains a process model -- gVisor
leverages virtualization extensions available on modern processors in order to
improve isolation and performance of address space switches.

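
The availability rules for the two platforms suggest a simple selection heuristic, sketched below. The function `choose_platform` is hypothetical, and the check is an assumption: it treats a readable and writable `/dev/kvm` device node as a sign that KVM is usable and otherwise falls back to ptrace. The `--platform` flag printed at the end is assumed to be how `runsc` selects a platform; confirm it against the `runsc` help output before relying on it.

```python
import os


def choose_platform(kvm_device: str = "/dev/kvm") -> str:
    """Return "kvm" if the KVM device looks usable, otherwise "ptrace".

    Hypothetical heuristic: device access is a necessary condition for
    KVM, not proof that the platform will work.
    """
    if os.access(kvm_device, os.R_OK | os.W_OK):
        return "kvm"
    # ptrace runs anywhere ptrace works, including VMs without
    # nested virtualization.
    return "ptrace"


if __name__ == "__main__":
    # Assumed flag name; check the runsc documentation.
    print(f"--platform={choose_platform()}")
```
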
## Gofer

The Gofer is a normal host Linux process. The Gofer is started with each sandbox
and connected to the Sentry. The Sentry process is started in a restricted
seccomp container without access to file system resources. The Gofer provides
the Sentry access to file system resources via the 9P protocol and provides an
additional level of isolation.

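
The split of responsibilities can be illustrated with a toy model in which one object holds the file system access and another can only ask for files through it. This is a sketch under loud assumptions: `ToyGofer` and `ToySentryFS` are hypothetical names, the two objects talk through a plain method call rather than the 9P protocol, and the confinement check is ordinary path resolution rather than anything gVisor does.

```python
import tempfile
from pathlib import Path


class ToyGofer:
    """Serves reads from one root directory and refuses paths outside it."""

    def __init__(self, root: str) -> None:
        self._root = Path(root).resolve()

    def read(self, path: str) -> bytes:
        # Resolve symlinks and ".." before checking containment.
        target = (self._root / path.lstrip("/")).resolve()
        if target != self._root and self._root not in target.parents:
            raise PermissionError(f"{path!r} escapes the served root")
        return target.read_bytes()


class ToySentryFS:
    """Has no file access of its own; external reads go through the Gofer."""

    def __init__(self, gofer: ToyGofer) -> None:
        self._gofer = gofer

    def read_file(self, path: str) -> bytes:
        return self._gofer.read(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "hello.txt").write_bytes(b"hi")
        fs = ToySentryFS(ToyGofer(tmp))
        print(fs.read_file("/hello.txt"))
```

Even in this simplified form, a bug in the requesting side cannot reach files the serving side declines to hand out, which is the extra level of isolation the section describes.
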
## Application

The application (aka the untrusted application) is a normal Linux binary
provided to gVisor in an OCI runtime bundle. gVisor aims to provide an
environment equivalent to Linux v4.4, so applications should be able to run
unmodified. However, gVisor does not presently implement every system call,
`/proc` file, or `/sys` file, so some incompatibilities may occur.

[oci]: https://www.opencontainers.org
[runtime-spec]: https://github.com/opencontainers/runtime-spec