Commit Graph

27 Commits

Author SHA1 Message Date
Kevin Krakauer 635b0c4593 runsc fsgofer: Support dynamic serving of filesystems.
When multiple containers run inside a sentry, each container has its own root
filesystem and set of mounts. Containers are also added after sentry boot rather
than all configured and known at boot time.

The fsgofer needs to be able to serve the root filesystem of each container.
Thus, it must be possible to add filesystems after the fsgofer has already
started.

This change:
* Creates a URPC endpoint within the gofer process that listens for requests to
  serve new content.
* Enables the sentry, when starting a new container, to add the new container's
  filesystem.
* Mounts those new filesystems at separate roots within the sentry.

PiperOrigin-RevId: 208903248
Change-Id: Ifa91ec9c8caf5f2f0a9eead83c4a57090ce92068
2018-08-15 16:25:22 -07:00
Fabricio Voznika 0d350aac7f Enable SACK in runsc
SACK is disabled by default and needs to be manually enabled. It not only
improves performance, but also fixes hangs downloading files from certain
websites.

PiperOrigin-RevId: 207906742
Change-Id: I4fb7277b67bfdf83ac8195f1b9c38265a0d51e8b
2018-08-08 10:26:18 -07:00
Ian Gudger 3cd7824410 Move stack clock to options struct
PiperOrigin-RevId: 207039273
Change-Id: Ib8f55a6dc302052ab4a10ccd70b07f0d73b373df
2018-08-01 20:22:02 -07:00
Justine Olshan c05660373e Moved restore code out of create and made to be called after create.
Docker expects containers to be created before they are restored.
However, gVisor restoring requires specificactions regarding the kernel
and the file system. These actions were originally in booting the sandbox.

Now setting up the file system is deferred until a call to a call to
runsc start. In the restore case, the kernel is destroyed and a new kernel
is created in the same process, as we need the same process for Docker.

These changes required careful execution of concurrent processes which
required the use of a channel.

Full docker integration still needs the ability to restore into the same
container.

PiperOrigin-RevId: 205161441
Change-Id: Ie1d2304ead7e06855319d5dc310678f701bd099f
2018-07-18 16:58:30 -07:00
Nicolas Lacasse 9059983fdb runsc: Fix map access race in boot.Loader.waitContainer.
PiperOrigin-RevId: 204522004
Change-Id: I4819dc025f0a1df03ceaaba7951b1902d44562b3
2018-07-13 13:46:14 -07:00
Nicolas Lacasse 67507bd579 runsc: Don't close the control server in a defer.
Closing the control server will block until all open requests have completed.
If a control server method panics, we end up stuck because the defer'd Destroy
function will never return.

PiperOrigin-RevId: 204354676
Change-Id: I6bb1d84b31242d7c3f20d5334b1c966bd6a61dbf
2018-07-12 13:36:57 -07:00
Michael Pratt 660f1203ff Fix runsc VDSO mapping
80bdf8a406 accidentally moved vdso into an
inner scope, never assigning the vdso variable passed to the Kernel and
thus skipping VDSO mappings.

Fix this and remove the ability for loadVDSO to skip VDSO mappings,
since tests that do so are gone.

PiperOrigin-RevId: 203169135
Change-Id: Ifd8cadcbaf82f959223c501edcc4d83d05327eba
2018-07-03 12:53:39 -07:00
Justine Olshan 80bdf8a406 Sets the restore environment for restoring a container.
Updated how restoring occurs through boot.go with a separate Restore function.
This prevents a new process and new mounts from being created.
Added tests to ensure the container is restored.
Registered checkpoint and restore commands so they can be used.
Docker support for these commands is still limited.
Working on #80.

PiperOrigin-RevId: 202710950
Change-Id: I2b893ceaef6b9442b1ce3743bd112383cb92af0c
2018-06-29 14:47:40 -07:00
Kevin Krakauer 16d37973eb runsc: Add the "wait" subcommand.
Users can now call "runsc wait <container id>" to wait on a particular process
inside the container. -pid can also be used to wait on a specific PID.

Manually tested the wait subcommand for a single waiter and multiple waiters
(simultaneously 2 processes waiting on the container and 2 processes waiting on
a PID within the container).

PiperOrigin-RevId: 202548978
Change-Id: Idd507c2cdea613c3a14879b51cfb0f7ea3fb3d4c
2018-06-28 14:56:36 -07:00
Fabricio Voznika 8459390cdd Error out if spec is invalid
Closes #66

PiperOrigin-RevId: 202496258
Change-Id: Ib9287c5bf1279ffba1db21ebd9e6b59305cddf34
2018-06-28 09:57:27 -07:00
Fabricio Voznika 1f207de315 Add option to configure watchdog action
PiperOrigin-RevId: 202494747
Change-Id: I4d4a18e71468690b785060e580a5f83c616bd90f
2018-06-28 09:46:50 -07:00
Lantao Liu 000fd8d1e4 runsc: set gofer umask to 0.
PiperOrigin-RevId: 202185642
Change-Id: I2eefcc0b2ffadc6ef21d177a8a4ab0cda91f3399
2018-06-26 13:40:04 -07:00
Kevin Krakauer 04bdcc7b65 runsc: Enable waiting on individual containers within a sandbox.
PiperOrigin-RevId: 201742160
Change-Id: Ia9fa1442287c5f9e1196fb117c41536a80f6bb31
2018-06-22 14:31:25 -07:00
Fabricio Voznika f6be5fe619 Forward SIGUSR2 to the sandbox too
SIGUSR2 was being masked out to be used as a way to dump sentry
stacks. This could cause compatibility problems in cases anyone
uses SIGUSR2 to communicate with the container init process.

PiperOrigin-RevId: 201575374
Change-Id: I312246e828f38ad059139bb45b8addc2ed055d74
2018-06-21 13:22:18 -07:00
Nicolas Lacasse 81d13fbd4d runsc: Default umask should be 0.
PiperOrigin-RevId: 201539050
Change-Id: I36cbf270fa5ad25de507ecb919e4005eda6aa16d
2018-06-21 09:43:15 -07:00
Kevin Krakauer 5397963b5d runsc: Enable container creation within existing sandboxes.
Containers are created as processes in the sandbox. Of the many things that
don't work yet, the biggest issue is that the fsgofer is launched with its root
as the sandbox's root directory. Thus, when a container is started and wants to
read anything (including the init binary of the container), the gofer tries to
serve from sandbox's root (which basically just has pause), not the container's.

PiperOrigin-RevId: 201294560
Change-Id: I6423aa8830538959c56ae908ce067e4199d627b1
2018-06-19 21:44:33 -07:00
Justine Olshan 873ec0c414 Modified boot.go to allow for restores.
A file descriptor was added as a flag to boot so a state file can restore a
container that was checkpointed.

PiperOrigin-RevId: 201068699
Change-Id: I18e96069488ffa3add468861397f3877725544aa
2018-06-18 15:20:36 -07:00
Fabricio Voznika ef5dd4df9b Set kernel.applicationCores to the number of processor on the host
The right number to use is the number of processors assigned to the cgroup. But until
we make the sandbox join the respective cgroup, just use the number of processors on
the host.

Closes #65, closes #66

PiperOrigin-RevId: 200725483
Change-Id: I34a566b1a872e26c66f56fa6e3100f42aaf802b1
2018-06-15 09:19:04 -07:00
Brielle Broder 711a9869e5 Runsc checkpoint works.
This is the first iteration of checkpoint that actually saves to a file.
Tests for checkpoint are included.

Ran into an issue when private unix sockets are enabled. An error message
was added for this case and the mutex state was set.

PiperOrigin-RevId: 200269470
Change-Id: I28d29a9f92c44bf73dc4a4b12ae0509ee4070e93
2018-06-12 13:25:23 -07:00
Googler 722275c3d1 Added a function to the controller to checkpoint a container.
Functionality for checkpoint is not complete, more to come.

PiperOrigin-RevId: 199500803
Change-Id: Iafb0fcde68c584270000fea898e6657a592466f7
2018-06-06 11:43:55 -07:00
Fabricio Voznika e48f707876 Configure sandbox as superuser
Container user might not have enough priviledge to walk directories and
mount filesystems. Instead, create superuser to perform these steps of
the configuration.

PiperOrigin-RevId: 197953667
Change-Id: I643650ab654e665408e2af1b8e2f2aa12d58d4fb
2018-05-24 14:27:57 -07:00
Rahat Mahmood 8878a66a56 Implement sysv shm.
PiperOrigin-RevId: 197058289
Change-Id: I3946c25028b7e032be4894d61acb48ac0c24d574
2018-05-17 15:06:19 -07:00
Nicolas Lacasse 31386185fe Push signal-delivery and wait into the sandbox.
This is another step towards multi-container support.

Previously, we delivered signals directly to the sandbox process (which then
forwarded the signal to PID 1 inside the sandbox). Similarly, we waited on a
container by waiting on the sandbox process itself. This approach will not work
when there are multiple containers inside the sandbox, and we need to
signal/wait on individual containers.

This CL adds two new messages, ContainerSignal and ContainerWait. These
messages include the id of the container to signal/wait. The controller inside
the sandbox receives these messages and signals/waits on the appropriate
process inside the sandbox.

The container id is plumbed into the sandbox, but it currently is not used. We
still end up signaling/waiting on PID 1 in all cases.  Once we actually have
multiple containers inside the sandbox, we will need to keep some sort of map
of container id -> pid (or possibly pid namespace), and signal/kill the
appropriate process for the container.

PiperOrigin-RevId: 197028366
Change-Id: I07b4d5dc91ecd2affc1447e6b4bdd6b0b7360895
2018-05-17 11:55:28 -07:00
Nicolas Lacasse 1bdec86bae Return better errors from Docker when runsc fails to start.
Two changes in this CL:

First, make the "boot" process sleep when it encounters an error to give the
controller time to send the error back to the "start" process. Otherwise the
"boot" process exits immediately and the control connection errors with EOF.

Secondly, open the log file with O_APPEND, not O_TRUNC. Docker uses the same
log file for all runtime commands, and setting O_TRUNC causes them to get
destroyed. Furthermore, containerd parses these log files in the event of an
error, and it does not like the file being truncated out from underneath it.

Now, when trying to run a binary that does not exist in the image, the error
message is more reasonable:

$ docker run alpine /not/found
docker: Error response from daemon: OCI runtime start failed: /usr/local/google/docker/runtimes/runscd did not terminate sucessfully: error starting sandbox: error starting application [/not/found]: failed to create init process: no such file or directory

Fixes #32

PiperOrigin-RevId: 196027084
Change-Id: Iabc24c0bdd8fc327237acc051a1655515f445e68
2018-05-09 14:13:37 -07:00
Ian Gudger eb5414ee29 Add support for ping sockets
PiperOrigin-RevId: 195049322
Change-Id: I09f6dd58cf10a2e50e53d17d2823d540102913c5
2018-05-01 22:51:41 -07:00
Ian Gudger 3d3deef573 Implement SO_TIMESTAMP
PiperOrigin-RevId: 195047018
Change-Id: I6d99528a00a2125f414e1e51e067205289ec9d3d
2018-05-01 22:11:49 -07:00
Googler d02b74a5dc Check in gVisor.
PiperOrigin-RevId: 194583126
Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463
2018-04-28 01:44:26 -04:00