Commit Graph

789 Commits

Author SHA1 Message Date
Michael Pratt dd05c96d99 Increase state test timeout
PiperOrigin-RevId: 213519378
Change-Id: Iffdb987da3a7209a297ea2df171d2ae5fa9b2b34
2018-09-18 14:38:42 -07:00
Kevin Krakauer 7e00f37054 Automated rollback of changelist 213307171
PiperOrigin-RevId: 213504354
Change-Id: Iadd42f0ca4b7e7a9eae780bee9900c7233fb4f3f
2018-09-18 13:22:26 -07:00
Brian Geffon ed08597d12 Allow for MSG_CTRUNC in input flags for recv.
PiperOrigin-RevId: 213481363
Change-Id: I8150ea20cebeb207afe031ed146244de9209e745
2018-09-18 11:14:37 -07:00
Fabricio Voznika da20559137 Provide better message when memfd_create fails with ENOSYS
Updates #100

PiperOrigin-RevId: 213414821
Change-Id: I90c2e6c18c54a6afcd7ad6f409f670aa31577d37
2018-09-18 02:09:28 -07:00
Fabricio Voznika 5d9816be41 Remove memory usage static init
panic() during init() can be hard to debug.

Updates #100

PiperOrigin-RevId: 213391932
Change-Id: Ic103f1981c5b48f1e12da3b42e696e84ffac02a9
2018-09-17 21:34:37 -07:00
Fabricio Voznika 26b08e182c Rename container in test
's' used to stand for sandbox, before container exited.

PiperOrigin-RevId: 213390641
Change-Id: I7bda94a50398c46721baa92227e32a7a1d817412
2018-09-17 21:18:27 -07:00
Tamir Duberstein d6409b6564 Prevent TCP connect from picking bound ports
PiperOrigin-RevId: 213387851
Change-Id: Icc6850761bc11afd0525f34863acd77584155140
2018-09-17 20:44:04 -07:00
Kevin Krakauer bb88c187c5 runsc: Enable waiting on exited processes.
This makes `runsc wait` behave more like waitpid()/wait4() in that:
- Once a process has run to completion, you can wait on it and get its exit
  code.
- Processes not waited on will consume memory (like a zombie process)

PiperOrigin-RevId: 213358916
Change-Id: I5b5eca41ce71eea68e447380df8c38361a4d1558
2018-09-17 16:25:24 -07:00
Ian Gudger ab6fa44588 Allow kernel.(*Task).Block to accept an extract only channel
PiperOrigin-RevId: 213328293
Change-Id: I4164133e6f709ecdb89ffbb5f7df3324c273860a
2018-09-17 13:35:54 -07:00
Tamir Duberstein a452971630 Add empty .s file to allow `//go:linkname`
This was previously broken in 212917409, resulting in "missing function body"
compilation errors.

PiperOrigin-RevId: 213323695
Change-Id: I32a95b76a1c73fd731f223062ec022318b979bd4
2018-09-17 13:06:55 -07:00
Tamir Duberstein 23258ca284 Implement packet forwarding to enable NAT
PiperOrigin-RevId: 213323501
Change-Id: I0996ddbdcf097588745efe35481085d42dbaf446
2018-09-17 13:05:36 -07:00
Michael Pratt d639c3d61b Allow NULL data in mount(2)
PiperOrigin-RevId: 213315267
Change-Id: I7562bcd81fb22e90aa9c7dd9eeb94803fcb8c5af
2018-09-17 12:16:29 -07:00
Kevin Krakauer 25add7b22b runsc: Fix stdin/out/err in multi-container mode.
Stdin/out/err weren't being sent to the sentry.

PiperOrigin-RevId: 213307171
Change-Id: Ie4b634a58b1b69aa934ce8597e5cc7a47a2bcda2
2018-09-17 11:31:28 -07:00
newmanwang de5a590ee2 Avoid reuse of pending SignalInfo objects
runApp.execute -> Task.SendSignal -> sendSignalLocked -> sendSignalTimerLocked
-> pendingSignals.enqueue assumes that it owns the arch.SignalInfo returned
from platform.Context.Switch.

On the other hand, ptrace.context.Switch assumes that it owns the returned
SignalInfo and can safely reuse it on the next call to Switch. The KVM platform
always returns a unique SignalInfo.

This becomes a problem when the returned signal is not immediately delivered,
allowing a future signal in Switch to change the previous pending SignalInfo.

This is noticeable in #38 when external SIGINTs are delivered from the PTY
slave FD. Note that the ptrace stubs are in the same process group as the
sentry, so they are eligible to receive the PTY signals. This should probably
change, but is not the only possible cause of this bug.

Updates #38

Original change by newmanwang <wcs1011@gmail.com>, updated by Michael Pratt
<mpratt@google.com>.

Change-Id: I5383840272309df70a29f67b25e8221f933622cd
PiperOrigin-RevId: 213071072
2018-09-14 17:39:25 -07:00
Tamir Duberstein 75c66f871b Remove buffer.Prependable.UsedBytes
It is the same as buffer.Prependable.View.

PiperOrigin-RevId: 213064166
Change-Id: Ib33b8a2c4da864209d9a0be0a1c113be10b520d3
2018-09-14 16:39:56 -07:00
Michael Pratt 3aa50f18a4 Reuse readlink parameter, add sockaddr max.
PiperOrigin-RevId: 213058623
Change-Id: I522598c655d633b9330990951ff1c54d1023ec29
2018-09-14 16:00:02 -07:00
Tamir Duberstein d7a05b4e63 Pass buffer.Prependable by value
PiperOrigin-RevId: 213053370
Change-Id: I60ea89572b4fca53fd126c870fcbde74fcf52562
2018-09-14 15:23:58 -07:00
Nicolas Lacasse b84bfa570d Make gVisor hard link check match Linux's.
Linux permits hard-linking if the target is owned by the user OR the target has
Read+Write permission.

PiperOrigin-RevId: 213024613
Change-Id: If642066317b568b99084edd33ee4e8822ec9cbb3
2018-09-14 12:29:46 -07:00
Jamie Liu 0380bcb3a4 Fix interaction between rt_sigtimedwait and ignored signals.
PiperOrigin-RevId: 213011782
Change-Id: I716c6ea3c586b0c6c5a892b6390d2d11478bc5af
2018-09-14 11:10:50 -07:00
Chenggang faa34a0738 platform/kvm: Get max vcpu number dynamically by ioctl
The old kernel version, such as 4.4, only support 255 vcpus.
While gvisor is ran on these kernels, it could panic because the
vcpu id and vcpu number beyond max_vcpus.
Use ioctl(vmfd, _KVM_CHECK_EXTENSION, _KVM_CAP_MAX_VCPUS) to get max
vcpus number dynamically.

Change-Id: I50dd859a11b1c2cea854a8e27d4bf11a411aa45c
PiperOrigin-RevId: 212929704
2018-09-13 21:47:11 -07:00
Ian Gudger 29a7271f5d Plumb monotonic time to netstack
Netstack needs to be portable, so this seems to be preferable to using raw
system calls.

PiperOrigin-RevId: 212917409
Change-Id: I7b2073e7db4b4bf75300717ca23aea4c15be944c
2018-09-13 19:12:15 -07:00
Lantao Liu bde2a91433 runsc: Support container signal/wait.
This CL:
1) Fix `runsc wait`, it now also works after the container exits;
2) Generate correct container state in Load;
2) Make sure `Destory` cleanup everything before successfully return.

PiperOrigin-RevId: 212900107
Change-Id: Ie129cbb9d74f8151a18364f1fc0b2603eac4109a
2018-09-13 16:38:03 -07:00
Rahat Mahmood adf8f33970 Extend memory usage events to report mapped memory usage.
PiperOrigin-RevId: 212887555
Change-Id: I3545383ce903cbe9f00d9b5288d9ef9a049b9f4f
2018-09-13 15:16:47 -07:00
Michael Pratt 9c6b38e295 Format struct itimerspec
PiperOrigin-RevId: 212874745
Change-Id: I0c3e8e6a9e8976631cee03bf0b8891b336ddb8c8
2018-09-13 14:07:47 -07:00
Nicolas Lacasse e2d79480f5 initArgs must hold a reference on the Root if it is not nil.
The contract in ExecArgs says that a reference on ExecArgs.Root must be held
for the lifetime of the struct, but the caller is free to drop the ref after
that.

As a result, proc.Exec must take an additional ref on Root when it constructs
the CreateProcessArgs, since that holds a pointer to Root as well. That ref is
dropped in CreateProcess.

PiperOrigin-RevId: 212828348
Change-Id: I7f44a612f337ff51a02b873b8a845d3119408707
2018-09-13 09:50:35 -07:00
Tamir Duberstein d689f8422f Always pass buffer.VectorisedView by value
PiperOrigin-RevId: 212757571
Change-Id: I04200df9e45c21eb64951cd2802532fa84afcb1a
2018-09-12 21:57:55 -07:00
Tamir Duberstein 5adb3468d4 Add multicast support
PiperOrigin-RevId: 212750821
Change-Id: I822fd63e48c684b45fd91f9ce057867b7eceb792
2018-09-12 20:39:24 -07:00
Zhaozhong Ni 9dec7a3db9 compressio: stop worker-pool reference / dependency loop.
PiperOrigin-RevId: 212732300
Change-Id: I9a0b9b7c28e7b7439d34656dd4f2f6114d173e22
2018-09-12 17:24:53 -07:00
Kevin Krakauer 2eff1fdd06 runsc: Add exec flag that specifies where to save the sandbox-internal pid.
This is different from the existing -pid-file flag, which saves a host pid.

PiperOrigin-RevId: 212713968
Change-Id: I2c486de8dd5cfd9b923fb0970165ef7c5fc597f0
2018-09-12 15:23:35 -07:00
Michael Pratt 0efde2bfbd Remove getdents from filters
It was only used by whitelistfs, which was removed in
bc81f3fe4a.

PiperOrigin-RevId: 212666374
Change-Id: Ia35e6dc9d68c1a3b015d5b5f71ea3e68e46c5bed
2018-09-12 10:51:25 -07:00
Tamir Duberstein cbf3980464 Prevent UDP sockets from binding to bound ports
PiperOrigin-RevId: 212653818
Change-Id: Ib4e1d754d9cdddeaa428a066cb675e6ec44d91ad
2018-09-12 09:39:01 -07:00
Michael Pratt b4aed01bf2 Rollback of changelist 212483372
PiperOrigin-RevId: 212557844
Change-Id: I414de848e75d57ecee2c05e851d05b607db4aa57
2018-09-11 17:54:50 -07:00
Nicolas Lacasse 6cc9b311af platform: Pass device fd into platform constructor.
We were previously openining the platform device (i.e. /dev/kvm) inside the
platfrom constructor (i.e. kvm.New).  This requires that we have RW access to
the platform device when constructing the platform.

However, now that the runsc sandbox process runs as user "nobody", it is not
able to open the platform device.

This CL changes the kvm constructor to take the platform device FD, rather than
opening the device file itself. The device file is opened outside of the
sandbox and passed to the sandbox process.

PiperOrigin-RevId: 212505804
Change-Id: I427e1d9de5eb84c84f19d513356e1bb148a52910
2018-09-11 13:09:46 -07:00
Fabricio Voznika c44bc6612f Allow fstatat back in syscall filters
PiperOrigin-RevId: 212483372
Change-Id: If95f32a8e41126cf3dc8bd6c8b2fb0fcfefedc6d
2018-09-11 11:05:09 -07:00
Jamie Liu a29c39aa62 Map committed chunks concurrently in FileMem.LoadFrom.
PiperOrigin-RevId: 212345401
Change-Id: Iac626ee87ba312df88ab1019ade6ecd62c04c75c
2018-09-10 15:23:44 -07:00
Fabricio Voznika 7e9e6745ca Allow '/dev/zero' to be mapped with unaligned length
PiperOrigin-RevId: 212321271
Change-Id: I79d71c2e6f4b8fcd3b9b923fe96c2256755f4c48
2018-09-10 13:24:55 -07:00
Bert Muthalaly da9ecb748c Simplify some code in VectorisedView#ToView.
PiperOrigin-RevId: 212317717
Change-Id: Ic77449c53bf2f8be92c9f0a7a726c45bd35ec435
2018-09-10 13:04:06 -07:00
Nicolas Lacasse e198f9ab02 runsc: Chmod all mounted files to 777 inside chroot.
Inside the chroot, we run as user nobody, so all mounted files and directories
must be accessible to all users.

PiperOrigin-RevId: 212284805
Change-Id: I705e0dbbf15e01e04e0c7f378a99daffe6866807
2018-09-10 10:00:16 -07:00
Nicolas Lacasse 0c0c942327 Automated rollback of changelist 212059579
PiperOrigin-RevId: 212069131
Change-Id: I01476f957bbf29d4ee5a3c11d59d4f863ba9f2df
2018-09-07 18:23:27 -07:00
Michael Pratt 7045828a31 Update cleanup TODO
PiperOrigin-RevId: 212068327
Change-Id: I3f360cdf7d6caa1c96fae68ae3a1caaf440f0cbe
2018-09-07 18:14:57 -07:00
Nicolas Lacasse 922d8c3c8c Automated rollback of changelist 211992321
PiperOrigin-RevId: 212066419
Change-Id: Icded56e7e117bfd9b644e6541bddcd110460a9b8
2018-09-07 17:56:07 -07:00
Nicolas Lacasse 9751b800a6 runsc: Support multi-container exec.
We must use a context.Context with a Root Dirent that corresponds to the
container's chroot. Previously we were using the root context, which does not
have a chroot.

Getting the correct context required refactoring some of the path-lookup code.
We can't lookup the path without a context.Context, which requires
kernel.CreateProcArgs, which we only get inside control.Execute.  So we have to
do the path lookup much later than we previously were.

PiperOrigin-RevId: 212064734
Change-Id: I84a5cfadacb21fd9c3ab9c393f7e308a40b9b537
2018-09-07 17:39:54 -07:00
Fabricio Voznika cf5006ff24 Disable test until we figure out what's broken
PiperOrigin-RevId: 212059579
Change-Id: I052c2192d3483d7bd0fd2232ef2023a12da66446
2018-09-07 17:00:41 -07:00
Fabricio Voznika 172860a059 Add 'Starting gVisor...' message to syslog
This allows applications to verify they are running with gVisor. It
also helps debugging when running with a mix of container runtimes.

Closes #54

PiperOrigin-RevId: 212059457
Change-Id: I51d9595ee742b58c1f83f3902ab2e2ecbd5cedec
2018-09-07 16:59:27 -07:00
Adin Scannell 6cfb5cd56d Add additional sanity checks for walk.
PiperOrigin-RevId: 212058684
Change-Id: I319709b9ffcfccb3231bac98df345d2a20eca24b
2018-09-07 16:53:12 -07:00
Fabricio Voznika 8ce3fbf9f8 Only start signal forwarding after init process is created
PiperOrigin-RevId: 212028121
Change-Id: If9c2c62f3be103e2bb556b8d154c169888e34369
2018-09-07 13:39:12 -07:00
Fabricio Voznika bc81f3fe4a Remove '--file-access=direct' option
It was used before gofer was implemented and it's not
supported anymore.
BREAKING CHANGE: proxy-shared and proxy-exclusive options
are now: shared and exclusive.

PiperOrigin-RevId: 212017643
Change-Id: If029d4073fe60583e5ca25f98abb2953de0d78fd
2018-09-07 12:28:48 -07:00
Fabricio Voznika f895cb4d8b Use root abstract socket namespace for exec
PiperOrigin-RevId: 211999211
Change-Id: I5968dd1a8313d3e49bb6e6614e130107495de41d
2018-09-07 10:45:55 -07:00
Michael Pratt 169e2efc5a Continue handling signals after disabling forwarding
Before destroying the Kernel, we disable signal forwarding,
relinquishing control to the Go runtime. External signals that arrive
after disabling forwarding but before the sandbox exits thus may use
runtime.raise (i.e., tkill(2)) and violate the syscall filters.

Adjust forwardSignals to handle signals received after disabling
forwarding the same way they are handled before starting forwarding.
i.e., by implementing the standard Go runtime behavior using tgkill(2)
instead of tkill(2).

This also makes the stop callback block until forwarding actually stops.
This isn't required to avoid tkill(2) but is a saner interface.

PiperOrigin-RevId: 211995946
Change-Id: I3585841644409260eec23435cf65681ad41f5f03
2018-09-07 10:28:25 -07:00
Nicolas Lacasse 210c252089 runsc: Run sandbox process inside minimal chroot.
We construct a dir with the executable bind-mounted at /exe, and proc mounted
at /proc.  Runsc now executes the sandbox process inside this chroot, thus
limiting access to the host filesystem.  The mounts and chroot dir are removed
when the sandbox is destroyed.

Because this requires bind-mounts, we can only do the chroot if we have
CAP_SYS_ADMIN.

PiperOrigin-RevId: 211994001
Change-Id: Ia71c515e26085e0b69b833e71691830148bc70d1
2018-09-07 10:16:39 -07:00