gvisor

Commit Graph

Author	SHA1	Message	Date
Lingfu	f0a92b6b67	Add docker command line args support for --cpuset-cpus and --cpus `docker run --cpuset-cpus=/--cpus=` will generate cpu resource info in config.json (runtime spec file). When nginx worker_connections is configured as auto, the worker is generated according to the number of CPUs. If the cgroup is already set on the host, but it is not displayed correctly in the sandbox, performance may be degraded. This patch can get cpus info from spec file and apply to sentry on bootup, so the /proc/cpuinfo can show the correct cpu numbers. `lscpu` and other commands rely on `/sys/devices/system/cpu/online` are also affected by this patch. e.g. --cpuset-cpus=2,3 -> cpu number:2 --cpuset-cpus=4-7 -> cpu number:4 --cpus=2.8 -> cpu number:3 --cpus=0.5 -> cpu number:1 Change-Id: Ideb22e125758d4322a12be7c51795f8018e3d316 PiperOrigin-RevId: 213685199	2018-09-19 13:35:42 -07:00
Bhasker Hariharan	bd12e95247	Fix RTT estimation when timestamp option is enabled. From RFC7323#Section-4 The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an implicit assumption that at most one RTTM will be sampled per RTT. When multiple RTTMs per RTT are available to update the RTT estimator, an implementation SHOULD try to adhere to the spirit of the history specified in [RFC6298]. An implementation suggestion is detailed in Appendix G. From RFC7323#appendix-G Appendix G. RTO Calculation Modification Taking multiple RTT samples per window would shorten the history calculated by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a similar history as originally intended by [RFC6298]. It is roughly known how many samples a congestion window worth of data will yield, not accounting for ACK compression, and ACK losses. Such events will result in more history of the path being reflected in the final value for RTO, and are uncritical. This modification will ensure that a similar amount of time is taken into account for the RTO estimation, regardless of how many samples are taken per window: ExpectedSamples = ceiling(FlightSize / (SMSS * 2)) alpha' = alpha / ExpectedSamples beta' = beta / ExpectedSamples Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs". Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and beta' instead: RTTVAR <- (1 - beta') * RTTVAR + beta' * \|SRTT - R'\| SRTT <- (1 - alpha') * SRTT + alpha' * R' (for each sample R') PiperOrigin-RevId: 213644795 Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e	2018-09-19 09:59:12 -07:00
Fabricio Voznika	8aec7473a1	Added state machine checks for Container.Status For my own sanitity when thinking about possible transitions and state. PiperOrigin-RevId: 213559482 Change-Id: I25588c86cf6098be4eda01f4e7321c102ceef33c	2018-09-18 19:12:54 -07:00
Nicolas Lacasse	fd222d62ed	Short-circuit Readdir calls on overlay files when the dirent is frozen. If we have an overlay file whose corresponding Dirent is frozen, then we should not bother calling Readdir on the upper or lower files, since DirentReaddir will calculate children based on the frozen Dirent tree. A test was added that fails without this change. PiperOrigin-RevId: 213531215 Change-Id: I4d6c98f1416541a476a34418f664ba58f936a81d	2018-09-18 15:42:22 -07:00
Fabricio Voznika	7967d8ecd5	Handle children processes better in tests Reap children more systematically in container tests. Previously, container_test was taking ~5 mins to run because constainer.Destroy() would timeout waiting for the sandbox process to exit. Now the test running in less than a minute. Also made the contract around Container and Sandbox destroy clearer. PiperOrigin-RevId: 213527471 Change-Id: Icca84ee1212bbdcb62bdfc9cc7b71b12c6d1688d	2018-09-18 15:21:28 -07:00
Michael Pratt	dd05c96d99	Increase state test timeout PiperOrigin-RevId: 213519378 Change-Id: Iffdb987da3a7209a297ea2df171d2ae5fa9b2b34	2018-09-18 14:38:42 -07:00
Kevin Krakauer	7e00f37054	Automated rollback of changelist 213307171 PiperOrigin-RevId: 213504354 Change-Id: Iadd42f0ca4b7e7a9eae780bee9900c7233fb4f3f	2018-09-18 13:22:26 -07:00
Brian Geffon	ed08597d12	Allow for MSG_CTRUNC in input flags for recv. PiperOrigin-RevId: 213481363 Change-Id: I8150ea20cebeb207afe031ed146244de9209e745	2018-09-18 11:14:37 -07:00
Fabricio Voznika	da20559137	Provide better message when memfd_create fails with ENOSYS Updates #100 PiperOrigin-RevId: 213414821 Change-Id: I90c2e6c18c54a6afcd7ad6f409f670aa31577d37	2018-09-18 02:09:28 -07:00
Fabricio Voznika	5d9816be41	Remove memory usage static init panic() during init() can be hard to debug. Updates #100 PiperOrigin-RevId: 213391932 Change-Id: Ic103f1981c5b48f1e12da3b42e696e84ffac02a9	2018-09-17 21:34:37 -07:00
Fabricio Voznika	26b08e182c	Rename container in test 's' used to stand for sandbox, before container exited. PiperOrigin-RevId: 213390641 Change-Id: I7bda94a50398c46721baa92227e32a7a1d817412	2018-09-17 21:18:27 -07:00
Tamir Duberstein	d6409b6564	Prevent TCP connect from picking bound ports PiperOrigin-RevId: 213387851 Change-Id: Icc6850761bc11afd0525f34863acd77584155140	2018-09-17 20:44:04 -07:00
Kevin Krakauer	bb88c187c5	runsc: Enable waiting on exited processes. This makes `runsc wait` behave more like waitpid()/wait4() in that: - Once a process has run to completion, you can wait on it and get its exit code. - Processes not waited on will consume memory (like a zombie process) PiperOrigin-RevId: 213358916 Change-Id: I5b5eca41ce71eea68e447380df8c38361a4d1558	2018-09-17 16:25:24 -07:00
Ian Gudger	ab6fa44588	Allow kernel.(*Task).Block to accept an extract only channel PiperOrigin-RevId: 213328293 Change-Id: I4164133e6f709ecdb89ffbb5f7df3324c273860a	2018-09-17 13:35:54 -07:00
Tamir Duberstein	a452971630	Add empty .s file to allow `//go:linkname` This was previously broken in 212917409, resulting in "missing function body" compilation errors. PiperOrigin-RevId: 213323695 Change-Id: I32a95b76a1c73fd731f223062ec022318b979bd4	2018-09-17 13:06:55 -07:00
Tamir Duberstein	23258ca284	Implement packet forwarding to enable NAT PiperOrigin-RevId: 213323501 Change-Id: I0996ddbdcf097588745efe35481085d42dbaf446	2018-09-17 13:05:36 -07:00
Michael Pratt	d639c3d61b	Allow NULL data in mount(2) PiperOrigin-RevId: 213315267 Change-Id: I7562bcd81fb22e90aa9c7dd9eeb94803fcb8c5af	2018-09-17 12:16:29 -07:00
Kevin Krakauer	25add7b22b	runsc: Fix stdin/out/err in multi-container mode. Stdin/out/err weren't being sent to the sentry. PiperOrigin-RevId: 213307171 Change-Id: Ie4b634a58b1b69aa934ce8597e5cc7a47a2bcda2	2018-09-17 11:31:28 -07:00
newmanwang	de5a590ee2	Avoid reuse of pending SignalInfo objects runApp.execute -> Task.SendSignal -> sendSignalLocked -> sendSignalTimerLocked -> pendingSignals.enqueue assumes that it owns the arch.SignalInfo returned from platform.Context.Switch. On the other hand, ptrace.context.Switch assumes that it owns the returned SignalInfo and can safely reuse it on the next call to Switch. The KVM platform always returns a unique SignalInfo. This becomes a problem when the returned signal is not immediately delivered, allowing a future signal in Switch to change the previous pending SignalInfo. This is noticeable in #38 when external SIGINTs are delivered from the PTY slave FD. Note that the ptrace stubs are in the same process group as the sentry, so they are eligible to receive the PTY signals. This should probably change, but is not the only possible cause of this bug. Updates #38 Original change by newmanwang <wcs1011@gmail.com>, updated by Michael Pratt <mpratt@google.com>. Change-Id: I5383840272309df70a29f67b25e8221f933622cd PiperOrigin-RevId: 213071072	2018-09-14 17:39:25 -07:00
Tamir Duberstein	75c66f871b	Remove buffer.Prependable.UsedBytes It is the same as buffer.Prependable.View. PiperOrigin-RevId: 213064166 Change-Id: Ib33b8a2c4da864209d9a0be0a1c113be10b520d3	2018-09-14 16:39:56 -07:00
Michael Pratt	3aa50f18a4	Reuse readlink parameter, add sockaddr max. PiperOrigin-RevId: 213058623 Change-Id: I522598c655d633b9330990951ff1c54d1023ec29	2018-09-14 16:00:02 -07:00
Tamir Duberstein	d7a05b4e63	Pass buffer.Prependable by value PiperOrigin-RevId: 213053370 Change-Id: I60ea89572b4fca53fd126c870fcbde74fcf52562	2018-09-14 15:23:58 -07:00
Nicolas Lacasse	b84bfa570d	Make gVisor hard link check match Linux's. Linux permits hard-linking if the target is owned by the user OR the target has Read+Write permission. PiperOrigin-RevId: 213024613 Change-Id: If642066317b568b99084edd33ee4e8822ec9cbb3	2018-09-14 12:29:46 -07:00
Jamie Liu	0380bcb3a4	Fix interaction between rt_sigtimedwait and ignored signals. PiperOrigin-RevId: 213011782 Change-Id: I716c6ea3c586b0c6c5a892b6390d2d11478bc5af	2018-09-14 11:10:50 -07:00
Chenggang	faa34a0738	platform/kvm: Get max vcpu number dynamically by ioctl The old kernel version, such as 4.4, only support 255 vcpus. While gvisor is ran on these kernels, it could panic because the vcpu id and vcpu number beyond max_vcpus. Use ioctl(vmfd, _KVM_CHECK_EXTENSION, _KVM_CAP_MAX_VCPUS) to get max vcpus number dynamically. Change-Id: I50dd859a11b1c2cea854a8e27d4bf11a411aa45c PiperOrigin-RevId: 212929704	2018-09-13 21:47:11 -07:00
Ian Gudger	29a7271f5d	Plumb monotonic time to netstack Netstack needs to be portable, so this seems to be preferable to using raw system calls. PiperOrigin-RevId: 212917409 Change-Id: I7b2073e7db4b4bf75300717ca23aea4c15be944c	2018-09-13 19:12:15 -07:00
Lantao Liu	bde2a91433	runsc: Support container signal/wait. This CL: 1) Fix `runsc wait`, it now also works after the container exits; 2) Generate correct container state in Load; 2) Make sure `Destory` cleanup everything before successfully return. PiperOrigin-RevId: 212900107 Change-Id: Ie129cbb9d74f8151a18364f1fc0b2603eac4109a	2018-09-13 16:38:03 -07:00
Rahat Mahmood	adf8f33970	Extend memory usage events to report mapped memory usage. PiperOrigin-RevId: 212887555 Change-Id: I3545383ce903cbe9f00d9b5288d9ef9a049b9f4f	2018-09-13 15:16:47 -07:00
Michael Pratt	9c6b38e295	Format struct itimerspec PiperOrigin-RevId: 212874745 Change-Id: I0c3e8e6a9e8976631cee03bf0b8891b336ddb8c8	2018-09-13 14:07:47 -07:00
Nicolas Lacasse	e2d79480f5	initArgs must hold a reference on the Root if it is not nil. The contract in ExecArgs says that a reference on ExecArgs.Root must be held for the lifetime of the struct, but the caller is free to drop the ref after that. As a result, proc.Exec must take an additional ref on Root when it constructs the CreateProcessArgs, since that holds a pointer to Root as well. That ref is dropped in CreateProcess. PiperOrigin-RevId: 212828348 Change-Id: I7f44a612f337ff51a02b873b8a845d3119408707	2018-09-13 09:50:35 -07:00
Tamir Duberstein	d689f8422f	Always pass buffer.VectorisedView by value PiperOrigin-RevId: 212757571 Change-Id: I04200df9e45c21eb64951cd2802532fa84afcb1a	2018-09-12 21:57:55 -07:00
Tamir Duberstein	5adb3468d4	Add multicast support PiperOrigin-RevId: 212750821 Change-Id: I822fd63e48c684b45fd91f9ce057867b7eceb792	2018-09-12 20:39:24 -07:00
Zhaozhong Ni	9dec7a3db9	compressio: stop worker-pool reference / dependency loop. PiperOrigin-RevId: 212732300 Change-Id: I9a0b9b7c28e7b7439d34656dd4f2f6114d173e22	2018-09-12 17:24:53 -07:00
Kevin Krakauer	2eff1fdd06	runsc: Add exec flag that specifies where to save the sandbox-internal pid. This is different from the existing -pid-file flag, which saves a host pid. PiperOrigin-RevId: 212713968 Change-Id: I2c486de8dd5cfd9b923fb0970165ef7c5fc597f0	2018-09-12 15:23:35 -07:00
Michael Pratt	0efde2bfbd	Remove getdents from filters It was only used by whitelistfs, which was removed in `bc81f3fe4a`. PiperOrigin-RevId: 212666374 Change-Id: Ia35e6dc9d68c1a3b015d5b5f71ea3e68e46c5bed	2018-09-12 10:51:25 -07:00
Tamir Duberstein	cbf3980464	Prevent UDP sockets from binding to bound ports PiperOrigin-RevId: 212653818 Change-Id: Ib4e1d754d9cdddeaa428a066cb675e6ec44d91ad	2018-09-12 09:39:01 -07:00
Michael Pratt	b4aed01bf2	Rollback of changelist 212483372 PiperOrigin-RevId: 212557844 Change-Id: I414de848e75d57ecee2c05e851d05b607db4aa57	2018-09-11 17:54:50 -07:00
Nicolas Lacasse	6cc9b311af	platform: Pass device fd into platform constructor. We were previously openining the platform device (i.e. /dev/kvm) inside the platfrom constructor (i.e. kvm.New). This requires that we have RW access to the platform device when constructing the platform. However, now that the runsc sandbox process runs as user "nobody", it is not able to open the platform device. This CL changes the kvm constructor to take the platform device FD, rather than opening the device file itself. The device file is opened outside of the sandbox and passed to the sandbox process. PiperOrigin-RevId: 212505804 Change-Id: I427e1d9de5eb84c84f19d513356e1bb148a52910	2018-09-11 13:09:46 -07:00
Fabricio Voznika	c44bc6612f	Allow fstatat back in syscall filters PiperOrigin-RevId: 212483372 Change-Id: If95f32a8e41126cf3dc8bd6c8b2fb0fcfefedc6d	2018-09-11 11:05:09 -07:00
Jamie Liu	a29c39aa62	Map committed chunks concurrently in FileMem.LoadFrom. PiperOrigin-RevId: 212345401 Change-Id: Iac626ee87ba312df88ab1019ade6ecd62c04c75c	2018-09-10 15:23:44 -07:00
Fabricio Voznika	7e9e6745ca	Allow '/dev/zero' to be mapped with unaligned length PiperOrigin-RevId: 212321271 Change-Id: I79d71c2e6f4b8fcd3b9b923fe96c2256755f4c48	2018-09-10 13:24:55 -07:00
Bert Muthalaly	da9ecb748c	Simplify some code in VectorisedView#ToView. PiperOrigin-RevId: 212317717 Change-Id: Ic77449c53bf2f8be92c9f0a7a726c45bd35ec435	2018-09-10 13:04:06 -07:00
Nicolas Lacasse	e198f9ab02	runsc: Chmod all mounted files to 777 inside chroot. Inside the chroot, we run as user nobody, so all mounted files and directories must be accessible to all users. PiperOrigin-RevId: 212284805 Change-Id: I705e0dbbf15e01e04e0c7f378a99daffe6866807	2018-09-10 10:00:16 -07:00
Nicolas Lacasse	0c0c942327	Automated rollback of changelist 212059579 PiperOrigin-RevId: 212069131 Change-Id: I01476f957bbf29d4ee5a3c11d59d4f863ba9f2df	2018-09-07 18:23:27 -07:00
Michael Pratt	7045828a31	Update cleanup TODO PiperOrigin-RevId: 212068327 Change-Id: I3f360cdf7d6caa1c96fae68ae3a1caaf440f0cbe	2018-09-07 18:14:57 -07:00
Nicolas Lacasse	922d8c3c8c	Automated rollback of changelist 211992321 PiperOrigin-RevId: 212066419 Change-Id: Icded56e7e117bfd9b644e6541bddcd110460a9b8	2018-09-07 17:56:07 -07:00
Nicolas Lacasse	9751b800a6	runsc: Support multi-container exec. We must use a context.Context with a Root Dirent that corresponds to the container's chroot. Previously we were using the root context, which does not have a chroot. Getting the correct context required refactoring some of the path-lookup code. We can't lookup the path without a context.Context, which requires kernel.CreateProcArgs, which we only get inside control.Execute. So we have to do the path lookup much later than we previously were. PiperOrigin-RevId: 212064734 Change-Id: I84a5cfadacb21fd9c3ab9c393f7e308a40b9b537	2018-09-07 17:39:54 -07:00
Fabricio Voznika	cf5006ff24	Disable test until we figure out what's broken PiperOrigin-RevId: 212059579 Change-Id: I052c2192d3483d7bd0fd2232ef2023a12da66446	2018-09-07 17:00:41 -07:00
Fabricio Voznika	172860a059	Add 'Starting gVisor...' message to syslog This allows applications to verify they are running with gVisor. It also helps debugging when running with a mix of container runtimes. Closes #54 PiperOrigin-RevId: 212059457 Change-Id: I51d9595ee742b58c1f83f3902ab2e2ecbd5cedec	2018-09-07 16:59:27 -07:00
Adin Scannell	6cfb5cd56d	Add additional sanity checks for walk. PiperOrigin-RevId: 212058684 Change-Id: I319709b9ffcfccb3231bac98df345d2a20eca24b	2018-09-07 16:53:12 -07:00

... 3 4 5 6 7 ...

744 Commits All Branches Search

744 Commits

All Branches