gvisor

Commit Graph

Author	SHA1	Message	Date
Adin Scannell	d811c1016d	ptrace: drop old FIXME The globalPool uses a sync.Once mechanism for initialization, and no cleanup is strictly required. It's not really feasible to have the platform implement a full creation -> destruction cycle (due to the way filters are assumed to be installed), so drop the FIXME. PiperOrigin-RevId: 236385278 Change-Id: I98ac660ed58cc688d8a07147d16074a3e8181314	2019-03-01 15:05:18 -08:00
Fabricio Voznika	3b44377eda	Fix "-c dbg" build break Remove allocation from vCPU.die() to save stack space. Closes #131 PiperOrigin-RevId: 236238102 Change-Id: Iafca27a1a3a472d4cb11dcda9a2060e585139d11	2019-02-28 18:38:34 -08:00
Michael Pratt	f7df9d72cf	Upgrade to Go 1.12 PiperOrigin-RevId: 236218980 Change-Id: I82cb4aeb2a56524ee1324bfea2ad41dce26db354	2019-02-28 16:26:14 -08:00
Ruidong Cao	a2b794b30d	FPE_INTOVF (integer overflow) should be 2 refer to Linux. Signed-off-by: Ruidong Cao <crdfrank@gmail.com> Change-Id: I03f8ab25cf29257b31f145cf43304525a93f3300 PiperOrigin-RevId: 235763203	2019-02-26 11:48:49 -08:00
Jamie Liu	0e84ae72e0	Improve safecopy sanity checks. - Fix CopyIn/CopyOut/ZeroOut range checks. - Include the faulting signal number in the panic message. PiperOrigin-RevId: 233829501 Change-Id: I8959ead12d05dbd4cd63c2b908cddeb2a27eb513	2019-02-13 14:25:15 -08:00
Michael Pratt	2a0c69b19f	Remove license comments Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses($.$)./licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0	2019-01-31 11:12:53 -08:00
Fabricio Voznika	03226cd950	Add BPFAction type with Stringer PiperOrigin-RevId: 226018694 Change-Id: I98965e26fe565f37e98e5df5f997363ab273c91b	2018-12-18 10:28:28 -08:00
Haibo Xu	52fe3b87a4	Add safecopy support for arm64 platform. Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: I565214581eeb44045169da7f44d45a489082ac3a PiperOrigin-RevId: 224938170	2018-12-10 21:35:02 -08:00
Michael Pratt	99d5958693	Validate FS_BASE in Task.Clone arch_prctl already verified that the new FS_BASE was canonical, but Task.Clone did not. Centralize these checks in the arch packages. Failure to validate could cause an error in PTRACE_SET_REGS when we try to switch to the app. PiperOrigin-RevId: 224862398 Change-Id: Iefe63b3f9aa6c4810326b8936e501be3ec407f14	2018-12-10 12:37:16 -08:00
Michael Pratt	076f107643	Remove initRegs arg from clone It is always the same as t.initRegs. PiperOrigin-RevId: 224085550 Change-Id: I5cc4ddc3b481d4748c3c43f6f4bb50da1dbac694	2018-12-04 18:53:43 -08:00
Haibo Xu	9e0f132377	Add procid support for arm64 platform Change-Id: I7c3db8dfdf95a125d7384c1d67c3300dbb99a47e PiperOrigin-RevId: 223039923	2018-11-27 12:46:39 -08:00
Fabricio Voznika	eaac94d91c	Use RET_KILL_PROCESS if available in kernel RET_KILL_THREAD doesn't work well for Go because it will kill only the offending thread and leave the process hanging. RET_TRAP can be masked out and it's not guaranteed to kill the process. RET_KILL_PROCESS is available since 4.14. For older kernel, continue to use RET_TRAP as this is the best option (likely to kill process, easy to debug). PiperOrigin-RevId: 222357867 Change-Id: Icc1d7d731274b16c2125b7a1ba4f7883fbdb2cbd	2018-11-20 22:56:51 -08:00
Michael Pratt	03c1eb78b5	Reference upstream licenses Include copyright notices and the referenced LICENSE file. PiperOrigin-RevId: 222171321 Change-Id: I0cc0b167ca51b536d1087bf1c4742fdf1430bc2a	2018-11-20 14:05:16 -08:00
Adin Scannell	fb613020c7	kvm: simplify floating point logic. This reduces the number of floating point save/restore cycles required (since we don't need to restore immediately following the switch, this always happens in a known context) and allows the kernel hooks to capture state. This lets us remove calls like "Current()". PiperOrigin-RevId: 219552844 Change-Id: I7676fa2f6c18b9919718458aa888b832a7db8cab	2018-10-31 15:59:23 -07:00
Adin Scannell	c4bbb54168	kvm: add detailed traces on vCPU errors. This improves debuggability greatly. PiperOrigin-RevId: 219551560 Change-Id: I2ecaffdd1c17b0d9f25911538ea6f693e2bc699f	2018-10-31 15:50:10 -07:00
Adin Scannell	e9dbd5ab67	kvm: avoid siginfo allocations. PiperOrigin-RevId: 219492587 Change-Id: I47f6fc0b74a4907ab0aff03d5f26453bdb983bb5	2018-10-31 10:08:06 -07:00
Adin Scannell	0091db9cbd	kvm: use private futexes. Use private futexes for performance and to align with other runtime uses. PiperOrigin-RevId: 219422634 Change-Id: Ief2af5e8302847ea6dc246e8d1ee4d64684ca9dd	2018-10-30 22:46:42 -07:00
Adin Scannell	e7191f058f	Use TRAP to simplify vsyscall emulation. PiperOrigin-RevId: 218592058 Change-Id: I373a2d813aa6cc362500dd5a894c0b214a1959d7	2018-10-24 15:52:44 -07:00
Nicolas Lacasse	4a1a2dead9	Run ptrace stubs in their own session and process group. Pseudoterminal job control signals are meant to be received and handled by the sandbox process, but if the ptrace stubs are running in the same process group, they will receive the signals as well and inject them into the sentry kernel. This can result in duplicate signals being delivered (often to the wrong process), or a sentry panic if the ptrace stub is inactive. This CL makes the ptrace stub run in a new session. PiperOrigin-RevId: 218536851 Change-Id: Ie593c5687439bbfbf690ada3b2197ea71ed60a0e	2018-10-24 10:42:35 -07:00
Adin Scannell	75cd70ecc9	Track paths and provide a rename hook. This change also adds extensive testing to the p9 package via mocks. The sanity checks and type checks are moved from the gofer into the core package, where they can be more easily validated. PiperOrigin-RevId: 218296768 Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55	2018-10-23 00:20:15 -07:00
Ian Gudger	8fce67af24	Use correct company name in copyright header PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035	2018-10-19 16:35:11 -07:00
Adin Scannell	463e73d46d	Add seccomp filter configuration to ptrace stubs. This is a defense-in-depth measure. If the sentry is compromised, this prevents system call injection to the stubs. There is some complexity with respect to ptrace and seccomp interactions, so this protection is not really available for kernel versions < 4.8; this is detected dynamically. Note that this also solves the vsyscall emulation issue by adding in appropriate trapping for those system calls. It does mean that a compromised sentry could theoretically inject these into the stub (ignoring the trap and resume, thereby allowing execution), but they are harmless. PiperOrigin-RevId: 216647581 Change-Id: Id06c232cbac1f9489b1803ec97f83097fcba8eb8	2018-10-10 22:40:28 -07:00
Fabricio Voznika	da20559137	Provide better message when memfd_create fails with ENOSYS Updates #100 PiperOrigin-RevId: 213414821 Change-Id: I90c2e6c18c54a6afcd7ad6f409f670aa31577d37	2018-09-18 02:09:28 -07:00
newmanwang	de5a590ee2	Avoid reuse of pending SignalInfo objects runApp.execute -> Task.SendSignal -> sendSignalLocked -> sendSignalTimerLocked -> pendingSignals.enqueue assumes that it owns the arch.SignalInfo returned from platform.Context.Switch. On the other hand, ptrace.context.Switch assumes that it owns the returned SignalInfo and can safely reuse it on the next call to Switch. The KVM platform always returns a unique SignalInfo. This becomes a problem when the returned signal is not immediately delivered, allowing a future signal in Switch to change the previous pending SignalInfo. This is noticeable in #38 when external SIGINTs are delivered from the PTY slave FD. Note that the ptrace stubs are in the same process group as the sentry, so they are eligible to receive the PTY signals. This should probably change, but is not the only possible cause of this bug. Updates #38 Original change by newmanwang <wcs1011@gmail.com>, updated by Michael Pratt <mpratt@google.com>. Change-Id: I5383840272309df70a29f67b25e8221f933622cd PiperOrigin-RevId: 213071072	2018-09-14 17:39:25 -07:00
Chenggang	faa34a0738	platform/kvm: Get max vcpu number dynamically by ioctl Old kernel versions, such as 4.4, only support 255 vcpus. When gvisor is run on these kernels, it could panic because the vcpu id and vcpu number go beyond max_vcpus. Use ioctl(vmfd, _KVM_CHECK_EXTENSION, _KVM_CAP_MAX_VCPUS) to get max vcpus number dynamically. Change-Id: I50dd859a11b1c2cea854a8e27d4bf11a411aa45c PiperOrigin-RevId: 212929704	2018-09-13 21:47:11 -07:00
Nicolas Lacasse	6cc9b311af	platform: Pass device fd into platform constructor. We were previously opening the platform device (i.e. /dev/kvm) inside the platform constructor (i.e. kvm.New). This requires that we have RW access to the platform device when constructing the platform. However, now that the runsc sandbox process runs as user "nobody", it is not able to open the platform device. This CL changes the kvm constructor to take the platform device FD, rather than opening the device file itself. The device file is opened outside of the sandbox and passed to the sandbox process. PiperOrigin-RevId: 212505804 Change-Id: I427e1d9de5eb84c84f19d513356e1bb148a52910	2018-09-11 13:09:46 -07:00
Jamie Liu	a29c39aa62	Map committed chunks concurrently in FileMem.LoadFrom. PiperOrigin-RevId: 212345401 Change-Id: Iac626ee87ba312df88ab1019ade6ecd62c04c75c	2018-09-10 15:23:44 -07:00
Michael Pratt	25a8e13a78	Bump to Go 1.11 The procid offset is unchanged. PiperOrigin-RevId: 210551969 Change-Id: I33ba1ce56c2f5631b712417d870aa65ef24e6022	2018-08-28 09:22:41 -07:00
Adin Scannell	a7a8d07d7d	Add separate Recycle method for allocator. This improves debugging for pagetable-related issues. PiperOrigin-RevId: 209827795 Change-Id: I4cfa11664b0b52f26f6bc90a14c5bb106f01e038	2018-08-22 14:16:04 -07:00
Adin Scannell	dbbe9ec915	Protect PCIDs with a mutex. Because the Drop method may be called across vCPUs, it is necessary to protect the PCID database with a mutex to prevent concurrent modification. The PCID is assigned prior to entersyscall, so it's safe to block. PiperOrigin-RevId: 207992864 Change-Id: I8b36d55106981f51e30dcf03e12886330bb79d67	2018-08-08 21:29:19 -07:00
ShiruRen	3ec074897f	Fix a bug in PCIDs.Assign Store the new assigned pcid in p.cache[pt]. Signed-off-by: ShiruRen <renshiru2000@gmail.com> Change-Id: I4aee4e06559e429fb5e90cb9fe28b36139e3b4b6 PiperOrigin-RevId: 207563833	2018-08-06 10:11:56 -07:00
Zhaozhong Ni	57d0fcbdbf	Automated rollback of changelist 207037226 PiperOrigin-RevId: 207125440 Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49	2018-08-02 10:42:48 -07:00
Michael Pratt	60add78980	Automated rollback of changelist 207007153 PiperOrigin-RevId: 207037226 Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428	2018-08-01 19:57:32 -07:00
Zhaozhong Ni	b9e1cf8404	stateify: convert all packages to use explicit mode. PiperOrigin-RevId: 207007153 Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9	2018-08-01 15:43:24 -07:00
Zhaozhong Ni	be7fcbc558	stateify: support explicit annotation mode; convert refs and stack packages. We have been unnecessarily creating too many savable types implicitly. PiperOrigin-RevId: 206334201 Change-Id: Idc5a3a14bfb7ee125c4f2bb2b1c53164e46f29a8	2018-07-27 10:17:21 -07:00
Fabricio Voznika	d7a34790a0	Add KVM and overlay dimensions to container_test PiperOrigin-RevId: 205714667 Change-Id: I317a2ca98ac3bdad97c4790fcc61b004757d99ef	2018-07-23 13:31:42 -07:00
Michael Pratt	733ebe7c09	Merge FileMem.usage in IncRef Per the doc, usage must be kept maximally merged. Beyond that, it is simply a good idea to keep fragmentation in usage to a minimum. The glibc malloc allocator allocates one page at a time, potentially causing lots of fragmentation. However, those pages are likely to have the same number of references, often making it possible to merge ranges. PiperOrigin-RevId: 204960339 Change-Id: I03a050cf771c29a4f05b36eaf75b1a09c9465e14	2018-07-17 13:03:59 -07:00
Adin Scannell	29e00c943a	Add CPUID faulting for ptrace and KVM. PiperOrigin-RevId: 204858314 Change-Id: I8252bf8de3232a7a27af51076139b585e73276d4	2018-07-16 22:02:58 -07:00
Michael Pratt	14d06064d2	Start allocation and reclaim scans only where they may find a match If usageSet is heavily fragmented, findUnallocatedRange and findReclaimable can spend excessive cycles linearly scanning the set for unallocated/free pages. Improve common cases by beginning the scan only at the first page that could possibly contain an unallocated/free page. This metadata only guarantees that there is no lower unallocated/free page, but a scan may still be required (especially for multi-page allocations). That said, this heuristic can still provide significant performance improvements for certain applications. PiperOrigin-RevId: 204841833 Change-Id: Ic41ad33bf9537ecd673a6f5852ab353bf63ea1e6	2018-07-16 18:19:01 -07:00
Jamie Liu	ee0ef506d4	Add MemoryManager.Pin. PiperOrigin-RevId: 204162313 Change-Id: Ib0593dde88ac33e222c12d0dca6733ef1f1035dc	2018-07-11 11:52:09 -07:00
Adin Scannell	dc33d71f8c	Change SIGCHLD to SIGKILL in ptrace stubs. If the child stubs are killed by any unmaskable signal (e.g. SIGKILL), then the parent process will similarly be killed, resulting in the death of all other stubs. The effect of this is that if the OOM killer selects and kills a stub, the effect is the same as though the OOM killer selected and killed the sentry. PiperOrigin-RevId: 202219984 Change-Id: I0b638ce7e59e0a0f4d5cde12a7d05242673049d7	2018-06-26 16:54:44 -07:00
Adin Scannell	be76cad5bc	Make KVM more scalable by removing CPU cap. Instead, CPUs will be created dynamically. We also allow a relatively efficient mechanism for stealing and notifying when a vCPU becomes available via unlock. Since the number of vCPUs is no longer fixed at machine creation time, we make the dirtySet packing more efficient. This has the pleasant side effect of cutting out the unsafe address space code. PiperOrigin-RevId: 201266691 Change-Id: I275c73525a4f38e3714b9ac0fd88731c26adfe66	2018-06-19 17:00:30 -07:00
Adin Scannell	b31ac4e1df	Use notify explicitly on unlock path. There are circumstances under which the redpill call will not generate the appropriate action and notification. Replace this call with an explicit notification, which is guaranteed to transition as well as perform the futex wake. PiperOrigin-RevId: 200726934 Change-Id: Ie19e008a6007692dd7335a31a8b59f0af6e54aaa	2018-06-15 09:30:08 -07:00
Adin Scannell	7b7b199ed0	Deflake kvm_test. PiperOrigin-RevId: 200439846 Change-Id: I9970fe0716cb02f0f41b754891d55db7e0729f56	2018-06-13 13:05:33 -07:00
Jamie Liu	55b9058456	Log filemem state when panicking due to invalid refcount. PiperOrigin-RevId: 200408305 Change-Id: I676ee49ec77697105723577928c7f82088cd378e	2018-06-13 10:03:54 -07:00
Adin Scannell	41f766893a	Minor ring0 interface cleanup. - Remove unused methods. - Provide declaration for asm function. PiperOrigin-RevId: 200146850 Change-Id: Ic455c96ffe0d2e78ef15f824eb65d7de705b054a	2018-06-11 18:17:15 -07:00
Adin Scannell	1397a413b4	Make page tables split-safe. In order to minimize the likelihood of exit during page table modifications, make the full set of page table functions split-safe. This is not strictly necessary (and you may still incur splits due to allocations from the allocator pool) but should make retries a very rare occurrence. PiperOrigin-RevId: 200146688 Change-Id: I8fa36aa16b807beda2f0b057be60038258e8d597	2018-06-11 18:15:14 -07:00
Adin Scannell	09b0a9c320	Handle all exception vectors. PiperOrigin-RevId: 200144655 Change-Id: I5a753c74b75007b7714d6fe34aa0d2e845dc5c41	2018-06-11 17:57:19 -07:00
Adin Scannell	c0ab059e7b	Fix kernel flags handling and add missing vectors. PiperOrigin-RevId: 199877174 Change-Id: I9d19ea301608c2b989df0a6123abb1e779427853	2018-06-08 17:51:50 -07:00
Adin Scannell	d269845159	Ensure guest-mode for page table modifications. Because of the KVM shadow page table implementation, modifications made to guest page tables from host mode may not be synchronized correctly, resulting in undefined behavior. This is a KVM bug: page table pages should also be tracked for host modifications and resynced appropriately (e.g. the guest could "DMA" into a page table page in theory). However, since we can't rely on this being fixed everywhere, work around the issue by forcing page table modifications to be in guest mode. This will generally be the case anyways, but now if an exit occurs during modifications, we will re-enter and perform the modifications again. PiperOrigin-RevId: 199587895 Change-Id: I83c20b4cf2a9f9fa56f59f34939601dd34538fb0	2018-06-06 23:26:14 -07:00

67 Commits