Commit Graph

51 Commits

Author SHA1 Message Date
Michael Pratt df5d377521 Remove go_test from go_stateify and go_marshal
They are no-ops, so the standard rule works fine.

PiperOrigin-RevId: 268776264
2019-09-12 15:10:17 -07:00
Ayush Ranjan 661b2b9f69 procfs: Migrate seqfile implementations.
Migrates all (except 3) seqfile implementations to the vfs.DynamicBytesSource
interface. There should not be any change in functionality due to this migration
itself.

Please note that the following seqfile implementations have not been migrated:
- /proc/filesystems in proc/filesystems.go
- /proc/[pid]/mountinfo in proc/mounts.go
- /proc/[pid]/mounts in proc/mounts.go
This is because these depend on pending changes in /pkg/senty/vfs.

PiperOrigin-RevId: 263880719
2019-08-16 17:36:42 -07:00
Ian Gudger 45566fa4e4 Add finalizer on AtomicRefCount to check for leaks.
PiperOrigin-RevId: 255711454
2019-06-28 20:07:52 -07:00
Michael Pratt 5b41ba5d0e Fix various spelling issues in the documentation
Addresses obvious typos, in the documentation only.

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
2019-06-27 14:25:50 -07:00
Neel Natu 0b2135072d Implement madvise(MADV_DONTFORK)
PiperOrigin-RevId: 254253777
2019-06-20 12:56:00 -07:00
Adin Scannell add40fd6ad Update canonical repository.
This can be merged after:
https://github.com/google/gvisor-website/pull/77
  or
https://github.com/google/gvisor-website/pull/78

PiperOrigin-RevId: 253132620
2019-06-13 16:50:15 -07:00
Jamie Liu b3f104507d "Implement" mbind(2).
We still only advertise a single NUMA node, and ignore mempolicy
accordingly, but mbind() at least now succeeds and has effects reflected
by get_mempolicy().

Also fix handling of nodemasks: round sizes to unsigned long (as
documented and done by Linux), and zero trailing bits when copying them
out.

PiperOrigin-RevId: 251950859
2019-06-06 16:29:46 -07:00
Michael Pratt d3ed9baac0 Implement dumpability tracking and checks
We don't actually support core dumps, but some applications want to
get/set dumpability, which still has an effect in procfs.

Lack of support for set-uid binaries or fs creds simplifies things a
bit.

As-is, processes started via CreateProcess (i.e., init and sentryctl
exec) have normal dumpability. I'm a bit torn on whether sentryctl exec
tasks should be dumpable, but at least since they have no parent normal
UID/GID checks should protect them.

PiperOrigin-RevId: 251712714
2019-06-05 14:00:13 -07:00
Andrei Vagin 8e926e3f74 gvisor: validate a new map region in the mremap syscall
Right now, mremap allows to remap a memory region over MaxUserAddress,
this means that we can change the stub region.

PiperOrigin-RevId: 251266886
2019-06-03 10:59:46 -07:00
chris.zn b18df9bed6 Add VmData field to /proc/{pid}/status
VmData is the size of private data segments.
It has the same meaning as in Linux.

Change-Id: Iebf1ae85940a810524a6cde9c2e767d4233ddb2a
PiperOrigin-RevId: 250593739
2019-05-30 12:07:40 -07:00
Jamie Liu 14f0e7618e Ensure all uses of MM.brk occur under MM.mappingMu in MM.Brk().
PiperOrigin-RevId: 246921386
Change-Id: I71d8908858f45a9a33a0483470d0240eaf0fd012
2019-05-06 16:39:43 -07:00
Michael Pratt 4d52a55201 Change copyright notice to "The gVisor Authors"
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.

1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.

Fixes #209

PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29 14:26:23 -07:00
Nicolas Lacasse f4ce43e1f4 Allow and document bug ids in gVisor codebase.
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-29 14:04:14 -07:00
Michael Pratt cc48969bb7 Internal change
PiperOrigin-RevId: 242978508
Change-Id: I0ea59ac5ba1dd499e87c53f2e24709371048679b
2019-04-10 18:00:18 -07:00
Kevin Krakauer f7aff0aaa4 Allow threads with CAP_SYS_RESOURCE to raise hard rlimits.
PiperOrigin-RevId: 242919489
Change-Id: Ie3267b3bcd8a54b54bc16a6556369a19e843376f
2019-04-10 12:36:45 -07:00
Jamie Liu b4006686d2 Don't expand COW-break on executable VMAs.
PiperOrigin-RevId: 241403847
Change-Id: I4631ca05734142da6e80cdfa1a1d63ed68aa05cc
2019-04-01 14:47:31 -07:00
Michael Pratt d11ef20a93 Drop reference on shared anon mappable
We call NewSharedAnonMappable simply to use it for Mappable/MappingIdentity for
shared anon mmap. From MMapOpts.MappingIdentity: "If MMapOpts is used to
successfully create a memory mapping, a reference is taken on MappingIdentity."

mm.createVMALocked (below) takes this additional reference, so we don't need
the reference returned by NewSharedAnonMappable. Holding it leaks the mappable.

PiperOrigin-RevId: 241038108
Change-Id: I78ee3af78e0cc7aac4063b274b30d0e41eb5677d
2019-03-29 13:17:56 -07:00
Jamie Liu f3723f8059 Call memmap.Mappable.Translate with more conservative usermem.AccessType.
MM.insertPMAsLocked() passes vma.maxPerms to memmap.Mappable.Translate
(although it unsets AccessType.Write if the vma is private). This
somewhat simplifies handling of pmas, since it means only COW-break
needs to replace existing pmas. However, it also means that a MAP_SHARED
mapping of a file opened O_RDWR dirties the file, regardless of the
mapping's permissions and whether or not the mapping is ever actually
written to with I/O that ignores permissions (e.g.
ptrace(PTRACE_POKEDATA)).

To fix this:

- Change the pma-getting path to request only the permissions that are
required for the calling access.

- Change memmap.Mappable.Translate to take requested permissions, and
return allowed permissions. This preserves the existing behavior in the
common cases where the memmap.Mappable isn't
fsutil.CachingInodeOperations and doesn't care if the translated
platform.File pages are written to.

- Change the MM.getPMAsLocked path to support permission upgrading of
pmas outside of copy-on-write.

PiperOrigin-RevId: 240196979
Change-Id: Ie0147c62c1fbc409467a6fa16269a413f3d7d571
2019-03-25 12:42:43 -07:00
Jamie Liu 8f4634997b Decouple filemem from platform and move it to pgalloc.MemoryFile.
This is in preparation for improved page cache reclaim, which requires
greater integration between the page cache and page allocator.

PiperOrigin-RevId: 238444706
Change-Id: Id24141b3678d96c7d7dc24baddd9be555bffafe4
2019-03-14 08:12:48 -07:00
Jamie Liu 8930e79ebf Clarify the platform.File interface.
- Redefine some memmap.Mappable, platform.File, and platform.Memory
semantics in terms of File reference counts (no functional change).

- Make AddressSpace.MapFile take a platform.File instead of a raw FD,
and replace platform.File.MapInto with platform.File.FD. This allows
kvm.AddressSpace.MapFile to always use platform.File.MapInternal instead
of maintaining its own (redundant) cache of file mappings in the sentry
address space.

PiperOrigin-RevId: 238044504
Change-Id: Ib73a11e4275c0da0126d0194aa6c6017a9cef64f
2019-03-12 10:29:16 -07:00
Fabricio Voznika 0b76887147 Priority-inheritance futex implementation
It is Implemented without the priority inheritance part given
that gVisor defers scheduling decisions to Go runtime and doesn't
have control over it.

PiperOrigin-RevId: 236989545
Change-Id: I714c8ca0798743ecf3167b14ffeb5cd834302560
2019-03-05 23:40:18 -08:00
Michael Pratt fe1369ac98 Move package sync to third_party
PiperOrigin-RevId: 231889261
Change-Id: I482f1df055bcedf4edb9fe3fe9b8e9c80085f1a0
2019-01-31 17:49:14 -08:00
Michael Pratt 2a0c69b19f Remove license comments
Nothing reads them and they can simply get stale.

Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD

PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-31 11:12:53 -08:00
Jamie Liu 901ed5da44 Implement /proc/[pid]/smaps.
PiperOrigin-RevId: 228245523
Change-Id: I5a4d0a6570b93958e51437e917e5331d83e23a7e
2019-01-07 15:17:44 -08:00
Jamie Liu 9a442fa4b5 Automated rollback of changelist 226224230
PiperOrigin-RevId: 226493053
Change-Id: Ia98d1cb6dd0682049e4d907ef69619831de5c34a
2018-12-21 08:23:34 -08:00
Googler 86c9bd2547 Automated rollback of changelist 225861605
PiperOrigin-RevId: 226224230
Change-Id: Id24c7d3733722fd41d5fe74ef64e0ce8c68f0b12
2018-12-19 13:30:08 -08:00
Jamie Liu 898838e34d Fix mremap expansion with mm.checkInvariants = true.
Also remove useless RSS changes in mm.movePMAsLocked().

PiperOrigin-RevId: 226052996
Change-Id: If59fd259b93238fb2f15c1c8ebfeda14cb590a87
2018-12-18 13:50:33 -08:00
Jamie Liu 3b3f026278 Truncate ar before calling mm.breakCopyOnWriteLocked().
... as required by the latter's precondition.

PiperOrigin-RevId: 226033824
Change-Id: I6bc46d0e100c61cc58cb5fc69e70c4ca905cd92d
2018-12-18 11:52:31 -08:00
Jamie Liu 2421006426 Implement mlock(), kind of.
Currently mlock() and friends do nothing whatsoever. However, mlocking
is directly application-visible in a number of ways; for example,
madvise(MADV_DONTNEED) and msync(MS_INVALIDATE) both fail on mlocked
regions. We handle this inconsistently: MADV_DONTNEED is too important
to not work, but MS_INVALIDATE is rejected.

Change MM to track mlocked regions in a manner consistent with Linux.
It still will not actually pin pages into host physical memory, but:

- mlock() will now cause sentry memory management to precommit mlocked
pages.

- MADV_DONTNEED and MS_INVALIDATE will interact with mlocked pages as
described above.

PiperOrigin-RevId: 225861605
Change-Id: Iee187204979ac9a4d15d0e037c152c0902c8d0ee
2018-12-17 11:38:59 -08:00
Rahat Mahmood 75e39eaa74 Pass information about map writableness to filesystems.
This is necessary to implement file seals for memfds.

PiperOrigin-RevId: 225239394
Change-Id: Ib3f1ab31385afc4b24e96cd81a05ef1bebbcbb70
2018-12-12 13:09:59 -08:00
Jamie Liu 23438b3632 Update MM.usageAS when mremap copies or moves a mapping.
PiperOrigin-RevId: 224221509
Change-Id: I7aaea74629227d682786d3e435737364921249bf
2018-12-05 14:27:23 -08:00
Adin Scannell bb9a2bb62e Update futex to use usermem abstractions.
This eliminates the indirection that existed in task_futex.

PiperOrigin-RevId: 221832498
Change-Id: Ifb4c926d493913aa6694e193deae91616a29f042
2018-11-20 14:02:07 -08:00
Rahat Mahmood f7aa937124 Advertise vsyscall support via /proc/<pid>/maps.
Also update test utilities for probing vsyscall support and add a
metric to see if vsyscalls are actually used in sandboxes.

PiperOrigin-RevId: 221698834
Change-Id: I57870ecc33ea8c864bd7437833f21aa1e8117477
2018-11-15 15:14:38 -08:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Jamie Liu e9e8be6613 Implement shared futexes.
- Shared futex objects on shared mappings are represented by Mappable +
  offset, analogous to Linux's use of inode + offset. Add type
  futex.Key, and change the futex.Manager bucket API to use futex.Keys
  instead of addresses.

- Extend the futex.Checker interface to be able to return Keys for
  memory mappings. It returns Keys rather than just mappings because
  whether the address or the target of the mapping is used in the Key
  depends on whether the mapping is MAP_SHARED or MAP_PRIVATE; this
  matters because using mapping target for a futex on a MAP_PRIVATE
  mapping causes it to stop working across COW-breaking.

- futex.Manager.WaitComplete depends on atomic updates to
  futex.Waiter.addr to determine when it has locked the right bucket,
  which is much less straightforward for struct futex.Waiter.key. Switch
  to an atomically-accessed futex.Waiter.bucket pointer.

- futex.Manager.Wake now needs to take a futex.Checker to resolve
  addresses for shared futexes. CLONE_CHILD_CLEARTID requires the exit
  path to perform a shared futex wakeup (Linux:
  kernel/fork.c:mm_release() => sys_futex(tsk->clear_child_tid,
  FUTEX_WAKE, ...)). This is a problem because futexChecker is in the
  syscalls/linux package. Move it to kernel.

PiperOrigin-RevId: 216207039
Change-Id: I708d68e2d1f47e526d9afd95e7fed410c84afccf
2018-10-08 10:20:38 -07:00
Fabricio Voznika 7e9e6745ca Allow '/dev/zero' to be mapped with unaligned length
PiperOrigin-RevId: 212321271
Change-Id: I79d71c2e6f4b8fcd3b9b923fe96c2256755f4c48
2018-09-10 13:24:55 -07:00
Adin Scannell c09f9acd7c Distinguish Element and Linker for ilist.
Furthermore, allow for the specification of an ElementMapper. This allows a
single "Element" type to exist on multiple inline lists, and work without
having to embed the entry type.

This is a requisite change for supporting a per-Inode list of Dirents.

PiperOrigin-RevId: 211467497
Change-Id: If2768999b43e03fdaecf8ed15f435fe37518d163
2018-09-04 09:19:11 -07:00
Zhaozhong Ni 57d0fcbdbf Automated rollback of changelist 207037226
PiperOrigin-RevId: 207125440
Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
2018-08-02 10:42:48 -07:00
Michael Pratt 60add78980 Automated rollback of changelist 207007153
PiperOrigin-RevId: 207037226
Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
2018-08-01 19:57:32 -07:00
Zhaozhong Ni b9e1cf8404 stateify: convert all packages to use explicit mode.
PiperOrigin-RevId: 207007153
Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9
2018-08-01 15:43:24 -07:00
Zhaozhong Ni be7fcbc558 stateify: support explicit annotation mode; convert refs and stack packages.
We have been unnecessarily creating too many savable types implicitly.

PiperOrigin-RevId: 206334201
Change-Id: Idc5a3a14bfb7ee125c4f2bb2b1c53164e46f29a8
2018-07-27 10:17:21 -07:00
Michael Pratt 41e0b977e5 Format documentation
PiperOrigin-RevId: 204323728
Change-Id: I1ff9aa062ffa12583b2e38ec94c87db7a3711971
2018-07-12 10:37:21 -07:00
Jamie Liu ee0ef506d4 Add MemoryManager.Pin.
PiperOrigin-RevId: 204162313
Change-Id: Ib0593dde88ac33e222c12d0dca6733ef1f1035dc
2018-07-11 11:52:09 -07:00
Nicolas Lacasse 99afc982f1 Call mm.CheckIORange() when copying in IOVecs.
CheckIORange is analagous to Linux's access_ok() method, which is checked when
copying in IOVecs in both lib/iov_iter.c:import_single_range() and
lib/iov_iter.c:import_iovec() => fs/read_write.c:rw_copy_check_uvector().

gVisor copies in IOVecs via Task.SingleIOSequence() and Task.CopyInIovecs().
We were checking the address range bounds, but not whether the address is
valid. To conform with linux, we should also check that the address is valid.

For usual preadv/pwritev syscalls, the effect of this change is not noticeable,
since we find out that the address is invalid before the syscall completes.

For vectorized async-IO operations, however, this change is necessary because
Linux returns EFAULT when the operation is submitted, but before it executes.
Thus, we must validate the iovecs when copying them in.

PiperOrigin-RevId: 202370092
Change-Id: I8759a63ccf7e6b90d90d30f78ab8935a0fcf4936
2018-06-27 14:31:35 -07:00
Jamie Liu 16882484f9 Check for empty applicationAddrRange in MM.DecUsers.
PiperOrigin-RevId: 202043776
Change-Id: I4373abbcf735dc1cf4bebbbbb0c7124df36e9e78
2018-06-25 16:50:38 -07:00
Jamie Liu fe3fc44da3 Handle mremap(old_size=0).
PiperOrigin-RevId: 201729703
Change-Id: I486900b0c6ec59533b88da225a5829c474e35a70
2018-06-22 13:08:38 -07:00
Rahat Mahmood 8878a66a56 Implement sysv shm.
PiperOrigin-RevId: 197058289
Change-Id: I3946c25028b7e032be4894d61acb48ac0c24d574
2018-05-17 15:06:19 -07:00
Michael Pratt 8deabbaae1 Remove error return from AddressSpace.Release()
PiperOrigin-RevId: 196291289
Change-Id: Ie3487be029850b0b410b82416750853a6c4a2b00
2018-05-11 12:24:15 -07:00
Jamie Liu 12c161f278 Implement MAP_32BIT.
PiperOrigin-RevId: 196281052
Change-Id: Ie620a0f983a1bf2570d0003d4754611879335c1c
2018-05-11 11:18:31 -07:00
Cyrille Hemidy 04b79137ba Fix misspellings.
PiperOrigin-RevId: 195307689
Change-Id: I499f19af49875a43214797d63376f20ae788d2f4
2018-05-03 14:06:13 -07:00