Commit Graph

218 Commits

Author SHA1 Message Date
Nicolas Lacasse a38f41b464 fs: Add new cache policy "remote_revalidate".
This CL adds a new cache-policy for gofer filesystems that uses the host page
cache, but causes dirents to be reloaded on each Walk, and does not cache
readdir results.

This policy is useful when the remote filesystem may change out from underneath
us, as any remote changes will be reflected on the next Walk.

Importantly, this cache policy is only consistent if we do not use gVisor's
internal page cache, since that page cache is tied to the Inode and may be
thrown away upon Revalidation.

This cache policy should only be used when the gofer supports donating host
FDs, since then gVisor will make use of the host kernel page cache, which will
be consistent for all open files in the gofer. In fact, a panic will be raised
if a file is opened without a donated FD.

PiperOrigin-RevId: 207752937
Change-Id: I233cb78b4695bbe00a4605ae64080a47629329b8
2018-08-07 11:43:41 -07:00
Zhaozhong Ni c348d07863 sentry: make epoll.pollEntry wait for the file operation in restore.
PiperOrigin-RevId: 207737935
Change-Id: I3a301ece1f1d30909715f36562474e3248b6a0d5
2018-08-07 10:27:37 -07:00
Brian Geffon d839dc13c6 Netstack doesn't handle sending after SHUT_WR correctly.
PiperOrigin-RevId: 207715032
Change-Id: I7b6690074c5be283145192895d706a92e921b22c
2018-08-07 07:57:20 -07:00
Michael Pratt 42086fe8e1 Make ramfs.File savable
In other news, apparently proc.fdInfo is the last user of ramfs.File.

PiperOrigin-RevId: 207564572
Change-Id: I5a92515698cc89652b80bea9a32d309e14059869
2018-08-06 10:15:56 -07:00
ShiruRen 3ec074897f Fix a bug in PCIDs.Assign
Store the new assigned pcid in p.cache[pt].

Signed-off-by: ShiruRen <renshiru2000@gmail.com>

Change-Id: I4aee4e06559e429fb5e90cb9fe28b36139e3b4b6
PiperOrigin-RevId: 207563833
2018-08-06 10:11:56 -07:00
Bhasker Hariharan 56fa562dda Cubic implementation for Netstack.
This CL implements CUBIC as described in https://tools.ietf.org/html/rfc8312.

PiperOrigin-RevId: 207353142
Change-Id: I329cbf3277f91127e99e488f07d906f6779c6603
2018-08-03 17:54:42 -07:00
Zhaozhong Ni 25178ebdf5 stateify: make explicit mode no longer optional.
PiperOrigin-RevId: 207303405
Change-Id: I17b6433963d78e3631a862b7ac80f566c8e7d106
2018-08-03 12:09:13 -07:00
Michael Pratt a3927157c5 Copy creds in access
PiperOrigin-RevId: 207181631
Change-Id: Ic6205278715a9260fb970efb414fc758ea72c4c6
2018-08-02 16:01:31 -07:00
Michael Pratt b6a37ab9d9 Update comment reference
PiperOrigin-RevId: 207180809
Change-Id: I08c264812919e81b2c56fdd4a9ef06924de8b52f
2018-08-02 15:56:40 -07:00
Zhaozhong Ni 57d0fcbdbf Automated rollback of changelist 207037226
PiperOrigin-RevId: 207125440
Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
2018-08-02 10:42:48 -07:00
Brian Geffon cf44aff6e0 Add seccomp(2) support.
Add support for the seccomp syscall and the flag SECCOMP_FILTER_FLAG_TSYNC.

PiperOrigin-RevId: 207101507
Change-Id: I5eb8ba9d5ef71b0e683930a6429182726dc23175
2018-08-02 08:10:30 -07:00
Ian Gudger 3cd7824410 Move stack clock to options struct
PiperOrigin-RevId: 207039273
Change-Id: Ib8f55a6dc302052ab4a10ccd70b07f0d73b373df
2018-08-01 20:22:02 -07:00
Michael Pratt 60add78980 Automated rollback of changelist 207007153
PiperOrigin-RevId: 207037226
Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
2018-08-01 19:57:32 -07:00
Zhaozhong Ni b9e1cf8404 stateify: convert all packages to use explicit mode.
PiperOrigin-RevId: 207007153
Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9
2018-08-01 15:43:24 -07:00
Brielle Broder 6b87378634 New conditional for adding key/value pairs to maps.
When adding MultiDeviceKeys and their values into MultiDevice maps, make
sure the keys and values have not already been added. This ensures that
preexisting key/value pairs are not overridden.

PiperOrigin-RevId: 206942766
Change-Id: I9d85f38eb59ba59f0305e6614a52690608944981
2018-08-01 09:44:57 -07:00
Andrei Vagin a7a0167716 proc: show file flags in fdinfo
Currently, there is an attempt to print FD flags, but
they are not decoded into a number, so we see something like this:

/criu # cat /proc/self/fdinfo/0
flags: {%!o(bool=000false)}

Actually, fdinfo has to contain file flags.

Change-Id: Idcbb7db908067447eb9ae6f2c3cfb861f2be1a97
PiperOrigin-RevId: 206794498
2018-07-31 11:19:15 -07:00
Zhaozhong Ni 0a55f8c1c1 netstack: support disconnect-on-save option per fdbased link.
PiperOrigin-RevId: 206659972
Change-Id: I5e0e035f97743b6525ad36bed2c802791609beaf
2018-07-30 15:43:25 -07:00
Justine Olshan 2793f7ac5f Added the O_LARGEFILE flag.
This flag will always be true for gVisor files.

PiperOrigin-RevId: 206355963
Change-Id: I2f03d2412e2609042df43b06d1318cba674574d0
2018-07-27 12:27:46 -07:00
Zhaozhong Ni be7fcbc558 stateify: support explicit annotation mode; convert refs and stack packages.
We have been unnecessarily creating too many savable types implicitly.

PiperOrigin-RevId: 206334201
Change-Id: Idc5a3a14bfb7ee125c4f2bb2b1c53164e46f29a8
2018-07-27 10:17:21 -07:00
Nicolas Lacasse 127c977ab0 Don't copy-up extended attributes that specifically configure a lower overlay.
When copying-up files from a lower fs to an upper, we also copy the extended
attributes on the file. If there is a (nested) overlay inside the lower, some
of these extended attributes configure the lower overlay, and should not be
copied-up to the upper.

In particular, whiteout attributes in the lower fs overlay should not be
copied-up, since the upper fs may actually contain the file.

PiperOrigin-RevId: 206236010
Change-Id: Ia0454ac7b99d0e11383f732a529cb195ed364062
2018-07-26 15:55:50 -07:00
Michael Pratt 7cd9405b9c Format openat flags
PiperOrigin-RevId: 206021774
Change-Id: I447b6c751c28a8d8d4d78468b756b6ad8c61e169
2018-07-25 11:07:19 -07:00
Kevin Krakauer 32aa0f5465 Typo fix.
PiperOrigin-RevId: 205880843
Change-Id: If2272b25f08a18ebe9b6309a1032dd5cdaa59866
2018-07-24 13:26:06 -07:00
Bhasker Hariharan da48c04d0d Refactor new reno congestion control logic out of sender.
This CL also puts the congestion control logic behind an
interface so that we can easily swap it out for say CUBIC
in the future.

PiperOrigin-RevId: 205732848
Change-Id: I891cdfd17d4d126b658b5faa0c6bd6083187944b
2018-07-23 15:15:07 -07:00
Fabricio Voznika d7a34790a0 Add KVM and overlay dimensions to container_test
PiperOrigin-RevId: 205714667
Change-Id: I317a2ca98ac3bdad97c4790fcc61b004757d99ef
2018-07-23 13:31:42 -07:00
Michael Pratt 5f134b3c0a Format getcwd path
PiperOrigin-RevId: 205440332
Change-Id: I2a838f363e079164c83da88e1b0b8769844fe79b
2018-07-20 12:59:41 -07:00
Adin Scannell 8b8aad91d5 kernel: mutations on creds now require a copy.
PiperOrigin-RevId: 205315612
Change-Id: I9a0a1e32c8abfb7467a38743b82449cc92830316
2018-07-19 15:48:56 -07:00
Nicolas Lacasse be431d0934 fs: Pass context to Revalidate() function.
The current revalidation logic is very simple and does not do much
introspection of the dirent being revalidated (other than looking at the type
of file).

Fancier revalidation logic is coming soon, and we need to be able to look at
the cached and uncached attributes of a given dirent, and we need a context to
perform some of these operations.

PiperOrigin-RevId: 205307351
Change-Id: If17ea1c631d8f9489c0e05a263e23d7a8a3bf159
2018-07-19 14:57:52 -07:00
Nicolas Lacasse ea37103196 ConfigureMMap on an overlay file delegates to the upper if there is no lower.
In the general case with an overlay, all mmap calls must go through the
overlay, because in the event of a copy-up, the overlay needs to invalidate any
previously-created mappings.

If there if no lower file, however, there will never be a copy-up, so the
overlay can delegate directly to the upper file in that case.

This also allows us to correctly mmap /dev/zero when it is in an overlay. This
file has special semantics which the overlay does not know about. In
particular, it does not implement Mappable(), which (in the general case) the
overlay uses to detect if a file is mappable or not.

PiperOrigin-RevId: 205306743
Change-Id: I92331649aa648340ef6e65411c2b42c12fa69631
2018-07-19 14:53:38 -07:00
Brian Geffon df5a5d388e Add AT_UID, AT_EUID, AT_GID, AT_EGID to aux vector.
With musl libc when these entries are missing from the aux vector
it's forcing libc.secure (effectively AT_SECURE). This mode prevents
RPATH and LD_LIBRARY_PATH from working.

https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c#n1488
As the first entry is a mask of all the aux fields set:
https://git.musl-libc.org/cgit/musl/tree/ldso/dynlink.c#n187

PiperOrigin-RevId: 205284684
Change-Id: I04de7bab241043306b4f732306a81d74edfdff26
2018-07-19 12:42:05 -07:00
Zhaozhong Ni a95640b1e9 sentry: save stack in proc net dev.
PiperOrigin-RevId: 205253858
Change-Id: Iccdc493b66d1b4d39de44afb1184952183b1283f
2018-07-19 09:37:32 -07:00
Justine Olshan c05660373e Moved restore code out of create and made to be called after create.
Docker expects containers to be created before they are restored.
However, gVisor restoring requires specificactions regarding the kernel
and the file system. These actions were originally in booting the sandbox.

Now setting up the file system is deferred until a call to a call to
runsc start. In the restore case, the kernel is destroyed and a new kernel
is created in the same process, as we need the same process for Docker.

These changes required careful execution of concurrent processes which
required the use of a channel.

Full docker integration still needs the ability to restore into the same
container.

PiperOrigin-RevId: 205161441
Change-Id: Ie1d2304ead7e06855319d5dc310678f701bd099f
2018-07-18 16:58:30 -07:00
Nicolas Lacasse 63e2820f7b Fix lock-ordering violation in Create by logging BaseName instead of FullName.
Dirent.FullName takes the global renameMu, but can be called during Create,
which itself takes dirent.mu and dirent.dirMu, which is a lock-order violation:

Dirent.Create
  d.dirMu.Lock
  d.mu.Lock
  Inode.Create
    gofer.inodeOperations.Create
      gofer.NewFile
        Dirent.FullName
          d.renameMu.RLock

We only use the FullName here for logging, and in this case we can get by with
logging only the BaseName.

A `BaseName` method was added to Dirent, which simply returns the name, taking
d.parent.mu as required.

In the Create pathway, we can't call d.BaseName() because taking d.parent.mu
after d.mu violates the lock order. But we already know the base name of the
file we just created, so that's OK.

In the Open/GetFile pathway, we are free to call d.BaseName() because the other
dirent locks are not held.

PiperOrigin-RevId: 205112278
Change-Id: Ib45c734081aecc9b225249a65fa8093eb4995f10
2018-07-18 11:49:50 -07:00
Michael Pratt 733ebe7c09 Merge FileMem.usage in IncRef
Per the doc, usage must be kept maximally merged. Beyond that, it is simply a
good idea to keep fragmentation in usage to a minimum.

The glibc malloc allocator allocates one page at a time, potentially causing
lots of fragmentation. However, those pages are likely to have the same number
of references, often making it possible to merge ranges.

PiperOrigin-RevId: 204960339
Change-Id: I03a050cf771c29a4f05b36eaf75b1a09c9465e14
2018-07-17 13:03:59 -07:00
Neel Natu ed2e03d378 Add API to decode 'stat.st_rdev' into major and minor numbers.
PiperOrigin-RevId: 204936533
Change-Id: Ib060920077fc914f97c4a0548a176d1368510c7b
2018-07-17 10:50:53 -07:00
Zhaozhong Ni beb89bb757 netstack: update goroutine save / restore safety comments.
PiperOrigin-RevId: 204930314
Change-Id: Ifc4c41ed28616cd57fafbf7c92e87141a945c41f
2018-07-17 10:15:00 -07:00
Adin Scannell 29e00c943a Add CPUID faulting for ptrace and KVM.
PiperOrigin-RevId: 204858314
Change-Id: I8252bf8de3232a7a27af51076139b585e73276d4
2018-07-16 22:02:58 -07:00
Michael Pratt 14d06064d2 Start allocation and reclaim scans only where they may find a match
If usageSet is heavily fragmented, findUnallocatedRange and findReclaimable
can spend excessive cycles linearly scanning the set for unallocated/free
pages.

Improve common cases by beginning the scan only at the first page that could
possibly contain an unallocated/free page. This metadata only guarantees that
there is no lower unallocated/free page, but a scan may still be required
(especially for multi-page allocations).

That said, this heuristic can still provide significant performance
improvements for certain applications.

PiperOrigin-RevId: 204841833
Change-Id: Ic41ad33bf9537ecd673a6f5852ab353bf63ea1e6
2018-07-16 18:19:01 -07:00
Neel Natu 8f21c0bb28 Add EventOperations.HostFD()
This method allows an eventfd inside the Sentry to be registered with with
the host kernel.

Update comment about memory mapping host fds via CachingInodeOperations.

PiperOrigin-RevId: 204784859
Change-Id: I55823321e2d84c17ae0f7efaabc6b55b852ae257
2018-07-16 12:20:05 -07:00
Neel Natu 5b09ec3b89 Allow a filesystem to control its visibility in /proc/filesystems.
PiperOrigin-RevId: 204508520
Change-Id: I09e5f8b6e69413370e1a0d39dbb7dc1ee0b6192d
2018-07-13 12:10:57 -07:00
Michael Pratt f09ebd9c71 Note that Mount errors do not require translations
PiperOrigin-RevId: 204490639
Change-Id: I0fe26306bae9320c6aa4f854fe0ef25eebd93233
2018-07-13 10:24:18 -07:00
Michael Pratt a28b274abb Fix aio eventfd lookup
We're failing to set eventFile in the outer scope.

PiperOrigin-RevId: 204392995
Change-Id: Ib9b04f839599ef552d7b5951d08223e2b1d5f6ad
2018-07-12 17:14:50 -07:00
Zhaozhong Ni 1cd46c8dd1 sentry: wait for restore clock instead of panicing in Timekeeper.
PiperOrigin-RevId: 204372296
Change-Id: If1ed9843b93039806e0c65521f30177dc8036979
2018-07-12 15:09:02 -07:00
Zhaozhong Ni bb41ad808a sentry: save inet stacks in proc files.
PiperOrigin-RevId: 204362791
Change-Id: If85ea7442741e299f0d7cddbc3d6b415e285da81
2018-07-12 14:19:04 -07:00
Zhaozhong Ni 45c50eb124 netstack: save tcp endpoint accepted channel directly.
PiperOrigin-RevId: 204356873
Change-Id: I5e2f885f58678e693aae1a69e8bf8084a685af28
2018-07-12 13:49:21 -07:00
Zhaozhong Ni cc34a90fb4 netstack: do not defer panicable logic in tcp main loop.
PiperOrigin-RevId: 204355026
Change-Id: I1a8229879ea3b58aa861a4eb4456fd7aff99863d
2018-07-12 13:39:28 -07:00
Michael Pratt 41e0b977e5 Format documentation
PiperOrigin-RevId: 204323728
Change-Id: I1ff9aa062ffa12583b2e38ec94c87db7a3711971
2018-07-12 10:37:21 -07:00
Bhasker Hariharan c15cb8d432 Automated rollback of changelist 203157739
PiperOrigin-RevId: 204196916
Change-Id: If632750fc6368acb835e22cfcee0ae55c8a04d16
2018-07-11 15:07:19 -07:00
Jamie Liu b9c469f372 Move ptrace constants to abi/linux.
PiperOrigin-RevId: 204188763
Change-Id: I5596ab7abb3ec9e210a7f57b3fc420e836fa43f3
2018-07-11 14:24:19 -07:00
Jamie Liu ee0ef506d4 Add MemoryManager.Pin.
PiperOrigin-RevId: 204162313
Change-Id: Ib0593dde88ac33e222c12d0dca6733ef1f1035dc
2018-07-11 11:52:09 -07:00
Michael Pratt 9cd69c2f3d Internal change
PiperOrigin-RevId: 204028082
Change-Id: I4251cce10aace43f9b9a80c36204ef66f1b329df
2018-07-10 15:55:10 -07:00