Commit Graph

2366 Commits

Author SHA1 Message Date
Jamie Liu 492229d017 VFS2 gofer client
Updates #1198

Opening host pipes (by spinning in fdpipe) and host sockets is not yet
complete, and will be done in a future CL.

Major differences from VFS1 gofer client (sentry/fs/gofer), with varying levels
of backportability:

- "Cache policies" are replaced by InteropMode, which controls the behavior of
  timestamps in addition to caching. Under InteropModeExclusive (analogous to
  cacheAll) and InteropModeWritethrough (analogous to cacheAllWritethrough),
  client timestamps are *not* written back to the server (it is not possible in
  9P or Linux for clients to set ctime, so writing back client-authoritative
  timestamps results in incoherence between atime/mtime and ctime). Under
  InteropModeShared (analogous to cacheRemoteRevalidating), client timestamps
  are not used at all (remote filesystem clocks are authoritative). cacheNone
  is translated to InteropModeShared + new option
  filesystemOptions.specialRegularFiles.

- Under InteropModeShared, "unstable attribute" reloading for permission
  checks, lookup, and revalidation are fused, which is feasible in VFS2 since
  gofer.filesystem controls path resolution. This results in a ~33% reduction
  in RPCs for filesystem operations compared to cacheRemoteRevalidating. For
  example, consider stat("/foo/bar/baz") where "/foo/bar/baz" fails
  revalidation, resulting in the instantiation of a new dentry:

  VFS1 RPCs:
  getattr("/")                          // fs.MountNamespace.FindLink() => fs.Inode.CheckPermission() => gofer.inodeOperations.check() => gofer.inodeOperations.UnstableAttr()
  walkgetattr("/", "foo") = fid1        // fs.Dirent.walk() => gofer.session.Revalidate() => gofer.cachePolicy.Revalidate()
  clunk(fid1)
  getattr("/foo")                       // CheckPermission
  walkgetattr("/foo", "bar") = fid2     // Revalidate
  clunk(fid2)
  getattr("/foo/bar")                   // CheckPermission
  walkgetattr("/foo/bar", "baz") = fid3 // Revalidate
  clunk(fid3)
  walkgetattr("/foo/bar", "baz") = fid4 // fs.Dirent.walk() => gofer.inodeOperations.Lookup
  getattr("/foo/bar/baz")               // linux.stat() => gofer.inodeOperations.UnstableAttr()

  VFS2 RPCs:
  getattr("/")                          // gofer.filesystem.walkExistingLocked()
  walkgetattr("/", "foo") = fid1        // gofer.filesystem.stepExistingLocked()
  clunk(fid1)
                                        // No getattr: walkgetattr already updated metadata for permission check
  walkgetattr("/foo", "bar") = fid2
  clunk(fid2)
  walkgetattr("/foo/bar", "baz") = fid3
                                        // No clunk: fid3 used for new gofer.dentry
                                        // No getattr: walkgetattr already updated metadata for stat()

- gofer.filesystem.unlinkAt() does not require instantiation of a dentry that
  represents the file to be deleted. Updates #898.

- gofer.regularFileFD.OnClose() skips Tflushf for regular files under
  InteropModeExclusive, as it's nonsensical to request a remote file flush
  without flushing locally-buffered writes to that remote file first.

- Symlink targets are cached when InteropModeShared is not in effect.

- p9.QID.Path (which is already required to be unique for each file within a
  server, and is accordingly already synthesized from device/inode numbers in
  all known gofers) is used as-is for inode numbers, rather than being mapped
  along with attr.RDev in the client to yet another synthetic inode number.

- Relevant parts of fsutil.CachingInodeOperations are inlined directly into
  gofer package code. This avoids having to duplicate part of its functionality
  in fsutil.HostMappable.
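
As a rough check of the RPC savings described above, the following Go sketch
tallies the two traces listed in this message. The variable and function names
are illustrative, not taken from the gofer package. Reading the ~33% figure as
the per-component cost when revalidation succeeds (3 RPCs in VFS1 versus 2 in
VFS2) is an assumption; the message does not say how the figure was derived.

```go
package main

import "fmt"

// RPC traces transcribed from the commit message for stat("/foo/bar/baz")
// when the last component fails revalidation.
var (
	vfs1Trace = []string{
		"getattr", "walkgetattr", "clunk", // "/" -> "foo"
		"getattr", "walkgetattr", "clunk", // "/foo" -> "bar"
		"getattr", "walkgetattr", "clunk", // "/foo/bar" -> "baz" (revalidation fails)
		"walkgetattr", // lookup for the new dentry
		"getattr",     // stat() itself
	}
	vfs2Trace = []string{
		"getattr",              // root
		"walkgetattr", "clunk", // "foo"
		"walkgetattr", "clunk", // "bar"
		"walkgetattr", // "baz"; the fid is kept for the new dentry
	}
)

// reductionPercent returns the integer percentage by which after is smaller
// than before.
func reductionPercent(before, after int) int {
	return (before - after) * 100 / before
}

func main() {
	fmt.Printf("this trace: %d -> %d RPCs (%d%% fewer)\n",
		len(vfs1Trace), len(vfs2Trace),
		reductionPercent(len(vfs1Trace), len(vfs2Trace)))
	// Assumed steady state: a component that revalidates successfully costs
	// getattr+walkgetattr+clunk in VFS1 and walkgetattr+clunk in VFS2.
	fmt.Printf("per revalidated component: 3 -> 2 RPCs (%d%% fewer)\n",
		reductionPercent(3, 2))
}
```

For this particular trace the saving is larger than 33% because the failed
revalidation costs VFS1 an extra lookup and an extra getattr.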

PiperOrigin-RevId: 293190213
2020-02-04 11:29:22 -08:00
Fabricio Voznika d7cd484091 Add support for sentry internal pipe for gofer mounts
Internal pipes are supported similarly to how internal UDS is done.
It is also controlled by the same flag.

Fixes #1102

PiperOrigin-RevId: 293150045
2020-02-04 08:20:52 -08:00
Andrei Vagin f37e913a35 seccomp: allow to filter syscalls by instruction pointer
PiperOrigin-RevId: 293029446
2020-02-03 16:16:18 -08:00
Brad Burlage 6cd7901d7d Add 1 Kokoro job per runtime test.
PiperOrigin-RevId: 293019326
2020-02-03 15:56:57 -08:00
Ting-Yu Wang e7846e50f2 Reduce run time for //test/syscalls:socket_inet_loopback_test_runsc_ptrace.
* Tests are picked for a shard differently. It now picks one test from each
  block, instead of picking the whole block. This makes the same kind of tests
  spread across different shards.

* Reduce the number of connect() calls in TCPListenClose.
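
The sharding change in the first bullet can be pictured with a small Go
sketch. The test runner's real selection code is not shown in this message, so
both functions below are hypothetical: blockShard models the old behavior
(each shard takes one contiguous block) and interleavedShard models the new
one (each shard takes one test from every block of size total).

```go
package main

import "fmt"

// blockShard returns the contiguous block of tests assigned to shard.
// total must be greater than zero.
func blockShard(tests []string, shard, total int) []string {
	per := (len(tests) + total - 1) / total // block size, rounded up
	start := shard * per
	if start > len(tests) {
		start = len(tests)
	}
	end := start + per
	if end > len(tests) {
		end = len(tests)
	}
	return tests[start:end]
}

// interleavedShard returns every total-th test starting at index shard, so
// neighbouring (similar) tests land on different shards.
func interleavedShard(tests []string, shard, total int) []string {
	var out []string
	for i := shard; i < len(tests); i += total {
		out = append(out, tests[i])
	}
	return out
}

func main() {
	all := []string{"tcp_a", "tcp_b", "udp_a", "udp_b"}
	fmt.Println("block shard 0:      ", blockShard(all, 0, 2))
	fmt.Println("interleaved shard 0:", interleavedShard(all, 0, 2))
}
```

With slow tests clustered next to each other in the list, the interleaved
scheme avoids one shard receiving all of them.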

PiperOrigin-RevId: 293019281
2020-02-03 15:42:21 -08:00
Brad Burlage 80ce7f2537 Tag version_test as noguitar.
PiperOrigin-RevId: 292974323
2020-02-03 12:09:52 -08:00
Eyal Soha 9742daf3c2 Add packetdrill tests that use docker.
PiperOrigin-RevId: 292973224
2020-02-03 12:04:22 -08:00
Michael Pratt 4d1a648c7c Allow mlock in system call filters
Go 1.14 has a workaround for a Linux 5.2-5.4 bug which requires mlock'ing the g
stack to prevent register corruption. We need to allow this syscall until it is
removed from Go.

PiperOrigin-RevId: 292967478
2020-02-03 11:39:51 -08:00
Ian Gudger 02997af5ab Fix method comment to match method name.
PiperOrigin-RevId: 292624867
2020-01-31 15:09:13 -08:00
Adin Scannell 04cccaaeee Fix logic around AMD/Intel cases.
If the support is Ignored, then the call is still executed. We
simply rely on it to fall through to the int3. Therefore, we
must also bail on the vendor check.

PiperOrigin-RevId: 292620558
2020-01-31 14:45:47 -08:00
Dean Deng 6c3072243d Implement file locks for regular tmpfs files in VFSv2.
Add a file lock implementation that can be embedded into various filesystem
implementations.

Updates #1480

PiperOrigin-RevId: 292614758
2020-01-31 14:15:41 -08:00
Ghanan Gowripalan 77bf586db7 Use multicast Ethernet address for multicast NDP
As per RFC 2464 section 7, an IPv6 packet with a multicast destination
address is transmitted to the mapped Ethernet multicast address.

Test:
- ipv6.TestLinkResolution
- stack_test.TestDADResolve
- stack_test.TestRouterSolicitation
PiperOrigin-RevId: 292610529
2020-01-31 13:55:46 -08:00
Ghanan Gowripalan 528dd1ec72 Extract multicast IP to Ethernet address mapping
Test: header.TestEthernetAddressFromMulticastIPAddress
PiperOrigin-RevId: 292604649
2020-01-31 13:25:48 -08:00
gVisor bot bc3a24d627 Internal change.
PiperOrigin-RevId: 292587459
2020-01-31 13:19:42 -08:00
Ting-Yu Wang 7c118f7e19 KVM platform does not support 32bit.
Fixes: //test/syscalls:32bit_test_runsc_kvm
Ref change: 5d569408ef
PiperOrigin-RevId: 292563926
2020-01-31 10:07:45 -08:00
Adin Scannell 14959250fe Simplify testing link rules.
PiperOrigin-RevId: 292458933
2020-01-30 17:49:17 -08:00
gVisor bot af8f6f83a3 Merge pull request #1471 from xiaobo55x:syscall_test
PiperOrigin-RevId: 292445329
2020-01-30 16:12:25 -08:00
Jay Zhuang 9988cf2eef Wrap all GetSocketPairs() in unnamed namespaces
This avoids conflicting definitions of GetSocketPairs() in the outer namespace
when multiple such cc files are compiled for one binary.

PiperOrigin-RevId: 292420885
2020-01-30 14:17:58 -08:00
gVisor bot d62362f63f Merge pull request #1630 from xiaobo55x:kOLargeFile
PiperOrigin-RevId: 292419699
2020-01-30 14:03:22 -08:00
Bhasker Hariharan 4ee64a248e Fix for panic in endpoint.Close().
When sending a RST on shutdown we need to double check the
state after acquiring the work mutex as the endpoint could
have transitioned out of a connected state between the time
we checked it and the time we acquired the workMutex.
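
The pattern described here, checking state again once the lock is held, can be
sketched in Go as follows. The endpoint type, its fields, and the method names
are simplified stand-ins invented for illustration; they are not the tcp
package's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

type connState int

const (
	stateConnected connState = iota
	stateClosed
)

// endpoint is a hypothetical, stripped-down stand-in for a TCP endpoint.
type endpoint struct {
	mu      sync.Mutex // guards state
	workMu  sync.Mutex // stands in for the work mutex
	state   connState
	rstSent int
}

func (e *endpoint) getState() connState {
	e.mu.Lock()
	defer e.mu.Unlock()
	return e.state
}

// resetOnShutdown sends a RST only if the endpoint is still connected after
// the work mutex has been acquired.
func (e *endpoint) resetOnShutdown() bool {
	if e.getState() != stateConnected {
		return false // fast path: nothing to reset
	}
	e.workMu.Lock()
	defer e.workMu.Unlock()
	// The state may have changed while we waited for workMu, so check again.
	if e.getState() != stateConnected {
		return false
	}
	e.rstSent++ // placeholder for actually sending the RST
	return true
}

func main() {
	e := &endpoint{state: stateConnected}
	fmt.Println(e.resetOnShutdown())
}
```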

I added two tests but sadly neither reproduces the panic. I am
going to leave the tests in as they are good to have anyway.

PiperOrigin-RevId: 292393800
2020-01-30 12:00:35 -08:00
gVisor bot 757b2b87fe Merge pull request #1288 from lubinszARM:pr_ring0_6
PiperOrigin-RevId: 292369598
2020-01-30 10:01:31 -08:00
Michael Pratt ede8dfab37 Enforce splice offset limits
Splice must not allow negative offsets. Writes also must not allow offset +
size to overflow int64. Reads are similarly broken, but not just in splice
(b/148095030).
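
A minimal Go sketch of the two checks, written so that the overflow test
itself cannot overflow. The function name and the placeholder error are
assumptions; the message does not say which errno the sentry returns.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// maxOffset is the largest representable file offset.
const maxOffset = math.MaxInt64

// errInvalidRange is a placeholder for whatever errno the real code returns.
var errInvalidRange = errors.New("invalid offset or size")

// checkSpliceRange rejects negative offsets and sizes, and rejects ranges
// where offset+size would overflow int64.
func checkSpliceRange(offset, size int64) error {
	if offset < 0 || size < 0 {
		return errInvalidRange
	}
	// Equivalent to offset+size > maxOffset, without computing the sum.
	if offset > maxOffset-size {
		return errInvalidRange
	}
	return nil
}

func main() {
	fmt.Println(checkSpliceRange(0, 4096))
	fmt.Println(checkSpliceRange(-1, 4096))
	fmt.Println(checkSpliceRange(maxOffset, 1))
}
```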

Reported-by: syzbot+0e1ff0b95fb2859b4190@syzkaller.appspotmail.com
PiperOrigin-RevId: 292361208
2020-01-30 09:14:31 -08:00
Ghanan Gowripalan ec0679737e Do not include the Source Link Layer option with an unspecified source address
When sending NDP messages with an unspecified source address, the Source
Link Layer address must not be included.

Test: stack_test.TestDADResolve
PiperOrigin-RevId: 292341334
2020-01-30 07:12:52 -08:00
Ghanan Gowripalan 6f841c304d Do not spawn a goroutine when calling stack.NDPDispatcher's methods
Do not start a new goroutine when calling
stack.NDPDispatcher.OnDuplicateAddressDetectionStatus.

PiperOrigin-RevId: 292268574
2020-01-29 19:55:00 -08:00
Kevin Krakauer 0ade523f06 Fix iptables tests that were broken by rename.
The name of the runner binary target changed from "runner" to "runner-image",
causing iptables tests to fail.

PiperOrigin-RevId: 292242263
2020-01-29 16:27:12 -08:00
Bhasker Hariharan 51b783505b Add support for TCP_DEFER_ACCEPT.
PiperOrigin-RevId: 292233574
2020-01-29 15:53:45 -08:00
Dean Deng 148fda60e8 Add plumbing for file locks in VFS2.
Updates #1480

PiperOrigin-RevId: 292180192
2020-01-29 11:39:28 -08:00
Andrei Vagin 37bb502670 sentry: rename SetRSEQInterruptedIP to SetOldRSeqInterruptedIP for arm64
For amd64, this has been done on cl/288342928.

PiperOrigin-RevId: 292170856
2020-01-29 10:47:28 -08:00
Jamie Liu 8dcedc953a Add //pkg/sentry/devices/memdev.
PiperOrigin-RevId: 292165063
2020-01-29 10:09:31 -08:00
Bin Lu 6adbdfe232 supporting sError in guest kernel on Arm64
For test case 'TestBounce', we use KVM_SET_VCPU_EVENTS to trigger an sError
in order to leave the guest.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-01-29 07:50:38 -05:00
Dean Deng 4cb55a7a3b Prevent arbitrary size allocation when sending UDS messages.
Currently, Send() will copy data into a new byte slice without regard to the
original size. Size checks should be performed before the allocation takes
place.

Note that for the sake of performance, we avoid putting the buffer
allocation into the critical section. As a result, the size checks need to be
performed again within Enqueue() in case the limit has changed.
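
The two-stage check can be sketched in Go like this. The queue type, the
error value, and the method names are hypothetical simplifications of the
idea, not the unix transport code: the size is checked once before allocating
and once more under the lock, because the limit may change in between.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errMessageTooLong = errors.New("message too long")

// queue is a hypothetical, simplified message queue.
type queue struct {
	mu    sync.Mutex
	limit int // maximum message size in bytes; may change at runtime
	msgs  [][]byte
}

func (q *queue) currentLimit() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return q.limit
}

// send checks the size before allocating, then copies the data outside the
// critical section and hands the copy to enqueue.
func (q *queue) send(data []byte) error {
	if len(data) > q.currentLimit() {
		return errMessageTooLong // rejected before any allocation
	}
	buf := make([]byte, len(data))
	copy(buf, data)
	return q.enqueue(buf)
}

// enqueue repeats the size check under the lock in case the limit changed
// after send looked at it.
func (q *queue) enqueue(buf []byte) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(buf) > q.limit {
		return errMessageTooLong
	}
	q.msgs = append(q.msgs, buf)
	return nil
}

func main() {
	q := &queue{limit: 4}
	fmt.Println(q.send([]byte("abc")))   // fits
	fmt.Println(q.send([]byte("abcde"))) // too long
}
```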

PiperOrigin-RevId: 292058147
2020-01-28 18:46:14 -08:00
Fabricio Voznika 396c574db2 Add support for WritableSource in DynamicBytesFileDescriptionImpl
WritableSource is a convenience interface used for files that can
be written to, e.g. /proc/sys/net/ipv4/tcp_sack. It reads a maximum
of 4KB and only from offset 0, which should cover most cases. It can
be extended as needed.
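
A rough Go sketch of the contract described above. The interface shape, the
names, and the choice to truncate input beyond 4KB (rather than reject it) are
assumptions made for illustration; the real interface is defined in the VFS2
code and is not reproduced in this message.

```go
package main

import (
	"errors"
	"fmt"
)

// maxWriteSize mirrors the 4KB limit mentioned in the commit message.
const maxWriteSize = 4096

var errNonZeroOffset = errors.New("writes are only accepted at offset 0")

// writableSource is a hypothetical stand-in for the convenience interface.
type writableSource interface {
	Write(data []byte) (int, error)
}

// sackToggle is an example source backing a boolean sysctl-style file.
type sackToggle struct{ enabled bool }

func (s *sackToggle) Write(data []byte) (int, error) {
	if len(data) == 0 {
		return 0, nil
	}
	s.enabled = data[0] != '0'
	return len(data), nil
}

// writeFile forwards a write to src, enforcing the offset and size limits.
func writeFile(src writableSource, data []byte, offset int64) (int, error) {
	if offset != 0 {
		return 0, errNonZeroOffset
	}
	if len(data) > maxWriteSize {
		data = data[:maxWriteSize] // assumption: excess input is dropped
	}
	return src.Write(data)
}

func main() {
	s := &sackToggle{}
	n, err := writeFile(s, []byte("1\n"), 0)
	fmt.Println(n, err, s.enabled)
}
```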

Updates #1195

PiperOrigin-RevId: 292056924
2020-01-28 18:31:28 -08:00
Fabricio Voznika 3d046fef06 Changes missing in last submit
Updates #1487
Updates #1623

PiperOrigin-RevId: 292040835
2020-01-28 16:53:55 -08:00
Ghanan Gowripalan 431ff52768 Update link address for senders of Neighbor Solicitations
Update link address for senders of NDP Neighbor Solicitations when the NS
contains an NDP Source Link Layer Address option.

Tests:
- ipv6.TestNeighorSolicitationWithSourceLinkLayerOption
- ipv6.TestNeighorSolicitationWithInvalidSourceLinkLayerOption
PiperOrigin-RevId: 292028553
2020-01-28 15:45:36 -08:00
Fabricio Voznika 437c986c6a Add vfs.FileDescription to FD table
FD table now holds both VFS1 and VFS2 types and uses the correct
one based on what's set.

Parts of this CL are just initial changes (e.g. sys_read.go,
runsc/main.go) to serve as a template for the remaining changes.

Updates #1487
Updates #1623

PiperOrigin-RevId: 292023223
2020-01-28 15:31:03 -08:00
Jamie Liu 2862b0b1be Add //pkg/sentry/fsimpl/devtmpfs.
PiperOrigin-RevId: 292021389
2020-01-28 15:05:24 -08:00
Ghanan Gowripalan ce0bac4be9 Include the NDP Source Link Layer option when sending DAD messages
Test: stack_test.TestDADResolve
PiperOrigin-RevId: 292003124
2020-01-28 13:52:04 -08:00
Andrei Vagin f263801a74 fs/splice: don't report partial errors for special files
Special files can have additional requirements for granularity.
For example, a read from an eventfd returns EINVAL if the size is less than 8 bytes.

Reported-by: syzbot+3905f5493bec08eb7b02@syzkaller.appspotmail.com
PiperOrigin-RevId: 292002926
2020-01-28 13:37:19 -08:00
Jamie Liu 34fbd8446c Add VFS2 support for epoll.
PiperOrigin-RevId: 291997879
2020-01-28 13:11:43 -08:00
Jianfeng Tan d99329e584 netlink: add support for RTM_F_LOOKUP_TABLE
Test command:
  $ ip route get 1.1.1.1

Fixes: #1099

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/1121 from tanjianfeng:fix-1099 e6919f3d4ede5aa51a48b3d2be0d7a4b482dd53d
PiperOrigin-RevId: 291990716
2020-01-28 12:32:59 -08:00
Jamie Liu 1119644080 Implement an anon_inode equivalent for VFS2.
PiperOrigin-RevId: 291986033
2020-01-28 12:08:00 -08:00
Michael Pratt 76483b8b1e Check sigsetsize in rt_sigaction
This isn't in the libc wrapper, but it is in the syscall itself.

Discovered by @xiaobo55x in #1625.

PiperOrigin-RevId: 291973931
2020-01-28 11:26:09 -08:00
Michael Pratt 74e04506a4 Prefer Type& over Type &
And Type* over Type *. This is basically a whitespace only change.

gVisor code already prefers left-alignment of pointers and references, but
clang-format formats for consistency with the majority of a file, and some
files leaned the wrong way. This is a one-time pass to make us completely
conforming.

Autogenerated with:

$ find . \( -name "*.cc" -or -name "*.c" -or -name "*.h" \) \
    | xargs clang-format -i -style="{BasedOnStyle: Google,  \
        DerivePointerAlignment: false, PointerAlignment: Left}"

PiperOrigin-RevId: 291972421
2020-01-28 11:18:17 -08:00
Adin Scannell 5d569408ef Create platform_util for tests.
PiperOrigin-RevId: 291869423
2020-01-27 22:28:43 -08:00
Ghanan Gowripalan 2a2da5be31 Add a type to represent the NDP Source Link Layer Address option
Tests:
- header.TestNDPSourceLinkLayerAddressOptionEthernetAddress
- header.TestNDPSourceLinkLayerAddressOptionSerialize
- header.TestNDPOptionsIterCheck
- header.TestNDPOptionsIter
PiperOrigin-RevId: 291856429
2020-01-27 20:51:28 -08:00
Adin Scannell 5776a7b6f6 Fix header ordering and format all C++ code.
PiperOrigin-RevId: 291844200
2020-01-27 18:27:20 -08:00
gVisor bot db68c85ab7 Merge pull request #1561 from zhangningdlut:chris_tty
PiperOrigin-RevId: 291821850
2020-01-27 16:35:38 -08:00
Adin Scannell 253c9e666c Cleanup glog and add real caller information.
In general, we've learned that logging must be avoided at all
costs in the hot path. It's unlikely that the optimizations
here were significant in any case, since buffer would certainly
escape.

This also adds a test to ensure that the caller identification
works as expected, and so that logging can be benchmarked.

Original:
BenchmarkGoogleLogging-6   	 1222255	       949 ns/op

With this change:
BenchmarkGoogleLogging-6   	  517323	      2346 ns/op

Fixes #184

PiperOrigin-RevId: 291815420
2020-01-27 16:08:35 -08:00
Adin Scannell 0e2f1b7abd Update package locations.
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.

PiperOrigin-RevId: 291811289
2020-01-27 15:31:32 -08:00
gVisor bot 60d7ff73e1 Merge pull request #1676 from majek:marek/FIX-1632-expose-NewPacketConn
PiperOrigin-RevId: 291803499
2020-01-27 15:30:32 -08:00