1791 Commits

Author SHA1 Message Date
Jay Zhuang 8b461aa36b Remove redundant dep in BUILD
PiperOrigin-RevId: 301859066
2020-03-19 11:34:49 -07:00
Bhasker Hariharan 3a37f67917 Change SocketOperations.readMu to an RWMutex.
Also get rid of the readViewHasData as it's not required anymore.

Updates #231, #357

PiperOrigin-RevId: 301837227
2020-03-19 10:00:31 -07:00
Bhasker Hariharan fd27a917ef Address comments on workMu removal change.
Updates #231, #357

PiperOrigin-RevId: 301833669
2020-03-19 09:43:23 -07:00
Bhasker Hariharan e9e399c25d Remove workMu from tcpip.Endpoint.
workMu is removed and e.mu is now a mutex that supports TryLock.  The packet
processing path tries to lock the mutex and if it's locked it will just queue the
packet and move on. The endpoint.UnlockUser() will process any backlog of
packets before unlocking the socket.

This simplifies the locking inside tcp endpoints a lot. Further the
endpoint.LockUser() implements spinning as long as the lock is not held by
another syscall goroutine. This ensures low latency as not spinning leads to the
task thread being put to sleep if the lock is held by the packet dispatch
path. This is suboptimal as the lower layer rarely holds the lock for long so
implementing spinning here helps.

If the lock is held by another task goroutine then we just proceed to call
LockUser() and the task could be put to sleep.

The protocol goroutines themselves just call e.mu.Lock() and block if the
lock is currently not available.

Updates #231, #357

PiperOrigin-RevId: 301808349
2020-03-19 07:19:58 -07:00
Dean Deng 3a42638a0b Port imported TTY fds to vfs2.
Refactor fs/host.TTYFileOperations so that the relevant functionality can be
shared with VFS2 (fsimpl/host.ttyFD).

Incorporate host.defaultFileFD into the default host.fileDescription. This way,
there is no need for a separate default_file.go. As in vfs1, the TTY file
implementation can be built on top of this default and override operations as
necessary (PRead/Read/PWrite/Write, Release, Ioctl).

Note that these changes still need to be plumbed into runsc, which refers to
imported TTYs in control/proc.go:ExecAsync.

Updates #1672.

PiperOrigin-RevId: 301718157
2020-03-18 19:12:10 -07:00
Andrei Vagin c3cee7f5a4 Deflake third_party/gvisor/pkg/gate/gate_test
TestConcurrentAll executes 1000 goroutines which never sleep,
so they are not preempted by Go's runtime. In Go 1.14, async preemption
was added, but the runtime.Gosched() call added here does no harm
in that case either.

PiperOrigin-RevId: 301705712
2020-03-18 17:42:29 -07:00
gVisor bot a0fed7ea45 Merge pull request #2061 from lubinszARM:pr_restart_syscall
PiperOrigin-RevId: 301700868
2020-03-18 17:11:43 -07:00
Ian Gudger 92a00ca91a Store segment transmit count.
This will aid in segment reordering detection.

Updates #691

PiperOrigin-RevId: 301692638
2020-03-18 16:26:36 -07:00
Fabricio Voznika f1d1af2a4a Fix FDTable.NewFDVFS2
It was looking at VFS1 table to determine where to
allocate the next FD from.

Updates #1035

PiperOrigin-RevId: 301678858
2020-03-18 15:13:42 -07:00
Bhasker Hariharan c29d4fc59e Automated rollback of changelist 301501607
PiperOrigin-RevId: 301578043
2020-03-18 06:36:43 -07:00
Bhasker Hariharan eddd6ce514 Wrap rand.Reader in a bufio.Reader.
rand.Read() results in a syscall to the host on every call. Instead, we can
wrap it with a bufio.Reader to buffer reads and reduce the number of syscalls.
This is especially important for TCP where every newly created endpoint
reads random data to initialize the timestamp offsets for the endpoint.

Updates #231

PiperOrigin-RevId: 301501607
2020-03-17 19:10:53 -07:00
Zach Koopmans 42d78ba61b Remove HostFS from Sentry.
PiperOrigin-RevId: 301402181
2020-03-17 10:30:32 -07:00
Eyal Soha 3192e55ffe Packetimpact in Go with c++ stub
PiperOrigin-RevId: 301382690
2020-03-17 08:53:27 -07:00
Andrei Vagin b55f0e5d40 fdtable: don't try to zap fdtable entry if close is called for non-existing fd
FDTable.setAll is used to zap entries, but it grows the table up to
a specified fd.

Reported-by: syzbot+9e281b0750d2d4caa190@syzkaller.appspotmail.com
PiperOrigin-RevId: 301280000
2020-03-16 18:29:58 -07:00
Fabricio Voznika 2a6c4369be Enforce file size rlimits in VFS2
Updates #1035

PiperOrigin-RevId: 301255357
2020-03-16 16:00:49 -07:00
Fabricio Voznika 0f60799a4f Add calls to vfs.CheckSetStat to fsimpls
Only gofer filesystem was calling vfs.CheckSetStat for
vfs.FilesystemImpl.SetStatAt and vfs.FileDescriptionImpl.SetStat.

Updates #1193, #1672, #1197

PiperOrigin-RevId: 301226522
2020-03-16 13:29:12 -07:00
Ting-Yu Wang 69da42885a Enable ARP resolution in TAP devices.
PiperOrigin-RevId: 301208471
2020-03-16 12:03:27 -07:00
gVisor bot 159a230b9b Merge pull request #1943 from kevinGC:ipt-filter-ip
PiperOrigin-RevId: 301197007
2020-03-16 11:13:14 -07:00
Bhasker Hariharan 52758e16e0 Prevent vnetHdr from escaping in WritePacket.
PiperOrigin-RevId: 301157950
2020-03-16 08:03:27 -07:00
Fabricio Voznika 9712775028 Disallow kernfs.Inode.SetStat for readonly inodes
Updates #1195, #1193

PiperOrigin-RevId: 300950993
2020-03-14 13:48:06 -07:00
Dean Deng 5e413cad10 Plumb VFS2 imported fds into virtual filesystem.
- When setting up the virtual filesystem, mount a host.filesystem to contain
  all files that need to be imported.
- Make read/preadv syscalls to the host in cases where preadv2 may not be
  supported yet (likewise for writing).
- Make save/restore functions in kernel/kernel.go return early if vfs2 is
  enabled.

PiperOrigin-RevId: 300922353
2020-03-14 07:14:33 -07:00
Fabricio Voznika 45a8ae240d Add remaining procfs files
Closes #1195

PiperOrigin-RevId: 300867055
2020-03-13 18:57:07 -07:00
Fabricio Voznika 829beebf0b Panic if file in FDTable has been destroyed
This will give more information about the file to
help identify where the extra DecRef()
might be.

PiperOrigin-RevId: 300855874
2020-03-13 17:18:10 -07:00
Jamie Liu b0f2c3e764 Fix infinite loop in semaphore.sem.wakeWaiters().
PiperOrigin-RevId: 300845134
2020-03-13 16:09:18 -07:00
Michael Pratt 6d4497de25 Fix typo
PiperOrigin-RevId: 300832988
2020-03-13 15:02:42 -07:00
Ghanan Gowripalan 645b1b2e9c Refactor SLAAC address state into SLAAC prefix state
Previously, SLAAC related state was stored on a per-address basis. This was
sufficient for the simple case of a single SLAAC address per prefix, but
future CLs will introduce temporary addresses which will result in multiple
SLAAC addresses for a prefix. This refactor allows storing multiple addresses
for a prefix in a single SLAAC prefix state.

No behaviour changes - existing tests continue to pass.

PiperOrigin-RevId: 300832812
2020-03-13 14:59:19 -07:00
Jamie Liu 1c05352970 Fix oom_score_adj.
- Make oomScoreAdj a ThreadGroup field (Linux: signal_struct::oom_score_adj).

- Avoid deadlock caused by Task.OOMScoreAdj()/SetOOMScoreAdj() locking Task.mu
  and TaskSet.mu in the wrong order (via Task.ExitState()).

PiperOrigin-RevId: 300814698
2020-03-13 13:19:13 -07:00
Ghanan Gowripalan 530a31f3c0 Disable a NIC before removing it
When a NIC is removed, attempt to disable the NIC first to cleanup
dynamic state and stop ongoing periodic tasks (e.g. IPv6 router
solicitations, DAD) so that a removed NIC does not attempt to send
packets.

Tests:
    - stack_test.TestRemoveUnknownNIC
    - stack_test.TestRemoveNIC
    - stack_test.TestDADStop
    - stack_test.TestCleanupNDPState
    - stack_test.TestRouteWithDownNIC
    - stack_test.TestStopStartSolicitingRouters
PiperOrigin-RevId: 300805857
2020-03-13 12:30:16 -07:00
Jamie Liu 86409c9181 Avoid unnecessary work in transportDemuxer.deliverPacket().
- Don't allocate []*endpointsByNic in transportDemuxer.deliverPacket() unless
  actually needed for UDP broadcast/multicast.

- Don't allocate []*endpointsByNic via transportDemuxer.findEndpointLocked()
  => transportDemuxer.findAllEndpointsLocked().

- Skip unnecessary map lookups in transportDemuxer.findEndpointLocked() =>
  transportDemuxer.findAllEndpointsLocked() (now iterEndpointsLocked).

For most deliverable packets other than UDP broadcast/multicast packets, this
saves two slice allocations and three map lookups per packet.

PiperOrigin-RevId: 300804135
2020-03-13 12:22:19 -07:00
Jamie Liu b78cee3bae Fix lock recursion in kernel.ProcessGroup.SendSignal().
PiperOrigin-RevId: 300803515
2020-03-13 12:18:36 -07:00
Dean Deng 2e38408f20 Implement access/faccessat for VFS2.
Note that the raw faccessat system call does not actually take a flags argument;
according to faccessat(2), the glibc wrapper implements the flags by using
fstatat(2). Remove the flag argument that we try to extract from vfs1, which
would just be a garbage value.

Updates #1965
Fixes #2101

PiperOrigin-RevId: 300796067
2020-03-13 11:41:08 -07:00
Ting-Yu Wang f458a325e9 Fix "application exiting with {Code:0 Signo:27}" during boot.
2aa9514a06 skips SIGURG, but later code expects
the sigchans array contains consecutive signal numbers.

PiperOrigin-RevId: 300793450
2020-03-13 11:26:45 -07:00
Ghanan Gowripalan 28d26d2c4f Honour the link's MaxHeaderLength when forwarding
LinkEndpoints may expect/assume that a tcpip.PacketBuffer's Header
has enough capacity for its own headers, as per documentation for
LinkEndpoint.MaxHeaderLength.

Test: stack_test.TestNICForwarding
PiperOrigin-RevId: 300784192
2020-03-13 10:44:23 -07:00
Fabricio Voznika 8f8f16efaf Add support for mount flags
Plumbs MS_NOEXEC and MS_RDONLY. Others are TODO.

Updates #1623 #1193

PiperOrigin-RevId: 300764669
2020-03-13 08:58:04 -07:00
Eyal Soha f693e1334b Clarify comments about IHL in ipv4.go.
PiperOrigin-RevId: 300668506
2020-03-12 18:39:40 -07:00
Zach Koopmans 919664600d Mark gonet_test as flaky.
Mark /pkg/tcpip/adapters/gonet/gonet_test as flaky.

PiperOrigin-RevId: 300609529
2020-03-12 13:11:48 -07:00
Bin Lu 7df936f359 passed the syscall test case 'alarm' on Arm64 platform
This issue was caused by 'restart_syscall'.
The value of register R0 should be stored after finishing sysemu
so that we can restore the value and restart the syscall.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-03-12 05:57:47 -04:00
Tamir Duberstein 035f7434e9 Use a heap in transport demuxer
...instead of sorting at various times. Plug a memory leak by setting
removed elements to nil.

PiperOrigin-RevId: 300471087
2020-03-11 21:13:46 -07:00
Tamir Duberstein ac05043525 Implement heap.Interface on pointer receiver
PiperOrigin-RevId: 300467253
2020-03-11 20:38:05 -07:00
Tamir Duberstein 538e35f61b Fix race condition (*tcp.endpoint).Close
Atomically close the endpoint. Before this change, it was possible for
multiple callers to perform duplicate work.

PiperOrigin-RevId: 300462110
2020-03-11 19:57:25 -07:00
Adin Scannell 61051f2268 Clean-up buffer implementation.
This also adds substantial test cases.

The Read/Write interfaces are dropped as they are not necessary.

PiperOrigin-RevId: 300461547
2020-03-11 19:52:14 -07:00
Bhasker Hariharan 81675b850e Fix memory leak in danglingEndpoints.
Endpoints which were being terminated in an ERROR state or were moved to CLOSED
by the worker goroutine do not run cleanupLocked() as that should already be run
by the worker termination. But when making that change we made the mistake of
not removing the endpoint from the danglingEndpoints which is normally done in
cleanupLocked().

As a result these endpoints are leaked since a reference is held to them in the
danglingEndpoints array forever till Stack is torn down.

PiperOrigin-RevId: 300438426
2020-03-11 17:03:57 -07:00
Andrei Vagin 22d89ef5cb Import "unsafe" in bluepill_arm64_unsafe.go
This fixes a compile time error:
pkg/sentry/platform/kvm/bluepill_arm64_unsafe.go:45:35: undefined: unsafe

PiperOrigin-RevId: 300375687
2020-03-11 12:01:46 -07:00
gVisor bot 2c2622b942 Merge pull request #1975 from nybidari:iptables
PiperOrigin-RevId: 300362789
2020-03-11 11:02:04 -07:00
Andrei Vagin 2aa9514a06 runsc: don't redirect SIGURG which is used by Go's runtime scheduler
Go 1.14+ sends SIGURG to Ms to attempt asynchronous preemption of a G. Since it
can't guarantee that a SIGURG is only related to preemption, it continues to
forward them to signal.Notify (see runtime.sighandler).

When runsc is running a container, there are three processes: a parent process
and two children (sandbox and gofer). The parent process sets a signal handler
for all signals and redirects them to the container init process. This logic
should ignore SIGURG signals. We already ignore them in the Sentry, but it will
be better to not notify about them when this is possible.

PiperOrigin-RevId: 300345286
2020-03-11 09:50:06 -07:00
gVisor bot 7bca09107b Automated rollback of changelist 300217972
PiperOrigin-RevId: 300308974
2020-03-11 06:08:56 -07:00
gVisor bot 24e7005ab6 Merge pull request #1832 from xiaobo55x:tls_ptrace
PiperOrigin-RevId: 300270894
2020-03-11 01:06:19 -07:00
Ghanan Gowripalan f56fe66b13 Honour the link's MaxHeaderLength when forwarding
This change also updates where the IP packet buffer is held in an
outbound tcpip.PacketBuffer from Header to Data. This change removes
unnecessary copying of the IP packet buffer when forwarding.

Test: stack_test.TestNICForwarding
PiperOrigin-RevId: 300217972
2020-03-10 17:52:31 -07:00
gVisor bot d6440ec5a1 The packet forwarding should resolve the link address if necessary.
Fixes #1510

Test:
- stack_test.TestForwardingWithStaticResolver
- stack_test.TestForwardingWithFakeResolver
- stack_test.TestForwardingWithNoResolver
- stack_test.TestForwardingWithFakeResolverPartialTimeout
- stack_test.TestForwardingWithFakeResolverTwoPackets
- stack_test.TestForwardingWithFakeResolverManyPackets
- stack_test.TestForwardingWithFakeResolverManyResolutions
PiperOrigin-RevId: 300182570
2020-03-10 14:50:13 -07:00
Ting-Yu Wang b36de6e7be Move /proc/net to /proc/PID/net, and make /proc/net -> /proc/self/net.
Issue #1833

PiperOrigin-RevId: 299998105
2020-03-09 19:59:09 -07:00
Haibo Xu c04958e2fa Enable thread local storage support on arm64.
Linux uses the task.thread.uw.tp_value field to store the
TLS pointer on arm64 platform, and we use a similar way
in gvisor to store it in the arch/State struct.

Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: Ie76b5c6d109bc27ccfd594008a96753806db7764
2020-03-09 01:04:55 +00:00
Dean Deng 228813fd26 Update comments and debug level for profiling options.
PiperOrigin-RevId: 299448307
2020-03-06 15:23:46 -08:00
Dean Deng 960f6a975b Add plumbing for importing fds in VFS2, along with non-socket, non-TTY impl.
In VFS2, imported file descriptors are stored in a kernfs-based filesystem.
Upon calling ImportFD, the host fd can be accessed in two ways:
1. a FileDescription that can be added to the FDTable, and
2. a Dentry in the host.filesystem mount, which we will want to access through
magic symlinks in /proc/[pid]/fd/.

An implementation of the kernfs.Inode interface stores a unique host fd. This
inode can be inserted into file descriptions as well as dentries.

This change also plumbs in three FileDescriptionImpls corresponding to fds for
sockets, TTYs, and other files (only the latter is implemented here).
These implementations will mostly make corresponding syscalls to the host.
Where possible, the logic is ported over from pkg/sentry/fs/host.

Updates #1672

PiperOrigin-RevId: 299417263
2020-03-06 12:59:49 -08:00
Tamir Duberstein 6fa5cee82c Prevent memory leaks in ilist
When list elements are removed from a list but not discarded, it becomes
important to invalidate the references they hold to their former
neighbors to prevent memory leaks.

PiperOrigin-RevId: 299412421
2020-03-06 12:31:43 -08:00
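
The leak described in 6fa5cee82c is easy to see in a small intrusive list. The sketch below uses hypothetical entry and list types, not the generated ilist code: Remove unlinks the element and then clears its own next/prev pointers, so a removed element that is kept around no longer pins its former neighbors.

```go
package main

import "fmt"

// entry is an intrusive list element (hypothetical, for illustration).
type entry struct {
	next, prev *entry
	val        int
}

// list is a minimal doubly linked list.
type list struct {
	head, tail *entry
}

// PushBack appends e to the end of the list.
func (l *list) PushBack(e *entry) {
	e.next = nil
	e.prev = l.tail
	if l.tail != nil {
		l.tail.next = e
	} else {
		l.head = e
	}
	l.tail = e
}

// Remove unlinks e and invalidates its references to former neighbors.
func (l *list) Remove(e *entry) {
	if e.prev != nil {
		e.prev.next = e.next
	} else {
		l.head = e.next
	}
	if e.next != nil {
		e.next.prev = e.prev
	} else {
		l.tail = e.prev
	}
	e.next = nil
	e.prev = nil
}

func main() {
	l := &list{}
	a, b, c := &entry{val: 1}, &entry{val: 2}, &entry{val: 3}
	l.PushBack(a)
	l.PushBack(b)
	l.PushBack(c)
	l.Remove(b)
	fmt.Println(b.next == nil, b.prev == nil, a.next == c)
}
```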
gVisor bot 18d41cf153 Merge pull request #1963 from xiaobo55x:kvm_common
PiperOrigin-RevId: 299405855
2020-03-06 12:05:30 -08:00
gVisor bot 56c4272568 Merge pull request #1946 from xiaobo55x:dieTramp
PiperOrigin-RevId: 299405663
2020-03-06 12:01:23 -08:00
Eyal Soha d5dbe366bf shutdown(s, SHUT_WR) in TIME-WAIT returns ENOTCONN
From RFC 793 s3.9 p61 Event Processing:

CLOSE Call during TIME-WAIT: return with "error: connection closing"

Fixes #1603

PiperOrigin-RevId: 299401353
2020-03-06 11:42:34 -08:00
Ghanan Gowripalan f50d9a31e9 Specify the source of outgoing NDP RS
If the NIC has a valid IPv6 address assigned, use it as the
source address for outgoing NDP Router Solicitation packets.

Test: stack_test.TestRouterSolicitation
PiperOrigin-RevId: 299398763
2020-03-06 11:33:28 -08:00
Nayana Bidari 1e8c0bcedb Add nat table support for iptables. 2020-03-06 09:25:32 -08:00
Ghanan Gowripalan d6f5e71df2 Get strings for stack.DHCPv6ConfigurationFromNDPRA
Useful for logs to print the string representation of the value
instead of the integer value.

PiperOrigin-RevId: 299356847
2020-03-06 08:02:45 -08:00
Ian Lewis da48fc6cca Stub oom_score_adj and oom_score.
Adds an oom_score_adj and oom_score proc file stub. oom_score_adj accepts
writes of values -1000 to 1000 and persists the value with the task. New tasks
inherit the parent's oom_score_adj.

oom_score is a read-only stub that always returns the value '0'.

Issue #202

PiperOrigin-RevId: 299245355
2020-03-05 18:23:01 -08:00
Ting-Yu Wang 9b64b658c1 Fix S/R on inet.Namespace.
PiperOrigin-RevId: 299238067
2020-03-05 17:40:18 -08:00
gVisor bot 6367963c14 Merge pull request #1951 from moricho:moricho/add-profiler-option
PiperOrigin-RevId: 299233818
2020-03-05 17:16:54 -08:00
Ian Gudger 9b3aad33c4 Use a pool of arrays to avoid slice headers from escaping in TCP options pool.
By putting slices into the pool, the slice header escapes. This can be avoided
by not putting the slice header into the pool.

This removes an allocation from the TCP segment send path.

PiperOrigin-RevId: 299215480
2020-03-05 15:56:42 -08:00
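
The technique named in 9b3aad33c4 can be sketched with sync.Pool. Putting a []byte into a pool converts the slice header to an interface value, which forces the header to be heap-allocated; pooling a pointer to a fixed-size array avoids that. The names below (optionsPool, getOptions, putOptions) are hypothetical, and the 40-byte size is simply the maximum length of the TCP options field.

```go
package main

import (
	"fmt"
	"sync"
)

// maxOptionLen is the maximum size of the TCP options field in bytes.
const maxOptionLen = 40

// optionsPool stores *[maxOptionLen]byte, not []byte, so Put and Get move
// only a pointer and no slice header has to escape to the heap.
var optionsPool = sync.Pool{
	New: func() any { return new([maxOptionLen]byte) },
}

func getOptions() *[maxOptionLen]byte  { return optionsPool.Get().(*[maxOptionLen]byte) }
func putOptions(b *[maxOptionLen]byte) { optionsPool.Put(b) }

func main() {
	buf := getOptions()
	// Slice the pooled array locally; the slice header stays on the stack.
	opts := append(buf[:0], 2, 4, 0x05, 0xb4) // MSS option: kind 2, length 4, value 1460
	fmt.Println(len(opts), cap(opts))
	putOptions(buf)
}
```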
Andrei Vagin 80b40bbb06 tests: Don't print log messages on stdout
A parser of test results doesn't expect to see any extra messages.

PiperOrigin-RevId: 298966577
2020-03-04 16:16:35 -08:00
Jamie Liu a690b57624 Ensure that safemem.BlockSeqOf(safemem.Block{}) produces an empty BlockSeq.
PiperOrigin-RevId: 298941855
2020-03-04 14:30:27 -08:00
Fabricio Voznika 122d47aed1 Update cached file size when cache is skipped
gofer.dentryReadWriter.WriteFromBlocks was not updating
gofer.dentry.size after a write operation that skips the
cache.

Updates #1198

PiperOrigin-RevId: 298708646
2020-03-03 15:29:13 -08:00
Tamir Duberstein 371abe00f0 Avoid memory leaks
Properly discard segments from the segment heap.

PiperOrigin-RevId: 298704074
2020-03-03 15:07:09 -08:00
Andrei Vagin 277a0d5a1f platform/ptrace: don't call probeSeccomp on arm64
The support of PTRACE_SYSEMU on arm64 was added in the 5.3 kernel,
so we can be sure that the current version is higher than 5.3.

And this change moves vsyscall seccomp rules to the arch specific file,
because vsyscall isn't supported on arm64.

PiperOrigin-RevId: 298696493
2020-03-03 14:35:42 -08:00
Tamir Duberstein 844e4d284c Extract local variables for readability
PiperOrigin-RevId: 298690552
2020-03-03 14:11:01 -08:00
Ian Gudger c15b8515eb Fix datarace on TransportEndpointInfo.ID and clean up semantics.
Ensures that all access to TransportEndpointInfo.ID is either:
* In a function ending in a Locked suffix.
* While holding the appropriate mutex.

This primarily affects the checkV4Mapped method on affected endpoints, which has
been renamed to checkV4MappedLocked. Also document the method and change its
argument to be a value instead of a pointer which had caused some awkwardness.

This race was possible in the udp and icmp endpoints between Connect and uses
of TransportEndpointInfo.ID including in both itself and Bind.

The tcp endpoint did not suffer from this bug, but benefited from better
documentation.

Updates #357

PiperOrigin-RevId: 298682913
2020-03-03 13:42:13 -08:00
Nayana Bidari 43abb24657 Fix panic caused by invalid address for Bind in packet sockets.
PiperOrigin-RevId: 298476533
2020-03-02 16:31:52 -08:00
Bhasker Hariharan 3310175250 Fix data-race when reading/writing e.amss.
PiperOrigin-RevId: 298451319
2020-03-02 14:45:03 -08:00
Ghanan Gowripalan 8821a7104f Do not read-lock NIC recursively
A deadlock may occur if a write lock on a RWMutex is blocked between
nested read lock attempts as the inner read lock attempt will be
blocked in this scenario.

Example (T1 and T2 are different goroutines):
  T1: obtain read-lock
  T2: attempt write-lock (blocks)
  T1: attempt inner/nested read-lock (blocks)

Here we can see that T1 and T2 are deadlocked.

Tests: Existing tests pass.
PiperOrigin-RevId: 298426678
2020-03-02 13:16:10 -08:00
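
A common way to avoid the nested read lock described in 8821a7104f is to split each operation into an exported method that takes the lock and an "RLocked" helper that assumes it is already held. The sketch below uses a hypothetical nic type, not gVisor's stack.NIC. The sync.RWMutex documentation states that a blocked Lock call excludes new readers, which is why recursive read locking is prohibited.

```go
package main

import (
	"fmt"
	"sync"
)

// nic is a hypothetical type for illustration.
type nic struct {
	mu    sync.RWMutex
	addrs []string
}

// AddAddr takes the write lock.
func (n *nic) AddAddr(a string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.addrs = append(n.addrs, a)
}

// hasAddrRLocked requires the caller to hold n.mu for reading or writing.
func (n *nic) hasAddrRLocked(a string) bool {
	for _, x := range n.addrs {
		if x == a {
			return true
		}
	}
	return false
}

// HasAddr takes the read lock exactly once.
func (n *nic) HasAddr(a string) bool {
	n.mu.RLock()
	defer n.mu.RUnlock()
	return n.hasAddrRLocked(a)
}

// CountKnown already holds the read lock, so it calls the RLocked helper
// instead of HasAddr. Calling HasAddr here would nest RLock and could
// deadlock if a writer queued up between the two acquisitions.
func (n *nic) CountKnown(candidates []string) int {
	n.mu.RLock()
	defer n.mu.RUnlock()
	count := 0
	for _, c := range candidates {
		if n.hasAddrRLocked(c) {
			count++
		}
	}
	return count
}

func main() {
	n := &nic{}
	n.AddAddr("fe80::1")
	n.AddAddr("10.0.0.1")
	fmt.Println(n.CountKnown([]string{"fe80::1", "192.168.0.1", "10.0.0.1"}))
}
```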
gVisor bot f03e19d575 Merge pull request #1885 from avagin:arm64-pcids
PiperOrigin-RevId: 298405064
2020-03-02 11:42:04 -08:00
Andrei Vagin 42fb7d3491 socket: take readMu to access readView
DATA RACE in netstack.(*SocketOperations).fetchReadView

Write at 0x00c001dca138 by goroutine 1001:
  gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).fetchReadView()
      pkg/sentry/socket/netstack/netstack.go:418 +0x85
  gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).coalescingRead()
      pkg/sentry/socket/netstack/netstack.go:2309 +0x67
  gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).nonBlockingRead()
      pkg/sentry/socket/netstack/netstack.go:2378 +0x183d

Previous read at 0x00c001dca138 by goroutine 1111:
  gvisor.dev/gvisor/pkg/sentry/socket/netstack.(*SocketOperations).Ioctl()
      pkg/sentry/socket/netstack/netstack.go:2666 +0x533
  gvisor.dev/gvisor/pkg/sentry/syscalls/linux.Ioctl()

Reported-by: syzbot+d4c3885fcc346f08deb6@syzkaller.appspotmail.com
PiperOrigin-RevId: 298387377
2020-03-02 10:33:15 -08:00
Michael Pratt 62bd3ca8a3 Take write lock when removing xattr
PiperOrigin-RevId: 298380654
2020-03-02 10:07:13 -08:00
gVisor bot 3d9ddeb339 Merge pull request #1929 from avagin:arm64-cpuid
PiperOrigin-RevId: 297982488
2020-02-28 18:47:17 -08:00
Andrei Vagin ab7ecdd66d watchdog: print panic error message before other messages
This is needed for syzkaller to properly classify issues.

Right now, all watchdog issues are duped to one with the
subject "panic: Sentry detected stuck task(s). See stack
trace and message above for more details".

PiperOrigin-RevId: 297975363
2020-02-28 17:54:36 -08:00
Andrei Vagin 413a9b7fdc Define CPUIDInstruction for arm64
There is no cpuid instruction on arm64, so we need to define it
just to avoid a compile time error.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-28 17:07:01 -08:00
Andrei Vagin 837cf62551 pcids.go isn't arch-specific
Signed-off-by: Andrei Vagin <avagin@gmail.com>
2020-02-28 14:34:13 -08:00
Adin Scannell 463f4217d1 Make pipe buffer implementation standard.
A follow-up change will convert the networking code to use this standard
pipe implementation.

PiperOrigin-RevId: 297903206
2020-02-28 12:29:23 -08:00
Ting-Yu Wang 6b4d36e325 Hide /dev/net/tun when using hostinet.
/dev/net/tun does not currently work with hostinet. This has caused some
programs to fail at startup because they think the feature exists.

PiperOrigin-RevId: 297876196
2020-02-28 10:39:12 -08:00
Fabricio Voznika 0f8a9e3623 Change dup2 call to dup3
We changed syscalls to allow dup3 for ARM64.

Updates #1198

PiperOrigin-RevId: 297870816
2020-02-28 10:15:20 -08:00
Nayana Bidari af6fab6514 Add nat table support for iptables.
- Fix review comments.
2020-02-28 10:00:38 -08:00
Ian Gudger c6bdc6b05b Fix a race in TCP endpoint teardown and teardown the stack in tcp_test.
Call stack.Close on stacks when we are done with them in tcp_test. This avoids
leaking resources and reduces the test's flakiness when race/gotsan is enabled.
It also provides test coverage for the race also fixed in this change, which
can be reliably triggered with the stack.Close change (and without the other
changes) when race/gotsan is enabled.

The race was possible when calling Abort (via stack.Close) on an endpoint
processing a SYN segment as part of a passive connect.

Updates #1564

PiperOrigin-RevId: 297685432
2020-02-27 14:15:44 -08:00
gVisor bot d9ee81183f Merge of a369c88c0c
PiperOrigin-RevId: 297674924
2020-02-27 13:34:23 -08:00
Nayana Bidari abf7ebcd38 Internal change.
PiperOrigin-RevId: 297638665
2020-02-27 11:00:41 -08:00
Bin Lu 5f0e8e6239 Prepare the vcpu environment for sentry on Arm64
Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-02-27 01:19:28 -05:00
Rahat Mahmood 8fb84f78ad Fix construct of linux.Stat for arm64.
PiperOrigin-RevId: 297494373
2020-02-26 19:29:27 -08:00
gVisor bot 6ddeb35ed4 Merge pull request #1912 from lubinszARM:pr_kvm_build
PiperOrigin-RevId: 297492004
2020-02-26 19:09:45 -08:00
Nayana Bidari 9fccf98c0d Fix merge conflicts. 2020-02-26 13:18:35 -08:00
Kevin Krakauer 408979e619 iptables: filter by IP address (and range)
Enables commands such as:
$ iptables -A INPUT -d 127.0.0.1 -j ACCEPT
$ iptables -t nat -A PREROUTING ! -d 127.0.0.1 -j REDIRECT

Also adds a bunch of REDIRECT+destination tests.
2020-02-26 11:04:00 -08:00
moricho d8ed784311 add profile option 2020-02-26 16:49:51 +09:00
Jamie Liu a92087f0f8 Add VFS.NewDisconnectedMount().
Analogous to Linux's kern_mount().

PiperOrigin-RevId: 297259580
2020-02-25 19:13:30 -08:00
Adin Scannell fba479b3c7 Fix DATA RACE in fs.MayDelete.
MayDelete must lock the directory also, otherwise concurrent renames may
race. Note that this also changes the methods to be aligned with the actual
Remove and RemoveDirectory methods to minimize confusion when reading the
code. (It was hard to see that resolution was correct.)

PiperOrigin-RevId: 297258304
2020-02-25 19:04:15 -08:00
Haibo Xu 73201f4c57 Code Clean: Move arch independent codes to common file in kvm pkg.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: Iefbdf53e8e8d6d23ae75d8a2ff0d2a6e71f414d8
2020-02-26 01:51:31 +00:00
gVisor bot 813b1b0486 Merge pull request #1271 from lubinszARM:pr_ring0_1
PiperOrigin-RevId: 297230721
2020-02-25 16:24:43 -08:00
Ian Gudger 87288b26a1 Add netlink sockopt logging to strace.
PiperOrigin-RevId: 297220008
2020-02-25 15:35:24 -08:00
nybidari 818abc2bd5 Merge branch 'master' into iptables 2020-02-25 15:33:59 -08:00
Ghanan Gowripalan 5f1f9dd9d2 Use link-local source address for link-local multicast
Tests:
- header_test.TestIsV6LinkLocalMulticastAddress
- header_test.TestScopeForIPv6Address
- stack_test.TestIPv6SourceAddressSelectionScopeAndSameAddress
PiperOrigin-RevId: 297215576
2020-02-25 15:16:16 -08:00
Nayana Bidari acc405ba60 Add nat table support for iptables.
- commit the changes for the comments.
2020-02-25 15:03:51 -08:00
Fabricio Voznika 72e3f3a3ee Add option to skip stuck tasks waiting for address space
PiperOrigin-RevId: 297192390
2020-02-25 13:44:18 -08:00
gVisor bot 430992a67a Merge pull request #1816 from xiaobo55x:trap_flag
PiperOrigin-RevId: 297191168
2020-02-25 13:41:05 -08:00
Jamie Liu 471b15b212 Port most syscalls to VFS2.
pipe and pipe2 aren't ported, pending a slight rework of pipe FDs for VFS2.
mount and umount2 aren't ported out of temporary laziness. access and faccessat
need additional FSImpl methods to implement properly, but are stubbed to
prevent googletest from CHECK-failing. Other syscalls require additional
plumbing.

Updates #1623

PiperOrigin-RevId: 297188448
2020-02-25 13:37:34 -08:00
Adin Scannell 6def8ea6ac Fix nested logging.
PiperOrigin-RevId: 297175316
2020-02-25 12:25:38 -08:00
Adin Scannell 98b693e61b Don't acquire contended lock with the OS thread locked.
Fixes #1049

PiperOrigin-RevId: 297175164
2020-02-25 12:22:29 -08:00
Adin Scannell 53504e29ca Fix mount refcount issue.
Each mount holds a reference on a root Dirent, but the mount itself may
live beyond its own reference. This means that a call to Root() can come
after the associated reference has been dropped.

Instead of introducing a separate layer of references for mount objects,
we simply change the Root() method to use TryIncRef() and allow it to return
nil if the mount is already gone. This requires updating a small number of
callers and minimizes the change (since VFSv2 will replace this code shortly).

PiperOrigin-RevId: 297174230
2020-02-25 12:17:52 -08:00
Bhasker Hariharan d7b7379251 Deflake TestCurrentConnectedIncrement.
TestCurrentConnectedIncrement fails consistently under gotsan because the sleep
before checking metrics is exactly as long as the TIME-WAIT duration. Under gotsan
things can be slow enough that the increment check runs before the protocol
goroutine has run its cleanup after the TIME-WAIT timer expires.

Increasing the sleep from 1s to 1.2s makes the test pass consistently.

PiperOrigin-RevId: 297160181
2020-02-25 11:19:34 -08:00
Haibo Xu 93e0c37529 Enable bluepill dieTrampoline operation on arm64.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I9e1bf2513c23bdd8c387e5b3c874c6ad3ca9aab0
2020-02-25 01:50:58 +00:00
Ian Gudger c37b196455 Add support for tearing down protocol dispatchers and TIME_WAIT endpoints.
Protocol dispatchers were previously leaked. Bypassing TIME_WAIT is required to
test this change.

Also fix a race when a socket in SYN-RCVD is closed. This is also required to
test this change.

PiperOrigin-RevId: 296922548
2020-02-24 10:32:17 -08:00
Ting-Yu Wang b8f56c79be Implement tap/tun device in vfs.
PiperOrigin-RevId: 296526279
2020-02-21 15:42:56 -08:00
Ghanan Gowripalan a155a23480 Attach LinkEndpoint to NetworkDispatcher immediately
Tests: stack_test.TestAttachToLinkEndpointImmediately
PiperOrigin-RevId: 296474068
2020-02-21 11:21:23 -08:00
Ghanan Gowripalan 97c07242c3 Use Route.MaxHeaderLength when constructing NDP RS
Test: stack_test.TestRouterSolicitation
PiperOrigin-RevId: 296454766
2020-02-21 09:54:55 -08:00
gVisor bot 4a73bae269 Initial network namespace support.
TCP/IP will work with netstack networking. hostinet doesn't work, and sockets
will behave the same as they do now.

Until userspace is able to create devices, the default loopback device can
be used for testing.

/proc/net and /sys/net will still be connected to the root network stack; this
is the same as the current behavior.

Issue #1833

PiperOrigin-RevId: 296309389
2020-02-20 15:20:40 -08:00
gVisor bot 67b615b86f Support disabling a NIC
- Disabled NICs will have their associated NDP state cleared.
- Disabled NICs will not accept incoming packets.
- Writes through a Route with a disabled NIC will return an invalid
  endpoint state error.
- stack.Stack.FindRoute will not return a route with a disabled NIC.
- NIC's Running flag will report the NIC's enabled status.

Tests:
- stack_test.TestDisableUnknownNIC
- stack_test.TestDisabledNICsNICInfoAndCheckNIC
- stack_test.TestRoutesWithDisabledNIC
- stack_test.TestRouteWritePacketWithDisabledNIC
- stack_test.TestStopStartSolicitingRouters
- stack_test.TestCleanupNDPState
- stack_test.TestAddRemoveIPv4BroadcastAddressOnNICEnableDisable
- stack_test.TestJoinLeaveAllNodesMulticastOnNICEnableDisable
PiperOrigin-RevId: 296298588
2020-02-20 14:32:49 -08:00
gVisor bot d90d71474f Remove bytes read/written from marshal.Marshallable API.
Users of the API only care about whether the copy in/out succeeds in
their entirety, which is already signalled by the returned error.

PiperOrigin-RevId: 296297843
2020-02-20 14:29:26 -08:00
gVisor bot 9bad87339a Better strace logging for epoll syscalls.
Example:

epoll_ctl(0x3 anon_inode:[eventpoll], EPOLL_CTL_ADD, 0x6 anon_inode:[eventfd], 0x7efe2fd92a80 {events=EPOLLIN|EPOLLOUT data=0x10203040506070a}) = 0x0 (4.411µs)

epoll_wait(0x3 anon_inode:[eventpoll], 0x7efe2fd92b50 {{events=EPOLLOUT data=0x102030405060708}{events=EPOLLOUT data=0x102030405060708}{events=EPOLLOUT data=0x102030405060708}}, 0x3, 0xffffffff) = 0x3 (29.891µs)

PiperOrigin-RevId: 296258146
2020-02-20 11:31:00 -08:00
gVisor bot 9a4e3e63ef Re-add atomicbitops_arm64.s to BUILD.
This was inadvertently dropped by cl/295811743.

PiperOrigin-RevId: 296254482
2020-02-20 11:16:08 -08:00
gVisor bot 10ed60e477 VFS2: Support memory mapping in tmpfs.
tmpfs.fileDescription now implements ConfigureMMap. And tmpfs.regularFile
implement memmap.Mappable. The methods are mostly unchanged from VFS1 tmpfs.

PiperOrigin-RevId: 296234557
2020-02-20 09:58:10 -08:00
Bin Lu a369c88c0c Lazy-fpsimd support patch series#1: add Arm64-fpsimd support to arch module
This patch defines the structures and
adds the implementations for fpsimd initialization.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-02-20 07:46:30 -05:00
Bin Lu de68e1d8c4 Code Clean:Move getUserRegisters into dieArchSetup() and other small changes.
Consistent with QEMU, getUserRegisters() should be an arch-specific
function. So, it should be called in dieArchSetup().

With this patch and the pagetable/pcid patch, the kvm modules on Arm64 can be
built successfully.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-02-20 06:43:27 +00:00
gVisor bot 2daa21e4d7 Internal change.
PiperOrigin-RevId: 296088213
2020-02-19 16:48:57 -08:00
Kevin Krakauer 92d2d78876 Fix mis-named comment.
2020-02-18 21:20:41 -08:00
gVisor bot 56fd9504aa Enable IPV6_RECVTCLASS socket option for datagram sockets
Added the ability to get/set the IPV6_RECVTCLASS socket option on UDP endpoints.
If enabled, the traffic class from the incoming Network Header is passed as
ancillary data in the ControlMessages.

Adding Get/SetSockOptBool to decrease the overhead of getting/setting simple
options. (This was absorbed in a CL that will be landing before this one).

Test:
* Added unit test to udp_test.go that tests getting/setting as well as
verifying that we receive expected TOS from incoming packet.
* Added a syscall test for verifying getting/setting
* Removed test skip for existing syscall test to enable end to end test.
PiperOrigin-RevId: 295840218
2020-02-18 15:45:36 -08:00
gVisor bot 55c553ae8c Add //pkg/syncevent.
Package syncevent is intended to subsume ~all uses of channels in the sentry
(including //pkg/waiter), as well as //pkg/sleep.

Compared to channels:

- Delivery of events to a syncevent.Receiver allows *synchronous* execution of
  an arbitrary callback, whereas delivery of events to a channel requires a
  goroutine to receive from that channel, resulting in substantial scheduling
  overhead. (This is also part of the motivation for the waiter package.)

- syncevent.Waiter can wait on multiple event sources without the high O(N)
  overhead of select. (This is the same motivation as for the sleep package.)

Compared to the waiter package:

- syncevent.Waiters are intended to be persistent (i.e. per-kernel.Task), and
  syncevent.Broadcaster (analogous to waiter.Queue) is a hash table rather than
  a linked list, such that blocking is (usually) allocation-free.

- syncevent.Source (analogous to waiter.Waitable) does not include an equivalent
  to waiter.Waitable.Readiness(), since this is inappropriate for transient
  events (see e.g. //pkg/sentry/kernel/time.ClockEventSource).

Compared to the sleep package:

- syncevent events are represented by bits in a bitmask rather than discrete
  sleep.Waker objects, reducing overhead and making it feasible to broadcast
  events to multiple syncevent.Receivers.

- syncevent.Receiver invokes an arbitrary callback, which is required by the
  sentry's epoll implementation. (syncevent.Waiter, which is analogous to
  sleep.Sleeper, pairs a syncevent.Receiver with a callback that wakes a
  waiting goroutine; the implementation of this aspect is nearly identical to
  that of sleep.Sleeper, except that it represents *runtime.g as unsafe.Pointer
  rather than uintptr.)

- syncevent.Waiter.Wait (analogous to sleep.Sleeper.Fetch(block=true)) does not
  automatically un-assert returned events. This is useful in cases where the
  path for handling an event is not the same as the path that observes it, such
  as for application signals (a la Linux's TIF_SIGPENDING).

- Unlike sleep.Sleeper, which Fetches Wakers in the order that they were
  Asserted, the event bitmasks used by syncevent.Receiver have no way of
  preserving event arrival order. (This is similar to select, which goes out of
  its way to randomize event ordering.)

The disadvantage of the syncevent package is that, since events are represented
by bits in a uint64 bitmask, each syncevent.Receiver can "only" multiplex
between 64 distinct events; this does not affect any known use case.
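
The bitmask representation described above can be sketched roughly as
follows. This is not the syncevent API: the receiver type, its methods and the
event names are hypothetical, chosen only to show events as bits in a uint64,
a callback that runs synchronously on the notifying goroutine, and events that
stay asserted until they are explicitly acknowledged.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Set is a bitmask of events; each of its 64 bits names one distinct event.
type Set uint64

// Hypothetical event bits for illustration.
const (
	evSignal Set = 1 << iota
	evTimer
)

// receiver is a hypothetical stand-in, not syncevent.Receiver.
type receiver struct {
	pending uint64          // asserted events that have not been acknowledged
	onEvent func(fresh Set) // runs synchronously on the notifying goroutine
}

// notify asserts the events in es. The callback only sees bits that were not
// already pending, so a redundant notification is a cheap no-op.
func (r *receiver) notify(es Set) {
	for {
		old := atomic.LoadUint64(&r.pending)
		if old|uint64(es) == old {
			return // every event in es is already pending
		}
		if atomic.CompareAndSwapUint64(&r.pending, old, old|uint64(es)) {
			r.onEvent(es &^ Set(old))
			return
		}
	}
}

// ack un-asserts the events in es; observing events never clears them.
func (r *receiver) ack(es Set) {
	for {
		old := atomic.LoadUint64(&r.pending)
		if atomic.CompareAndSwapUint64(&r.pending, old, old&^uint64(es)) {
			return
		}
	}
}

// pendingSet returns the currently asserted events.
func (r *receiver) pendingSet() Set {
	return Set(atomic.LoadUint64(&r.pending))
}

func main() {
	calls := 0
	r := &receiver{onEvent: func(fresh Set) { calls++ }}
	r.notify(evSignal)
	r.notify(evSignal) // redundant, callback is not invoked again
	r.notify(evSignal | evTimer)
	fmt.Println("callbacks:", calls, "pending:", r.pendingSet())
	r.ack(evSignal)
	fmt.Println("pending after ack:", r.pendingSet())
}
```

Because the pending state is a single word, arrival order between events is
not recorded, which matches the ordering caveat above.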

Benchmarks:

BenchmarkBroadcasterSubscribeUnsubscribe
BenchmarkBroadcasterSubscribeUnsubscribe-12         	45133884	        26.3 ns/op
BenchmarkMapSubscribeUnsubscribe
BenchmarkMapSubscribeUnsubscribe-12                 	28504662	        41.8 ns/op
BenchmarkQueueSubscribeUnsubscribe
BenchmarkQueueSubscribeUnsubscribe-12               	22747668	        45.6 ns/op
BenchmarkBroadcasterSubscribeUnsubscribeBatch
BenchmarkBroadcasterSubscribeUnsubscribeBatch-12    	31609177	        37.8 ns/op
BenchmarkMapSubscribeUnsubscribeBatch
BenchmarkMapSubscribeUnsubscribeBatch-12            	17563906	        62.1 ns/op
BenchmarkQueueSubscribeUnsubscribeBatch
BenchmarkQueueSubscribeUnsubscribeBatch-12          	26248838	        46.6 ns/op
BenchmarkBroadcasterBroadcastRedundant
BenchmarkBroadcasterBroadcastRedundant/0
BenchmarkBroadcasterBroadcastRedundant/0-12         	100907563	        11.8 ns/op
BenchmarkBroadcasterBroadcastRedundant/1
BenchmarkBroadcasterBroadcastRedundant/1-12         	85103068	        13.3 ns/op
BenchmarkBroadcasterBroadcastRedundant/4
BenchmarkBroadcasterBroadcastRedundant/4-12         	52716502	        22.3 ns/op
BenchmarkBroadcasterBroadcastRedundant/16
BenchmarkBroadcasterBroadcastRedundant/16-12        	20278165	        58.7 ns/op
BenchmarkBroadcasterBroadcastRedundant/64
BenchmarkBroadcasterBroadcastRedundant/64-12        	 5905428	       205 ns/op
BenchmarkMapBroadcastRedundant
BenchmarkMapBroadcastRedundant/0
BenchmarkMapBroadcastRedundant/0-12                 	87532734	        13.5 ns/op
BenchmarkMapBroadcastRedundant/1
BenchmarkMapBroadcastRedundant/1-12                 	28488411	        36.3 ns/op
BenchmarkMapBroadcastRedundant/4
BenchmarkMapBroadcastRedundant/4-12                 	19628920	        60.9 ns/op
BenchmarkMapBroadcastRedundant/16
BenchmarkMapBroadcastRedundant/16-12                	 6026980	       192 ns/op
BenchmarkMapBroadcastRedundant/64
BenchmarkMapBroadcastRedundant/64-12                	 1640858	       754 ns/op
BenchmarkQueueBroadcastRedundant
BenchmarkQueueBroadcastRedundant/0
BenchmarkQueueBroadcastRedundant/0-12               	96904807	        12.0 ns/op
BenchmarkQueueBroadcastRedundant/1
BenchmarkQueueBroadcastRedundant/1-12               	73521873	        16.3 ns/op
BenchmarkQueueBroadcastRedundant/4
BenchmarkQueueBroadcastRedundant/4-12               	39209468	        31.2 ns/op
BenchmarkQueueBroadcastRedundant/16
BenchmarkQueueBroadcastRedundant/16-12              	10810058	       105 ns/op
BenchmarkQueueBroadcastRedundant/64
BenchmarkQueueBroadcastRedundant/64-12              	 2998046	       376 ns/op
BenchmarkBroadcasterBroadcastAck
BenchmarkBroadcasterBroadcastAck/1
BenchmarkBroadcasterBroadcastAck/1-12               	44472397	        26.4 ns/op
BenchmarkBroadcasterBroadcastAck/4
BenchmarkBroadcasterBroadcastAck/4-12               	17653509	        69.7 ns/op
BenchmarkBroadcasterBroadcastAck/16
BenchmarkBroadcasterBroadcastAck/16-12              	 4082617	       260 ns/op
BenchmarkBroadcasterBroadcastAck/64
BenchmarkBroadcasterBroadcastAck/64-12              	 1220534	      1027 ns/op
BenchmarkMapBroadcastAck
BenchmarkMapBroadcastAck/1
BenchmarkMapBroadcastAck/1-12                       	26760705	        44.2 ns/op
BenchmarkMapBroadcastAck/4
BenchmarkMapBroadcastAck/4-12                       	11495636	       100 ns/op
BenchmarkMapBroadcastAck/16
BenchmarkMapBroadcastAck/16-12                      	 2937590	       343 ns/op
BenchmarkMapBroadcastAck/64
BenchmarkMapBroadcastAck/64-12                      	  861037	      1344 ns/op
BenchmarkQueueBroadcastAck
BenchmarkQueueBroadcastAck/1
BenchmarkQueueBroadcastAck/1-12                     	19832679	        55.0 ns/op
BenchmarkQueueBroadcastAck/4
BenchmarkQueueBroadcastAck/4-12                     	 5618214	       189 ns/op
BenchmarkQueueBroadcastAck/16
BenchmarkQueueBroadcastAck/16-12                    	 1569980	       713 ns/op
BenchmarkQueueBroadcastAck/64
BenchmarkQueueBroadcastAck/64-12                    	  437672	      2814 ns/op
BenchmarkWaiterNotifyRedundant
BenchmarkWaiterNotifyRedundant-12                   	650823090	         1.96 ns/op
BenchmarkSleeperNotifyRedundant
BenchmarkSleeperNotifyRedundant-12                  	619871544	         1.61 ns/op
BenchmarkChannelNotifyRedundant
BenchmarkChannelNotifyRedundant-12                  	298903778	         3.67 ns/op
BenchmarkWaiterNotifyWaitAck
BenchmarkWaiterNotifyWaitAck-12                     	68358360	        17.8 ns/op
BenchmarkSleeperNotifyWaitAck
BenchmarkSleeperNotifyWaitAck-12                    	25044883	        41.2 ns/op
BenchmarkChannelNotifyWaitAck
BenchmarkChannelNotifyWaitAck-12                    	29572404	        40.2 ns/op
BenchmarkSleeperMultiNotifyWaitAck
BenchmarkSleeperMultiNotifyWaitAck-12               	16122969	        73.8 ns/op
BenchmarkWaiterTempNotifyWaitAck
BenchmarkWaiterTempNotifyWaitAck-12                 	46111489	        25.8 ns/op
BenchmarkSleeperTempNotifyWaitAck
BenchmarkSleeperTempNotifyWaitAck-12                	15541882	        73.6 ns/op
BenchmarkWaiterNotifyWaitMultiAck
BenchmarkWaiterNotifyWaitMultiAck-12                	65878500	        18.2 ns/op
BenchmarkSleeperNotifyWaitMultiAck
BenchmarkSleeperNotifyWaitMultiAck-12               	28798623	        41.5 ns/op
BenchmarkChannelNotifyWaitMultiAck
BenchmarkChannelNotifyWaitMultiAck-12               	11308468	       101 ns/op
BenchmarkWaiterNotifyAsyncWaitAck
BenchmarkWaiterNotifyAsyncWaitAck-12                	 2475387	       492 ns/op
BenchmarkSleeperNotifyAsyncWaitAck
BenchmarkSleeperNotifyAsyncWaitAck-12               	 2184507	       518 ns/op
BenchmarkChannelNotifyAsyncWaitAck
BenchmarkChannelNotifyAsyncWaitAck-12               	 2120365	       562 ns/op
BenchmarkWaiterNotifyAsyncWaitMultiAck
BenchmarkWaiterNotifyAsyncWaitMultiAck-12           	 2351247	       494 ns/op
BenchmarkSleeperNotifyAsyncWaitMultiAck
BenchmarkSleeperNotifyAsyncWaitMultiAck-12          	 2205799	       522 ns/op
BenchmarkChannelNotifyAsyncWaitMultiAck
BenchmarkChannelNotifyAsyncWaitMultiAck-12          	 1238079	       928 ns/op

Updates #1074

PiperOrigin-RevId: 295834087
2020-02-18 15:18:48 -08:00
gVisor bot a3582de618 cpuid: cache the maximum size of xsave state
perf shows that ExtendedStateSize consumes more than 20% of cpu:

    23.61%    23.61%  [.] pkg/cpuid/cpuid.HostID
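
One way to cache such a value is to compute it on first use and reuse it
afterwards. The sketch below assumes hypothetical names
(probeExtendedStateSize, extendedStateSize) and an arbitrary placeholder size;
it does not claim to reproduce how the cpuid package stores the result.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	sizeOnce   sync.Once
	cachedSize uint32
	probeCalls int // counts how often the expensive probe runs
)

// probeExtendedStateSize stands in for the expensive CPUID query. The
// returned value is an arbitrary placeholder, not a real measurement.
func probeExtendedStateSize() uint32 {
	probeCalls++
	return 832
}

// extendedStateSize returns the cached size, probing only on first use.
func extendedStateSize() uint32 {
	sizeOnce.Do(func() { cachedSize = probeExtendedStateSize() })
	return cachedSize
}

func main() {
	for i := 0; i < 3; i++ {
		fmt.Println(extendedStateSize())
	}
	fmt.Println("probe calls:", probeCalls)
}
```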

PiperOrigin-RevId: 295813263
2020-02-18 13:50:07 -08:00
gVisor bot 906eb6295d atomicbitops package cleanups
- Redocument memory ordering from "no ordering" to "acquire-release". (No
  functional change: both LOCK WHATEVER on x86, and LDAXR/STLXR loops on ARM64,
  already have this property.)

- Remove IncUnlessZeroInt32 and DecUnlessOneInt32, which were only faster than
  the equivalent loops using sync/atomic before the Go compiler inlined
  non-unsafe.Pointer atomics many releases ago.
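
For reference, the equivalent loops using sync/atomic look roughly like this.
The lowercase function names are placeholders, and the semantics are assumed
from the removed helpers' names: increment unless the value is zero, decrement
unless the value is one, and report whether the update happened.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// incUnlessZero increments *addr and returns true, unless the value is zero,
// in which case it leaves the value unchanged and returns false.
func incUnlessZero(addr *int32) bool {
	for {
		v := atomic.LoadInt32(addr)
		if v == 0 {
			return false
		}
		if atomic.CompareAndSwapInt32(addr, v, v+1) {
			return true
		}
		// Another goroutine changed the value; reload and retry.
	}
}

// decUnlessOne decrements *addr and returns true, unless the value is one,
// in which case it leaves the value unchanged and returns false.
func decUnlessOne(addr *int32) bool {
	for {
		v := atomic.LoadInt32(addr)
		if v == 1 {
			return false
		}
		if atomic.CompareAndSwapInt32(addr, v, v-1) {
			return true
		}
	}
}

func main() {
	refs := int32(1)
	fmt.Println(incUnlessZero(&refs), atomic.LoadInt32(&refs))
	fmt.Println(decUnlessOne(&refs), atomic.LoadInt32(&refs))
	fmt.Println(decUnlessOne(&refs), atomic.LoadInt32(&refs))
}
```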

PiperOrigin-RevId: 295811743
2020-02-18 13:43:28 -08:00
gVisor bot 7fdb609b3e Merge pull request #1850 from kevinGC:jump2
PiperOrigin-RevId: 295785052
2020-02-18 11:41:54 -08:00
Nayana Bidari b30b7f3422 Add nat table support for iptables.
Add nat table support for Prerouting hook with Redirect option.
Add tests to check redirect of ports.
2020-02-18 11:30:42 -08:00
gVisor bot fae3de21af ring0/pagetables: fix typo
PiperOrigin-RevId: 295770717
2020-02-18 10:50:46 -08:00
gVisor bot a5069f820f Remove linux.EpollEvent.Fd.
glibc defines struct epoll_event in such a way that epoll_event.data.fd exists.
However, the kernel's definition of struct epoll_event makes epoll_event.data
an opaque uint64, so naming half of it "fd" just introduces confusion. Remove
the Fd field, and make Data a [2]int32 to compensate.
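
A sketch of what treating Data as two 32-bit halves looks like, assuming a
little-endian layout in which Data[0] holds the low half. The epollEvent type
and the helper functions are hypothetical and only illustrate that the pair
round-trips an opaque 64-bit value without naming either half "fd".

```go
package main

import "fmt"

// epollEvent is a hypothetical mirror of the described layout, not the
// linux.EpollEvent definition itself.
type epollEvent struct {
	Events uint32
	Data   [2]int32 // opaque 64 bits, stored as two 32-bit halves
}

// u64ToData splits v into low and high halves (little-endian order assumed).
func u64ToData(v uint64) [2]int32 {
	return [2]int32{int32(uint32(v)), int32(uint32(v >> 32))}
}

// dataToU64 reassembles the opaque value from its two halves.
func dataToU64(d [2]int32) uint64 {
	return uint64(uint32(d[0])) | uint64(uint32(d[1]))<<32
}

func main() {
	ev := epollEvent{Events: 0x1, Data: u64ToData(0x0102030405060708)}
	fmt.Printf("data=%#x\n", dataToU64(ev.Data))
}
```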

Also add required padding to linux.EpollEvent on ARM64.

PiperOrigin-RevId: 295250424
2020-02-14 16:19:48 -08:00
gVisor bot 5baf9dc2fb Synchronize signalling with S/R
This is to fix a data race between sending an external signal to
a ThreadGroup and kernel saving state for S/R.

PiperOrigin-RevId: 295244281
2020-02-14 15:49:09 -08:00
gVisor bot 3557b26651 Allow vfs.IterDirentsCallback.Handle() to return an error.
This is easier than storing errors from e.g. CopyOut in the callback.
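
As an illustration of the pattern, with hypothetical types rather than the vfs
package's own: when Handle returns an error, the iteration can stop at the
first failure and hand the error straight back to the caller, so the callback
no longer needs a field to stash it in.

```go
package main

import (
	"errors"
	"fmt"
)

// errBufferFull is a hypothetical error a callback might return.
var errBufferFull = errors.New("output buffer full")

// dirent and direntCallback are hypothetical stand-ins for the vfs types.
type dirent struct{ Name string }

type direntCallback interface {
	Handle(d dirent) error
}

// callbackFunc adapts a plain function to direntCallback.
type callbackFunc func(dirent) error

func (f callbackFunc) Handle(d dirent) error { return f(d) }

// iterDirents stops at the first error and propagates it to the caller.
func iterDirents(ds []dirent, cb direntCallback) error {
	for _, d := range ds {
		if err := cb.Handle(d); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	room := 2 // pretend the destination buffer holds two entries
	err := iterDirents(
		[]dirent{{"a"}, {"b"}, {"c"}},
		callbackFunc(func(d dirent) error {
			if room == 0 {
				return errBufferFull
			}
			room--
			fmt.Println("copied", d.Name)
			return nil
		}),
	)
	fmt.Println("result:", err)
}
```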

PiperOrigin-RevId: 295230021
2020-02-14 14:40:35 -08:00
gVisor bot 87bc2834c9 Enable automated marshalling for RSeqCriticalSection.
PiperOrigin-RevId: 295226468
2020-02-14 14:24:27 -08:00
gVisor bot e4c7f3e6f6 Inline vfs.VirtualFilesystem in Kernel struct
This saves one pointer dereference per VFS access.

Updates #1623

PiperOrigin-RevId: 295216176
2020-02-14 13:40:39 -08:00
gVisor bot 50c493193b Un-export p9 message encode/decode functions.
These are not used outside of the p9 package.

PiperOrigin-RevId: 295200052
2020-02-14 12:23:10 -08:00
gVisor bot 3c26f5ecb0 Enable automated marshalling for struct stat.
This requires fixing a few build issues for non-amd64 platforms.

PiperOrigin-RevId: 295196922
2020-02-14 12:08:12 -08:00
gVisor bot 4075de11be Plumb VFS2 inside the Sentry
- Added fsbridge package with interface that can be used to open
  and read from VFS1 and VFS2 files.
- Converted ELF loader to use fsbridge
- Added VFS2 types to FSContext
- Added vfs.MountNamespace to ThreadGroup

Updates #1623

PiperOrigin-RevId: 295183950
2020-02-14 11:12:47 -08:00
gVisor bot b2e86906ea Fix various issues related to enabling go-marshal.
- Add missing build tags to files in the abi package.

- Add the marshal package as a sentry dependency, allowed by deps_test.

- Fix an issue with our top-level go_library BUILD rule, which
  incorrectly shadows the variable containing the input set of source
  files. This caused the expansion for the go_marshal clause to
  silently omit input files.

- Fix formatting when copying build tags to gomarshal-generated files.

- Fix a bug with import statement collision detection in go-marshal.

PiperOrigin-RevId: 295112284
2020-02-14 03:27:34 -08:00
Bin Lu ebaf29abeb passed the kvm test case of "TestKernelSyscall" on Arm64
For kvm test case "TestKernelSyscall",
redpill/syscall(-1) in guest kernel level will be trapped in el1_svc.
And in el1_svc, we use mmio_exit to leave the guest.

Signed-off-by: Bin Lu <bin.lu@arm.com>
2020-02-14 02:04:44 -05:00
gVisor bot a6024f7f5f Add FileExec flag to OpenOptions
This allows callers to say whether the file is being
opened to be executed, so that the proper checks can
be done from FilesystemImpl.OpenAt()

Updates #1623

PiperOrigin-RevId: 295042595
2020-02-13 17:57:36 -08:00
Kevin Krakauer 6ef63cd7da We can now create and jump in iptables. For example:
$ iptables -N foochain
$ iptables -A INPUT -j foochain
2020-02-13 17:02:50 -08:00
gVisor bot 16308b9dc1 Merge pull request #1791 from kevinGC:uchains
PiperOrigin-RevId: 294957297
2020-02-13 11:19:09 -08:00
gVisor bot 69bf39e8a4 Internal change.
PiperOrigin-RevId: 294952610
2020-02-13 10:59:52 -08:00
Haibo Xu d30a884775 Add definition of arch.ARMTrapFlag.
Fixes #1708

Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: Ib15768692ead17c81c06f7666ca3f0a14064c3a0
2020-02-13 00:25:16 +00:00
Kevin Krakauer 6fdf2c53a1 iptables: User chains
- Adds creation of user chains via `-N <chainname>`
- Adds `-j RETURN` support for built-in chains, which triggers the
  chain's underflow rule (usually the default policy).
- Adds tests for chain creation, default policies, and `-j RETURN` from
  built-in chains.
2020-02-12 15:02:47 -08:00
gVisor bot 5205bc7e58 Simplify atomic operations
PiperOrigin-RevId: 294582802
2020-02-11 20:37:01 -08:00
gVisor bot 6dced977ea Ensure fsimpl/gofer.dentryPlatformFile.hostFileMapper is initialized.
Fixes #1812. (The more direct cause of the deadlock is panic unsafety because
the historically high cost of defer means that we avoid it in hot paths,
including much of MM; defer is much cheaper as of Go 1.14, but still a
measurable overhead.)

PiperOrigin-RevId: 294560316
2020-02-11 17:38:57 -08:00
gVisor bot b8e22e241c Disallow duplicate NIC names.
PiperOrigin-RevId: 294500858
2020-02-11 12:59:11 -08:00