Commit Graph

1118 Commits

Author SHA1 Message Date
gVisor bot 4aeedd47bf internal BUILD file cleanup.
PiperOrigin-RevId: 270680704
2019-09-23 08:25:13 -07:00
Jamie Liu fb55c2bd0d Change vfs.Dirent.Off to NextOff.
"d_off is the distance from the start of the directory to the start of the next
linux_dirent." - getdents(2).

PiperOrigin-RevId: 270349685
2019-09-20 14:24:29 -07:00
Ian Gudger 002f1d4aae Allow waiting for LinkEndpoint worker goroutines to finish.
Previously, the only safe way to use an fdbased endpoint was to leak the FD.
This change makes it possible to safely close the FD.

This is the first step towards having stoppable stacks.

Updates #837

PiperOrigin-RevId: 270346582
2019-09-20 14:10:02 -07:00
Jianfeng Tan 223481e927 fix set hostname
Previously, when we set hostname:

$ strace hostname abc
...
sethostname("abc", 3) = -1 ENAMETOOLONG (File name too long)
...

According to man 2 sethostname:

"The len argument specifies the number of bytes in name. (Thus, name
does not require a terminating null byte.)"

We wrongly use the CopyStringIn() to check terminating zero byte in
the implementation of sethostname syscall.

To fix this, we use CopyInBytes() instead.

Fixes: #861

Reported-by: chenglang.hy <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-09-20 17:57:25 +00:00
Jianfeng Tan 329b6653ff Implement /proc/net/tcp6
Fixes: #829

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Signed-off-by: Jielong Zhou <jielong.zjl@antfin.com>
2019-09-20 17:20:08 +00:00
Jamie Liu e9af227a61 Fix p9 integration of flipcall.
- Do not call Rread.SetPayload(flipcall packet window) in p9.channel.recv().

- Ignore EINTR from ppoll() in p9.Client.watch().

- Clean up handling of client socket FD lifetimes so that p9.Client.watch()
  never ppoll()s a closed FD.

- Make p9test.Harness.Finish() call clientSocket.Shutdown() instead of
  clientSocket.Close() for the same reason.

- Rework channel reuse to avoid leaking channels in the following case (suppose
  we have two channels):

  sendRecvChannel
    len(channels) == 2 => idx = 1
    inuse[1] = ch0
                                        sendRecvChannel
                                          len(channels) == 1 => idx = 0
                                          inuse[0] = ch1
    inuse[1] = nil
  sendRecvChannel
    len(channels) == 1 => idx = 0
    inuse[0] = ch0
                                          inuse[0] = nil
    inuse[0] == nil => ch0 leaked

- Avoid deadlocking p9.Client.watch() by calling channelsWg.Wait() without
  holding channelsMu.

- Bump p9test:client_test size to medium.

PiperOrigin-RevId: 270200314
2019-09-19 22:52:56 -07:00
Robert Tonic 46beb91912 Fix documentation, clean up seccomp filter installation, rename helpers.
Filter installation has been streamlined and functions renamed. 
Documentation has been fixed to be standards compliant, and missing 
documentation added. gofmt has also been applied to modified files.
2019-09-19 17:10:50 -04:00
Adin Scannell 75781ab3ef Remove defer from hot path and ensure Atomic is applied consistently.
PiperOrigin-RevId: 270114317
2019-09-19 13:39:32 -07:00
gVisor bot 1c0324d5a1 Merge pull request #876 from xiaobo55x:hostcpu
PiperOrigin-RevId: 270094324
2019-09-19 12:03:38 -07:00
Kevin Krakauer 0a8a75f3da Job control: controlling TTYs and foreground process groups.
Adresses a deadlock with the rolled back change:
b6a5b950d2
Creating a session from an orphaned process group was causing a lock to be
acquired twice by a single goroutine. This behavior is addressed, and a test
(OrphanRegression) has been added to pty.cc.

Implemented the following ioctls:
- TIOCSCTTY - set controlling TTY
- TIOCNOTTY - remove controlling tty, maybe signal some other processes
- TIOCGPGRP - get foreground process group. Also enables tcgetpgrp().
- TIOCSPGRP - set foreground process group. Also enabled tcsetpgrp().

Next steps are to actually turn terminal-generated control characters (e.g. C^c)
into signals to the proper process groups, and to send SIGTTOU and SIGTTIN when
appropriate.

PiperOrigin-RevId: 270088599
2019-09-19 11:36:47 -07:00
Hang Su d72c63664b Accelerate byte lookup in string with `bytealg/indexbyte`
`bytealg/indexbyte` will use AVX or SSE instruction set, if possible,
which could accelerate `CopyStringIn` function by 28%.

In worst case(CPU doesn't support SSE), `bytealg/indexbyte`
will degenerate to traversal lookup. When dealing with
short strings, `bytealg/indexbyte` has the same performance level as
before.

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Signed-off-by: Hang Su <darcy.sh@antfin.com>
2019-09-19 22:16:52 +08:00
Haibo Xu cabe10e603 Enable pkg/sentry/hostcpu support on arm64.
Signed-off-by: Haibo Xu haibo.xu@arm.com
Change-Id: I333872da9bdf56ddfa8ab2f034dfc1f36a7d3132
2019-09-18 23:51:42 +00:00
Adin Scannell c98e7f0d19 Signalfd support
Note that the exact semantics for these signalfds are slightly different from
Linux. These signalfds are bound to the process at creation time. Reads, polls,
etc. are all associated with signals directed at that task. In Linux, all
signalfd operations are associated with current, regardless of where the
signalfd originated.

In practice, this should not be an issue given how signalfds are used. In order
to fix this however, we will need to plumb the context through all the event
APIs. This gets complicated really quickly, because the waiter APIs are all
netstack-specific, and not generally exposed to the context.  Probably not
worthwhile fixing immediately.

PiperOrigin-RevId: 269901749
2019-09-18 15:16:42 -07:00
Bin Lu 38bc0b6b6a enable syscalls/linux to support arm64
Signed-off-by: Bin Lu <bin.lu@arm.com>
Change-Id: I45af8a54304f8bb0e248ab15f4e20b173ea9e430
2019-09-18 10:13:06 +00:00
Bin Lu 8e73e2cec5 enable kvm/testutil to support arm64
enable kvm/testutil to support arm64

The Arm64 user-mode execution stat consists of:
1, X0- X30
2, PC, SP, PSTATE
3, TPIDR_EL0, used for TLS
4, V0-V31: 32 128-bit registers for floating point and simd
5, FPSR

Currently, we first try to achieve goals 1 and 2.

This patch provids basic test utils for goals 1 & 2

Signed-off-by: Bin Lu <bin.lu@arm.com>
2019-09-18 09:57:59 +00:00
Ghanan Gowripalan 60fe8719e1 Automated rollback of changelist 268047073
PiperOrigin-RevId: 269658971
2019-09-17 14:47:09 -07:00
Andrei Vagin 3b7119a7c9 platform/ptrace: log exit code for stub processes
PiperOrigin-RevId: 269631877
2019-09-17 12:45:22 -07:00
Ian Gudger 747320a7aa Update remaining users of LinkEndpoints to not refer to them as an ID.
PiperOrigin-RevId: 269614517
2019-09-17 11:31:00 -07:00
Andrei Vagin 239a07aabf gvisor: return ENOTDIR from the unlink syscall
ENOTDIR has to be returned when a component used as a directory in
pathname is not, in  fact,  a directory.

PiperOrigin-RevId: 269037893
2019-09-13 21:44:57 -07:00
Adin Scannell a8834fc555 Update p9 to support flipcall.
PiperOrigin-RevId: 268845090
2019-09-12 23:37:31 -07:00
Adin Scannell 7c6ab6a219 Implement splice methods for pipes and sockets.
This also allows the tee(2) implementation to be enabled, since dup can now be
properly supported via WriteTo.

Note that this change necessitated some minor restructoring with the
fs.FileOperations splice methods. If the *fs.File is passed through directly,
then only public API methods are accessible, which will deadlock immediately
since the locking is already done by fs.Splice. Instead, we pass through an
abstract io.Reader or io.Writer, which elide locks and use the underlying
fs.FileOperations directly.

PiperOrigin-RevId: 268805207
2019-09-12 17:43:27 -07:00
Michael Pratt df5d377521 Remove go_test from go_stateify and go_marshal
They are no-ops, so the standard rule works fine.

PiperOrigin-RevId: 268776264
2019-09-12 15:10:17 -07:00
Ghanan Gowripalan 857940d30d Automated rollback of changelist 268047073
PiperOrigin-RevId: 268757842
2019-09-12 13:52:25 -07:00
Ian Gudger 9dfcd8b09f Fix ephemeral port leak.
Fix a bug where udp.(*endpoint).Disconnect [accessible in gVisor via
epsocket.(*SocketOperations).Connect with AF_UNSPEC] would leak a port
reservation if the socket/endpoint had an ephemeral port assigned to it.

glibc's getaddrinfo uses connect with AF_UNSPEC, causing each call of
getaddrinfo to leak a port. Call getaddrinfo too many times and you run out of
ports (shows up as connect returning EAGAIN and getaddrinfo returning
EAI_NONAME "Name or service not known").

PiperOrigin-RevId: 268071160
2019-09-09 14:02:00 -07:00
Rahat Mahmood 3733b9b893 go_marshal: Implement automatic generation of ABI marshalling code.
This CL implements go_marshal, a code generation utility for
automatically serializing and deserializing ABI structs.

The go_marshal tool automatically generates implementations of the new
marshal interface. Unlike binary.Marshal/Unmarshal, the generated
interface implementations use no runtime reflection, and translates to
a single memcpy for most structs. See go_marshal/README.md for
details.

PiperOrigin-RevId: 268065475
2019-09-09 13:36:39 -07:00
Ghanan Gowripalan a8943325db Join IPv6 all-nodes and solicited-node multicast addresses where appropriate.
The IPv6 all-nodes multicast address will be joined on NIC enable, and the
appropriate IPv6 solicited-node multicast address will be joined when IPv6
addresses are added.

Tests: Test receiving packets destined to the IPv6 link-local all-nodes
multicast address and the IPv6 solicted node address of an added IPv6 address.
PiperOrigin-RevId: 268047073
2019-09-09 12:06:06 -07:00
Ian Gudger fe1f521077 Remove reundant global tcpip.LinkEndpointID.
PiperOrigin-RevId: 267709597
2019-09-06 18:01:14 -07:00
Jamie Liu 9e1cbdf565 Indicate flipcall synchronization to the Go race detector.
Since each Endpoint has a distinct mapping of the packet window, the Go race
detector does not recognize accesses by connected Endpoints to be related. This
means that this change isn't necessary for the Go race detector to accept
accesses of flipcall.Endpoint.Data(), but it *is* necessary for it to accept
accesses to shared variables outside the scope of flipcall that are
synchronized by flipcall.Endpoint state; see updated test for an example.

RaceReleaseMerge is needed (instead of RaceRelease) because calls to
raceBecomeInactive() from *unrelated* Endpoints can occur in any order.
(DowngradableRWMutex.RUnlock() has a similar property: calls to RUnlock() on
the same DowngradableRWMutex from different goroutines can occur in any order.
Remove the TODO asking to explain this now that this is understood.)

PiperOrigin-RevId: 267705325
2019-09-06 17:25:07 -07:00
Nicolas Lacasse 7e94f171f4 Better strace logs for statx.
PiperOrigin-RevId: 267498537
2019-09-05 18:03:53 -07:00
Robert Tonic 4573efe84b Switch from net to unet to open Unix Domain Sockets. 2019-09-05 07:16:36 -04:00
Bhasker Hariharan 3dc3cffb2d Fix RST generation bugs.
There are a few cases addressed by this change

- We no longer generate a RST in response to a RST packet.

- When we receive a RST we cleanup and release all reservations immediately as
  the connection is now aborted.

- An ACK received by a listening socket generates a RST when SYN cookies are not
  in-use. The only reason an ACK should land at the listening socket is if we
  are using SYN cookies otherwise the goroutine for the handshake in progress
  should have gotten the packet and it should never have arrived at the
  listening endpoint.

- Also fixes the error returned when a connection times out due to a
  Keepalive timer expiration from ECONNRESET to a ETIMEDOUT.

PiperOrigin-RevId: 267238427
2019-09-04 14:59:53 -07:00
Chris Kuiper 7bf1d426d5 Handle subnet and broadcast addresses correctly with NIC.subnets
This also renames "subnet" to "addressRange" to avoid any more confusion with
an interface IP's subnet.

Lastly, this also removes the Stack.ContainsSubnet(..) API since it isn't used
by anyone. Plus the same information can be obtained from
Stack.NICAddressRanges().

PiperOrigin-RevId: 267229843
2019-09-04 14:19:32 -07:00
Adin Scannell 67a2ab1438 Impose order on test scripts.
The simple test script has gotten out of control. Shard this script into
different pieces and attempt to impose order on overall test structure. This
change helps lay some of the foundations for future improvements.

 * The runsc/test directories are moved into just test/.
 * The runsc/test/testutil package is split into logical pieces.
 * The scripts/ directory contains new top-level targets.
 * Each test is now responsible for building targets it requires.
 * The install functionality is moved into `runsc` itself for simplicity.
 * The existing kokoro run_tests.sh file now just calls all (can be split).

After this change is merged,  I will create multiple distinct workflows for
Kokoro, one for each of the scripts currently targeted by `run_tests.sh` today,
which should dramatically reduce the time-to-run for the Kokoro tests, and
provides a better foundation for further improvements to the infrastructure.

PiperOrigin-RevId: 267081397
2019-09-03 22:02:43 -07:00
Ghanan Gowripalan 144127e5e1 Validate IPv6 Hop Limit field for received NDP packets
Make sure that NDP packets are only received if their IP header's hop limit
field is set to 255, as per RFC 4861.

PiperOrigin-RevId: 267061457
2019-09-03 18:43:12 -07:00
Bhasker Hariharan 3789c34b22 Make UDP traceroute work.
Adds support to generate Port Unreachable messages for UDP
datagrams received on a port for which there is no valid
endpoint.

Fixes #703

PiperOrigin-RevId: 267034418
2019-09-03 16:01:17 -07:00
Jamie Liu eb94066ef2 Ensure that flipcall.Endpoint.Shutdown() shuts down inactive peers.
PiperOrigin-RevId: 267022978
2019-09-03 15:10:51 -07:00
Haibo Xu fa151e3971 Remove duplicated file in pkg/tcpip/link/rawfile.
The blockingpoll_unsafe.go was copied to blockingpoll_noyield_unsafe.go
during merging commit 7206202bb9. If it still stay here, it would
cause build errors on non-amd64 platform.

ERROR:
pkg/tcpip/link/rawfile/BUILD:5:1:
GoCompilePkg
pkg/tcpip/link/rawfile.a
failed (Exit 1) builder failed: error executing command
bazel-out/host/bin/external/go_sdk/builder compilepkg -sdk
external/go_sdk -installsuffix linux_arm64 -src
pkg/tcpip/link/rawfile/blockingpoll_noyield_unsafe.go -src ...
(remaining 33 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
compilepkg: error running subcommand: exit status 2
pkg/tcpip/link/rawfile/blockingpoll_yield_unsafe.go:35:6:
BlockingPoll redeclared in this block
        previous declaration at
pkg/tcpip/link/rawfile/blockingpoll_unsafe.go:26:78
Target //pkg/tcpip/link/rawfile:rawfile failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 25.531s, Critical Path: 21.08s
INFO: 262 processes: 262 linux-sandbox.
FAILED: Build did NOT complete successfully

Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I4e21f82984225d0aa173de456f7a7c66053a053e
2019-09-02 02:49:41 +00:00
Jamie Liu 0352cf5866 Remove support for non-incremental mapped accounting.
PiperOrigin-RevId: 266496644
2019-08-30 19:06:55 -07:00
Bhasker Hariharan 54bf2e8eff Automated rollback of changelist 261387276
PiperOrigin-RevId: 266491264
2019-08-30 18:15:32 -07:00
Chris Kuiper afbdf2f212 Fix data race accessing referencedNetworkEndpoint.kind
Wrapping "kind" into atomic access functions.

Fixes #789

PiperOrigin-RevId: 266485501
2019-08-30 17:23:53 -07:00
Fabricio Voznika 502c47f7a7 Return correct buffer size for ioctl(socket, FIONREAD)
Ioctl was returning just the buffer size from epsocket.endpoint
and it was not considering data from epsocket.SocketOperations
that was read from the endpoint, but not yet sent to the caller.

PiperOrigin-RevId: 266485461
2019-08-30 17:19:09 -07:00
Rahat Mahmood 863e11ac4d Implement /proc/net/udp.
PiperOrigin-RevId: 266229756
2019-08-29 14:30:41 -07:00
gVisor bot 0789b9cc08 Merge pull request #655 from praveensastry:feature/runsc-ref-chk-leak
PiperOrigin-RevId: 266226714
2019-08-29 14:17:32 -07:00
Jamie Liu 36a8949b2a Add limit_host_fd_translation Gofer mount option.
PiperOrigin-RevId: 266177409
2019-08-29 14:01:03 -07:00
Tamir Duberstein 24ecce5dbf Export generated linkAddrEntryEntry
PiperOrigin-RevId: 266000128
2019-08-28 14:56:33 -07:00
Tamir Duberstein 313c767b00 Populate link address cache at dispatch
This allows the stack to learn remote link addresses on incoming
packets, reducing the need to ARP to send responses.

This also reduces the number of round trips to the system clock,
since that may also prove to be performance-sensitive.

Fixes #739.

PiperOrigin-RevId: 265815816
2019-08-27 18:54:56 -07:00
Michael Pratt 9679f9891f Fix comment typo
PiperOrigin-RevId: 265731735
2019-08-27 11:44:06 -07:00
Fabricio Voznika 8fd89fd7a2 Fix sendfile(2) error code
When output file is in append mode, sendfile(2) should fail
with EINVAL and not EBADF.

Closes #721

PiperOrigin-RevId: 265718958
2019-08-27 10:52:46 -07:00
Fabricio Voznika c39564332b Mount volumes as super user
This used to be the case, but regressed after a recent change.
Also made a few fixes around it and clean up the code a bit.

Closes #720

PiperOrigin-RevId: 265717496
2019-08-27 10:47:16 -07:00
Robert Tonic c319b360d1 First pass at implementing Unix Domain Socket support. No tests.
This commit adds support for detecting the socket file type, connecting 
to a Unix Domain Socket, and providing bidirectional communication 
(without file descriptor transfer support).
2019-08-27 13:08:56 -04:00
Rahat Mahmood 1fdefd41c5 netstack/tcp: Add LastAck transition.
Add missing state transition to LastAck, which should happen when the
endpoint has already recieved a FIN from the remote side, and is
sending its own FIN.

PiperOrigin-RevId: 265568314
2019-08-26 16:39:13 -07:00
Michael Pratt 904b156962 Add support for Intel cache CPUID leafs
This exposes L1, L2, etc. cache sizes, cache line size, etc.

Across S/R, everything except cache line size can differ from the host. This is
because cache line size is critical for correct use of CLFLUSH / CLFLUSHOPT,
but as far as I know, the other cache parameters can only affect performance,
not correctness.

AMD uses different leafs for cache information, which are not yet supported.

fail. There are no known cases of cache line size other than 64 in the fleet.

PiperOrigin-RevId: 265544786
2019-08-26 14:47:05 -07:00
gVisor bot 7206202bb9 Merge pull request #696 from xiaobo55x:tcpip_link
PiperOrigin-RevId: 265534854
2019-08-26 14:03:30 -07:00
Chris Kuiper ac2200b8a9 Prevent a network endpoint to send/rcv if its address was removed
This addresses the problem where an endpoint has its address removed but still
has outstanding references held by routes used in connected TCP/UDP sockets
which prevent the removal of the endpoint.

The fix adds a new "expired" flag to the referenced network endpoint, which is
set when an endpoint has its address removed. Incoming packets are not
delivered to an expired endpoint (unless in promiscuous mode), while sending
outgoing packets triggers an error to the caller (unless in spoofing mode).

In addition, a few helper functions were added to stack_test.go to reduce
code duplications.

PiperOrigin-RevId: 265514326
2019-08-26 12:29:47 -07:00
Tamir Duberstein e75a12e89d Implement fmt.Stringer on Route by value
This is more convenient, since it implements the interface for both
value and pointer.

PiperOrigin-RevId: 265086510
2019-08-23 10:44:11 -07:00
Adin Scannell 761e4bf2fe Ensure yield-equivalent with an already-expired timeout.
PiperOrigin-RevId: 264920977
2019-08-22 14:34:33 -07:00
praveensastry 7672eaae25 Add log prefix for better clarity 2019-08-22 22:52:43 +10:00
Chris Kuiper 8d9276ed56 Support binding to multicast and broadcast addresses
This fixes the issue of not being able to bind to either a multicast or
broadcast address as well as to send and receive data from it. The way to solve
this is to treat these addresses similar to the ANY address and register their
transport endpoint ID with the global stack's demuxer rather than the NIC's.
That way there is no need to require an endpoint with that multicast or
broadcast address. The stack's demuxer is in fact the only correct one to use,
because neither broadcast- nor multicast-bound sockets care which NIC a
packet was received on (for multicast a join is still needed to receive packets
on a NIC).

I also took the liberty of refactoring udp_test.go to consolidate a lot of
duplicate code and make it easier to create repetitive tests that test the same
feature for a variety of packet and socket types. For this purpose I created a
"flowType" that represents two things: 1) the type of packet being sent or
received and 2) the type of socket used for the test. E.g., a "multicastV4in6"
flow represents a V4-mapped multicast packet run through a V6-dual socket.

This allows writing significantly simpler tests. A nice example is testTTL().

PiperOrigin-RevId: 264766909
2019-08-21 22:54:25 -07:00
Tamir Duberstein 573e6e4bba Use tcpip.Subnet in tcpip.Route
This is the first step in replacing some of the redundant types with the
standard library equivalents.

PiperOrigin-RevId: 264706552
2019-08-21 15:31:18 -07:00
Chris Kuiper 7e79ca0225 Add tcpip.Route.String and tcpip.AddressMask.Prefix
PiperOrigin-RevId: 264544163
2019-08-20 23:28:52 -07:00
Zach Koopmans 67d7864f83 Document RWF_HIPRI not implemented for preadv2/pwritev2.
Document limitation of no reasonable implementation for RWF_HIPRI
flag (High Priority Read/Write for block-based file systems).

PiperOrigin-RevId: 264237589
2019-08-19 14:07:44 -07:00
gVisor bot 3ffbdffd7e Internal change.
PiperOrigin-RevId: 264218306
2019-08-19 12:43:22 -07:00
Jianfeng Tan a63f88855f hostinet: fix parsing route netlink message
We wrongly parses output interface as gateway address.
The fix is straightforward.

Fixes #638

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
Change-Id: Ia4bab31f3c238b0278ea57ab22590fad00eaf061
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/684 from tanjianfeng:fix-638 b940e810367ad1273519bfa594f4371bdd293e83
PiperOrigin-RevId: 264211336
2019-08-19 12:10:21 -07:00
Kevin Krakauer bd826092fe Read iptables via sockopts.
PiperOrigin-RevId: 264180125
2019-08-19 10:05:59 -07:00
Andrei Vagin 3e4102b2ea netstack: disconnect an unix socket only if the address family is AF_UNSPEC
Linux allows to call connect for ANY and the zero port.

PiperOrigin-RevId: 263892534
2019-08-16 19:32:14 -07:00
Ayush Ranjan 661b2b9f69 procfs: Migrate seqfile implementations.
Migrates all (except 3) seqfile implementations to the vfs.DynamicBytesSource
interface. There should not be any change in functionality due to this migration
itself.

Please note that the following seqfile implementations have not been migrated:
- /proc/filesystems in proc/filesystems.go
- /proc/[pid]/mountinfo in proc/mounts.go
- /proc/[pid]/mounts in proc/mounts.go
This is because these depend on pending changes in /pkg/senty/vfs.

PiperOrigin-RevId: 263880719
2019-08-16 17:36:42 -07:00
Andrei Vagin 2a1303357c ptrace: detect if a stub process exited unexpectedly
PiperOrigin-RevId: 263880577
2019-08-16 17:33:28 -07:00
Chris Kuiper f7114e0a27 Add subnet checking to NIC.findEndpoint and consolidate with NIC.getRef
This adds the same logic to NIC.findEndpoint that is already done in
NIC.getRef. Since this makes the two functions very similar they were combined
into one with the originals being wrappers.

PiperOrigin-RevId: 263864708
2019-08-16 15:58:58 -07:00
Ayush Ranjan 4bab7d7f08 vfs: Remove vfs.DefaultDirectoryFD from embedding vfs.DefaultFD.
This fixes the implementation ambiguity issues when a filesystem
implementation embeds vfs.DefaultDirectoryFD to its directory FD along
with an internal common fileDescription utility.

For similar reasons also removes FileDescriptionDefaultImpl from
DynamicBytesFileDescriptionImpl.

PiperOrigin-RevId: 263795513
2019-08-16 10:20:11 -07:00
Rahat Mahmood 6cfc76798b Document source and versioning of the TCPInfo struct.
PiperOrigin-RevId: 263637194
2019-08-15 14:05:59 -07:00
Tamir Duberstein fe74bba2bd Don't dereference errors passed to panic()
These errors are always pointers; there's no sense in dereferencing them
in the panic call. Changed one false positive for clarity.

PiperOrigin-RevId: 263611579
2019-08-15 11:58:16 -07:00
Tamir Duberstein 816a9211e9 netstack: move resumption logic into *_state.go
13a98df rearranged some of this code in a way that broke compilation of
the netstack-only export at github.com/google/netstack because
*_state.go files are not included in that export.

This commit moves resumption logic back into *_state.go, fixing the
compilation breakage.

PiperOrigin-RevId: 263601629
2019-08-15 11:13:46 -07:00
Haibo Xu 1b1e39d7a1 Enabling pkg/tcpip/link support on arm64.
Signed-off-by: Haibo Xu haibo.xu@arm.com
Change-Id: Ib6b4aa2db19032e58bf0395f714e6883caee460a
2019-08-15 03:19:30 +00:00
Haibo Xu 52843719ca Rename fdbased/mmap.go to fdbased/mmap_stub.go.
Signed-off-by: Haibo Xu haibo.xu@arm.com
Change-Id: Id4489554b9caa332695df8793d361f8332f6a13b
2019-08-15 03:19:22 +00:00
Haibo Xu 0624858593 Rename rawfile/blockingpoll_unsafe.go to rawfile/blockingpoll_stub_unsafe.go.
Signed-off-by: Haibo Xu haibo.xu@arm.com
Change-Id: I2376e502c1a860d5e624c8a8e3afab5da4c53022
2019-08-15 03:19:14 +00:00
Tamir Duberstein d81d94ac4c Replace uinptr with int64 when returning lengths
This is in accordance with newer parts of the standard library.

PiperOrigin-RevId: 263449916
2019-08-14 16:05:56 -07:00
Tamir Duberstein 69d1414a32 Add tcpip.AddressWithPrefix.String
PiperOrigin-RevId: 263436592
2019-08-14 15:02:14 -07:00
Bhasker Hariharan 570fb1db6b Improve SendMsg performance.
SendMsg before this change would copy all the data over into a
new slice even if the underlying socket could only accept a
small amount of data. This is really inefficient with non-blocking
sockets and under high throughput where large writes could get
ErrWouldBlock or if there was say a timeout associated with the sendmsg()
syscall.

With this change we delay copying bytes in till they are needed and only
copy what can be potentially sent/held in the socket buffer. Reducing
the need to repeatedly copy data over.

Also a minor fix to change state FIN-WAIT-1 when shutdown(..., SHUT_WR) is called
instead of when we transmit the actual FIN. Otherwise the socket could remain in
CONNECTED state even though the user has called shutdown() on the socket.

Updates #627

PiperOrigin-RevId: 263430505
2019-08-14 14:34:27 -07:00
Jamie Liu cee044c2ab Add vfs.DynamicBytesFileDescriptionImpl.
This replaces fs/proc/seqfile for vfs2-based filesystems.

PiperOrigin-RevId: 263254647
2019-08-13 17:54:24 -07:00
Fabricio Voznika 0e907c4298 Fix file mode check in pipeOperations
PiperOrigin-RevId: 263203441
2019-08-13 13:33:33 -07:00
Ian Gudger 072d941e32 Add note to name logging mentioning trace logging should be enabled to debug.
PiperOrigin-RevId: 263194584
2019-08-13 12:49:18 -07:00
Ian Gudger 99bf75a6dc gonet: Replace NewPacketConn with DialUDP.
This better matches the standard library and allows creating connected
PacketConns.

PiperOrigin-RevId: 263187462
2019-08-13 12:11:09 -07:00
Nicolas Lacasse 9769a8eaa4 Handle ENOSPC with a partial write.
Similar to the EPIPE case, we can return the number of bytes written before
ENOSPC was encountered. If the app tries to write more, we can return ENOSPC on
the next write.

PiperOrigin-RevId: 263041648
2019-08-12 17:41:33 -07:00
Rahat Mahmood 691c2f8173 Compute size of struct tcp_info instead of hardcoding it.
PiperOrigin-RevId: 263040624
2019-08-12 17:34:38 -07:00
Ian Gudger eac690e358 Fix netstack build error on non-AMD64.
This stub had the wrong function signature.

PiperOrigin-RevId: 262992682
2019-08-12 13:31:16 -07:00
Andrei Vagin af90e68623 netlink: return an error in nlmsgerr
Now if a process sends an unsupported netlink requests,
an error is returned from the send system call.

The linux kernel works differently in this case. It returns errors in the
nlmsgerr netlink message.

Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com
PiperOrigin-RevId: 262690453
2019-08-09 22:34:54 -07:00
Bhasker Hariharan 5a38eb120a Add congestion control states to sender.
This change just introduces different congestion control states and
ensures the sender.state is updated to reflect the current state
of the connection.

It is not used for any decisions yet but this is required before
algorithms like Eiffel/PRR can be implemented.

Fixes #394

PiperOrigin-RevId: 262638292
2019-08-09 14:50:30 -07:00
Haibo Xu 1c9da886e7 Add initial ptrace stub and syscall support for arm64.
Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I1dbd23bb240cca71d0cc30fc75ca5be28cb4c37c
PiperOrigin-RevId: 262619519
2019-08-09 13:18:11 -07:00
Ayush Ranjan c8961a6cbd ext: Move to pkg/sentry/fsimpl.
fsimpl is the keeper of all filesystem implementations in VFS2.

PiperOrigin-RevId: 262617869
2019-08-09 13:08:28 -07:00
praveensastry 73985c6545 Fix the Stringer for leak mode 2019-08-09 17:13:06 +10:00
Ayush Ranjan 690308111c ext: Benchmark tests.
Added benchmark tests which emulate memfs benchmarks.

Stat benchmarks
BenchmarkVFS2Ext4fsStat/1-12      	10000000	       145 ns/op
BenchmarkVFS2Ext4fsStat/2-12      	10000000	       170 ns/op
BenchmarkVFS2Ext4fsStat/3-12      	10000000	       202 ns/op
BenchmarkVFS2Ext4fsStat/8-12      	 3000000	       374 ns/op
BenchmarkVFS2Ext4fsStat/64-12     	  500000	      2159 ns/op
BenchmarkVFS2Ext4fsStat/100-12    	  300000	      3459 ns/op

BenchmarkVFS1TmpfsStat/1-12       	 5000000	       348 ns/op
BenchmarkVFS1TmpfsStat/2-12       	 3000000	       487 ns/op
BenchmarkVFS1TmpfsStat/3-12       	 2000000	       655 ns/op
BenchmarkVFS1TmpfsStat/8-12       	 1000000	      1365 ns/op
BenchmarkVFS1TmpfsStat/64-12      	  200000	      9565 ns/op
BenchmarkVFS1TmpfsStat/100-12     	  100000	     15158 ns/op

BenchmarkVFS2MemfsStat/1-12       	10000000	       133 ns/op
BenchmarkVFS2MemfsStat/2-12       	10000000	       155 ns/op
BenchmarkVFS2MemfsStat/3-12       	10000000	       182 ns/op
BenchmarkVFS2MemfsStat/8-12       	 5000000	       310 ns/op
BenchmarkVFS2MemfsStat/64-12      	 1000000	      1659 ns/op
BenchmarkVFS2MemfsStat/100-12     	  500000	      2787 ns/op

Mount Stat benchmarks
BenchmarkVFS2ExtfsMountStat/1-12  	 5000000	       245 ns/op
BenchmarkVFS2ExtfsMountStat/2-12  	 5000000	       266 ns/op
BenchmarkVFS2ExtfsMountStat/3-12  	 5000000	       304 ns/op
BenchmarkVFS2ExtfsMountStat/8-12  	 3000000	       456 ns/op
BenchmarkVFS2ExtfsMountStat/64-12 	  500000	      2308 ns/op
BenchmarkVFS2ExtfsMountStat/100-12   300000	      3482 ns/op

BenchmarkVFS1TmpfsMountStat/1-12  	 3000000	       488 ns/op
BenchmarkVFS1TmpfsMountStat/2-12  	 2000000	       658 ns/op
BenchmarkVFS1TmpfsMountStat/3-12  	 2000000	       806 ns/op
BenchmarkVFS1TmpfsMountStat/8-12  	 1000000	      1514 ns/op
BenchmarkVFS1TmpfsMountStat/64-12 	  100000	     10037 ns/op
BenchmarkVFS1TmpfsMountStat/100-12        100000	     15280 ns/op

BenchmarkVFS2MemfsMountStat/1-12           	10000000	       212 ns/op
BenchmarkVFS2MemfsMountStat/2-12           	 5000000	       232 ns/op
BenchmarkVFS2MemfsMountStat/3-12           	 5000000	       264 ns/op
BenchmarkVFS2MemfsMountStat/8-12           	 3000000	       390 ns/op
BenchmarkVFS2MemfsMountStat/64-12          	 1000000	      1813 ns/op
BenchmarkVFS2MemfsMountStat/100-12         	  500000	      2812 ns/op

PiperOrigin-RevId: 262477158
2019-08-08 18:45:37 -07:00
Rahat Mahmood 7bfad8ebb6 Return a well-defined socket address type from socket funtions.
Previously we were representing socket addresses as an interface{},
which allowed any type which could be binary.Marshal()ed to be used as
a socket address. This is fine when the address is passed to userspace
via the linux ABI, but is problematic when used from within the sentry
such as by networking procfs files.

PiperOrigin-RevId: 262460640
2019-08-08 16:50:33 -07:00
Rahat Mahmood 13a98df49e netstack: Don't start endpoint goroutines too soon on restore.
Endpoint protocol goroutines were previously started as part of
loading the endpoint. This is potentially too soon, as resources used
by these goroutine may not have been loaded. Protocol goroutines may
perform meaningful work as soon as they're started (ex: incoming
connect) which can cause them to indirectly access resources that
haven't been loaded yet.

This CL defers resuming all protocol goroutines until the end of
restore.

PiperOrigin-RevId: 262409429
2019-08-08 12:33:11 -07:00
gVisor bot 2e45d1696e Merge pull request #653 from xiaobo55x:dev
PiperOrigin-RevId: 262402929
2019-08-08 11:58:14 -07:00
Jamie Liu 06102af65a memfs fixes.
- Unexport Filesystem/Dentry/Inode.

- Support SEEK_CUR in directoryFD.Seek().

- Hold Filesystem.mu before touching directoryFD.off in
directoryFD.Seek().

- Remove deleted Dentries from their parent directory.childLists.

- Remove invalid FIXMEs.

PiperOrigin-RevId: 262400633
2019-08-08 11:46:38 -07:00
Ayush Ranjan 08cd5e1d36 ext: Seek unit tests.
PiperOrigin-RevId: 262264674
2019-08-07 19:13:41 -07:00
Ayush Ranjan 40d6d8c15b ext: StatAt unit tests.
PiperOrigin-RevId: 262249166
2019-08-07 17:21:00 -07:00
Ayush Ranjan 3b368cabf9 ext: Read unit tests.
PiperOrigin-RevId: 262242410
2019-08-07 16:44:10 -07:00
Ayush Ranjan ad67e5a7a0 ext: IterDirent unit tests.
PiperOrigin-RevId: 262226761
2019-08-07 15:24:33 -07:00
Ayush Ranjan 1c9781a4ed ext: vfs.FileDescriptionImpl and vfs.FilesystemImpl implementations.
- This also gets rid of pipes for now because pipe does not have vfs2 specific
  support yet.
- Added file path resolution logic.
- Fixes testing infrastructure.
- Does not include unit tests yet.

PiperOrigin-RevId: 262213950
2019-08-07 14:23:42 -07:00