Commit Graph

92 Commits

Author SHA1 Message Date
Nayana Bidari b6d165fe98 Automated rollback of changelist 329526153
PiperOrigin-RevId: 332097286
2020-09-16 15:06:55 -07:00
Rahat Mahmood 3ca73841d7 Move the 'marshal' and 'primitive' packages to the 'pkg' directory.
PiperOrigin-RevId: 331256608
2020-09-11 17:42:49 -07:00
Bhasker Hariharan b69352245a Fix Accept to not return error for sockets in accept queue.
Accept on gVisor will return an error if a socket in the accept queue was closed
before Accept() was called. Linux will return the new fd even if the returned
socket is already closed by the peer say due to a RST being sent by the peer.

This seems to be intentional in linux more details on the github issue.

Fixes #3780

PiperOrigin-RevId: 329828404
2020-09-02 18:21:47 -07:00
Nayana Bidari 0eae08bc9e Automated rollback of changelist 328350576
PiperOrigin-RevId: 329526153
2020-09-01 09:54:55 -07:00
Ghanan Gowripalan 6f8fb7e0db Improve type safety for socket options
The existing implementation for {G,S}etSockOpt take arguments of an
empty interface type which all types (implicitly) implement; any
type may be passed to the functions.

This change introduces marker interfaces for socket options that may be
set or queried which socket option types implement to ensure that invalid
types are caught at compile time. Different interfaces are used to allow
the compiler to enforce read-only or set-only socket options.

Fixes #3714.

RELNOTES: n/a
PiperOrigin-RevId: 328832161
2020-08-27 15:46:44 -07:00
Ghanan Gowripalan dc81eb9c37 Add function to get error from a tcpip.Endpoint
In an upcoming CL, socket option types are made to implement a marker
interface with pointer receivers. Since this results in calling methods
of an interface with a pointer, we incur an allocation when attempting
to get an Endpoint's last error with the current implementation.

When calling the method of an interface, the compiler is unable to
determine what the interface implementation does with the pointer
(since calling a method on an interface uses virtual dispatch at runtime
so the compiler does not know what the interface method will do) so it
allocates on the heap to be safe incase an implementation continues to
hold the pointer after the functioon returns (the reference escapes the
scope of the object).

In the example below, the compiler does not know what b.foo does with
the reference to a it allocates a on the heap as the reference to a may
escape the scope of a.
```
var a int
var b someInterface
b.foo(&a)
```

This change removes the opportunity for that allocation.

RELNOTES: n/a
PiperOrigin-RevId: 328796559
2020-08-27 12:50:19 -07:00
Dean Deng df3c105f49 Use new reference count utility throughout gvisor.
This uses the refs_vfs2 template in vfs2 as well as objects common to vfs1 and
vfs2. Note that vfs1-only refcounts are not replaced, since vfs1 will be deleted
soon anyway.

The following structs now use the new tool, with leak check enabled:
devpts:rootInode
fuse:inode
kernfs:Dentry
kernfs:dir
kernfs:readonlyDir
kernfs:StaticDirectory
proc:fdDirInode
proc:fdInfoDirInode
proc:subtasksInode
proc:taskInode
proc:tasksInode
vfs:FileDescription
vfs:MountNamespace
vfs:Filesystem
sys:dir
kernel:FSContext
kernel:ProcessGroup
kernel:Session
shm:Shm
mm:aioMappable
mm:SpecialMappable
transport:queue

And the following use the template, but because they currently are not leak
checked, a TODO is left instead of enabling leak check in this patch:
kernel:FDTable
tun:tunEndpoint

Updates #1486.

PiperOrigin-RevId: 328460377
2020-08-25 21:04:04 -07:00
Kevin Krakauer 3cba0a41d9 remove iptables sockopt special cases
iptables sockopts were kludged into an unnecessary check, this properly
relegates them to the {get,set}SockOptIP functions.

PiperOrigin-RevId: 328395135
2020-08-25 13:43:26 -07:00
Nayana Bidari b26f7503b5 Support SO_LINGER socket option.
When SO_LINGER option is enabled, the close will not return until all the
queued messages are sent and acknowledged for the socket or linger timeout is
reached. If the option is not set, close will return immediately. This option
is mainly supported for connection oriented protocols such as TCP.

PiperOrigin-RevId: 328350576
2020-08-25 10:04:07 -07:00
Dean Deng 3bd066d503 Remove weak references from unix sockets.
The abstract socket namespace no longer holds any references on sockets.
Instead, TryIncRef() is used when a socket is being retrieved in
BoundEndpoint(). Abstract sockets are now responsible for removing themselves
from the namespace they are in, when they are destroyed.

Updates #1486.

PiperOrigin-RevId: 327064173
2020-08-17 11:42:20 -07:00
Nayana Bidari b2ae7ea1bb Plumbing context.Context to DecRef() and Release().
context is passed to DecRef() and Release() which is
needed for SO_LINGER implementation.

PiperOrigin-RevId: 324672584
2020-08-03 13:36:05 -07:00
Ayush Ranjan 6f7f739967 Marshallable socket opitons.
Socket option values are now required to implement marshal.Marshallable.

Co-authored-by: Rahat Mahmood <rahat@google.com>
PiperOrigin-RevId: 322831612
2020-07-23 11:45:10 -07:00
Andrei Vagin 70c45e09cf socket/unix: (*connectionedEndpoint).State() has to take the endpoint lock
It accesses e.receiver which is protected by the endpoint lock.

WARNING: DATA RACE
Write at 0x00c0006aa2b8 by goroutine 189:
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect.func1()
      pkg/sentry/socket/unix/transport/connectioned.go:359 +0x50
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).BidirectionalConnect()
      pkg/sentry/socket/unix/transport/connectioned.go:327 +0xa3c
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).Connect()
      pkg/sentry/socket/unix/transport/connectioned.go:363 +0xca
  pkg/sentry/socket/unix.(*socketOpsCommon).Connect()
      pkg/sentry/socket/unix/unix.go:420 +0x13a
  pkg/sentry/socket/unix.(*SocketOperations).Connect()
      <autogenerated>:1 +0x78
  pkg/sentry/syscalls/linux.Connect()
      pkg/sentry/syscalls/linux/sys_socket.go:286 +0x251

Previous read at 0x00c0006aa2b8 by goroutine 270:
  pkg/sentry/socket/unix/transport.(*baseEndpoint).Connected()
      pkg/sentry/socket/unix/transport/unix.go:789 +0x42
  pkg/sentry/socket/unix/transport.(*connectionedEndpoint).State()
      pkg/sentry/socket/unix/transport/connectioned.go:479 +0x2f
  pkg/sentry/socket/unix.(*socketOpsCommon).State()
      pkg/sentry/socket/unix/unix.go:714 +0xc3e
  pkg/sentry/socket/unix.(*socketOpsCommon).SendMsg()
      pkg/sentry/socket/unix/unix.go:466 +0xc44
  pkg/sentry/socket/unix.(*SocketOperations).SendMsg()
      <autogenerated>:1 +0x173
  pkg/sentry/syscalls/linux.sendTo()
      pkg/sentry/syscalls/linux/sys_socket.go:1121 +0x4c5
  pkg/sentry/syscalls/linux.SendTo()
      pkg/sentry/syscalls/linux/sys_socket.go:1134 +0x87

Reported-by: syzbot+c2be37eedc672ed59a86@syzkaller.appspotmail.com
PiperOrigin-RevId: 317236996
2020-06-18 20:28:10 -07:00
Fabricio Voznika 96519e2c9d Implement POSIX locks
- Change FileDescriptionImpl Lock/UnlockPOSIX signature to
  take {start,length,whence}, so the correct offset can be
  calculated in the implementations.
- Create PosixLocker interface to make it possible to share
  the same locking code from different implementations.

Closes #1480

PiperOrigin-RevId: 316910286
2020-06-17 10:04:26 -07:00
Andrei Vagin a5a4f80487 socket/unix: handle sendto address argument for connected sockets
In case of SOCK_SEQPACKET, it has to be ignored.
In case of SOCK_STREAM, EISCONN or EOPNOTSUPP has to be returned.

PiperOrigin-RevId: 315755972
2020-06-10 13:26:54 -07:00
Fabricio Voznika 67565078bb Implement flock(2) in VFS2
LockFD is the generic implementation that can be embedded in
FileDescriptionImpl implementations. Unique lock ID is
maintained in vfs.FileDescription and is created on demand.

Updates #1480

PiperOrigin-RevId: 315604825
2020-06-09 18:46:42 -07:00
Andrei Vagin e6334e81ca Check that two sockets with different types can't be connected to each other
PiperOrigin-RevId: 314450191
2020-06-02 19:19:15 -07:00
Jamie Liu 9115f26851 Allocate device numbers for VFS2 filesystems.
Updates #1197, #1198, #1672

PiperOrigin-RevId: 310432006
2020-05-07 14:01:53 -07:00
Dean Deng faf89dd31a Update vfs2 socket TODOs.
Three updates:
- Mark all vfs2 socket syscalls as supported.
- Use the same dev number and ino number generator for all types of sockets,
  unlike in VFS1.
- Do not use host fd for hostinet metadata.

Fixes #1476, #1478, #1484, 1485, #2017.

PiperOrigin-RevId: 309994579
2020-05-05 12:11:14 -07:00
Dean Deng 82bae30cee Port netstack, hostinet, and netlink sockets to VFS2.
All three follow the same pattern:
1. Refactor VFS1 sockets into socketOpsCommon, so that most of the methods can
   be shared with VFS2.
2. Create a FileDescriptionImpl with the corresponding socket operations,
   rewriting the few that cannot be shared with VFS1.
3. Set up a VFS2 socket provider that creates a socket by setting up a dentry
   in the global Kernel.socketMount and connecting it with a new
   FileDescription.

This mostly completes the work for porting sockets to VFS2, and many syscall
tests can be enabled as a result.
There are several networking-related syscall tests that are still not passing:
1. net gofer tests
2. socketpair gofer tests
2. sendfile tests (splice is not implemented in VFS2 yet)

Updates #1478, #1484, #1485

PiperOrigin-RevId: 309457331
2020-05-01 12:54:41 -07:00
Dean Deng ce19497c1c Fix Unix socket permissions.
Enforce write permission checks in BoundEndpointAt, which corresponds to the
permission checks in Linux (net/unix/af_unix.c:unix_find_other).
Also, create bound socket files with the correct permissions in VFS2.

Fixes #2324.

PiperOrigin-RevId: 308949084
2020-04-28 20:13:01 -07:00
Dean Deng f93f2fda74 Deduplicate unix socket Release() method.
PiperOrigin-RevId: 308932254
2020-04-28 17:43:14 -07:00
Dean Deng f3ca5ca82a Support pipes and sockets in VFS2 gofer fs.
Named pipes and sockets can be represented in two ways in gofer fs:
1. As a file on the remote filesystem. In this case, all file operations are
   passed through 9p.
2. As a synthetic file that is internal to the sandbox. In this case, the
   dentry stores an endpoint or VFSPipe for sockets and pipes respectively,
   which replaces interactions with the remote fs through the gofer.
In gofer.filesystem.MknodAt, we attempt to call mknod(2) through 9p,
and if it fails, fall back to the synthetic version.

Updates #1200.

PiperOrigin-RevId: 308828161
2020-04-28 08:34:00 -07:00
Dean Deng 1c2ecbb1a0 Import host sockets.
The FileDescription implementation for hostfs sockets uses the standard Unix
socket implementation (unix.SocketVFS2), but is also tied to a hostfs dentry.

Updates #1672, #1476

PiperOrigin-RevId: 308716426
2020-04-27 16:02:18 -07:00
Tamir Duberstein b4de018a67 Permit setting unknown options
This previously changed in 305699233, but this behaviour turned out to
be load bearing.

PiperOrigin-RevId: 307033802
2020-04-17 06:41:38 -07:00
Andrei Vagin 7928aa345e Convert int and bool socket options to use GetSockOptInt and GetSockOptBool
PiperOrigin-RevId: 305699233
2020-04-09 09:31:48 -07:00
Dean Deng 24bee1c181 Record VFS2 sockets in global socket map.
Updates #1476, #1478, #1484, #1485.

PiperOrigin-RevId: 304845354
2020-04-04 21:02:42 -07:00
Dean Deng 5818663ebe Add FileDescriptionImpl for Unix sockets.
This change involves several steps:
- Refactor the VFS1 unix socket implementation to share methods between VFS1
  and VFS2 where possible. Re-implement the rest.
- Override the default PRead, Read, PWrite, Write, Ioctl, Release methods in
  FileDescriptionDefaultImpl.
- Add functions to create and initialize a new Dentry/Inode and FileDescription
  for a Unix socket file.

Updates #1476

PiperOrigin-RevId: 304689796
2020-04-03 14:08:54 -07:00
Dean Deng 4cb55a7a3b Prevent arbitrary size allocation when sending UDS messages.
Currently, Send() will copy data into a new byte slice without regard to the
original size. Size checks should be performed before the allocation takes
place.

Note that for the sake of performance, we avoid putting the buffer
allocation into the critical section. As a result, the size checks need to be
performed again within Enqueue() in case the limit has changed.

PiperOrigin-RevId: 292058147
2020-01-28 18:46:14 -08:00
Adin Scannell 0e2f1b7abd Update package locations.
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.

PiperOrigin-RevId: 291811289
2020-01-27 15:31:32 -08:00
Adin Scannell d29e59af9f Standardize on tools directory.
PiperOrigin-RevId: 291745021
2020-01-27 12:21:00 -08:00
Tamir Duberstein debd213da6 Allow dual stack sockets to operate on AF_INET
Fixes #1490
Fixes #1495

PiperOrigin-RevId: 289523250
2020-01-13 14:47:22 -08:00
Ian Gudger 27500d529f New sync package.
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.

This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.

Updates #1472

PiperOrigin-RevId: 289033387
2020-01-09 22:02:24 -08:00
Ian Lewis fbb2c008e2 Return correct length with MSG_TRUNC for unix sockets.
This change calls a new Truncate method on the EndpointReader in RecvMsg for
both netlink and unix sockets.  This allows readers such as sockets to peek at
the length of data without actually reading it to a buffer.

Fixes #993 #1240

PiperOrigin-RevId: 288800167
2020-01-08 17:24:05 -08:00
Tamir Duberstein d530df2f95 Introduce tcpip.SockOptBool
...and port V6OnlyOption to it.

PiperOrigin-RevId: 288789451
2020-01-08 15:40:48 -08:00
Tamir Duberstein a271bccfc6 Rename tcpip.SockOpt{,Int}
PiperOrigin-RevId: 288772878
2020-01-08 14:20:07 -08:00
Andrei Vagin 378d6c1f36 unix: allow to bind unix sockets only to AF_UNIX addresses
Reported-by: syzbot+2c0bcfd87fb4e8b7b009@syzkaller.appspotmail.com
PiperOrigin-RevId: 285228312
2019-12-12 11:08:56 -08:00
Kevin Krakauer 2a82d5ad68 Reorder BUILD license and load functions in gvisor.
PiperOrigin-RevId: 275139066
2019-10-16 16:40:30 -07:00
gVisor bot d22f0534c0 Merge pull request #736 from tanjianfeng:fix-unix
PiperOrigin-RevId: 275114157
2019-10-16 14:41:43 -07:00
Kevin Krakauer 6a98237949 Rename epsocket to netstack.
PiperOrigin-RevId: 273365058
2019-10-07 13:57:59 -07:00
Andrei Vagin 03ee55cc62 netstack: convert more socket options to {Set,Get}SockOptInt
PiperOrigin-RevId: 270763208
2019-09-23 14:39:14 -07:00
Jianfeng Tan 2c3e2ed2bf unix: return ECONNRESET if peer closed with data not read
For SOCK_STREAM type unix socket, we shall return ECONNRESET if peer is
closed with data not read.

We explictly set a flag when closing one end, to differentiate from
just shutdown (where zero shall be returned).

Fixes: #735

Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-08-22 15:25:38 +00:00
Jianfeng Tan 96f78e2466 unix: return zero if peer is closed
Previously, recvmsg() on a unix stream socket with its peer closed will
never return, with goroutine call trace like this:

  ...
  2  in gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).block
     at pkg/sentry/kernel/task_block.go:124
  3  in gvisor.dev/gvisor/pkg/sentry/kernel.(*Task).BlockWithDeadline
     at pkg/sentry/kernel/task_block.go:69
  4  in gvisor.dev/gvisor/pkg/sentry/socket/unix.(*SocketOperations).RecvMsg
     at pkg/sentry/socket/unix/unix.go:612
  5  in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.recvFrom
     at pkg/sentry/syscalls/linux/sys_socket.go:885
  6  in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.RecvFrom
     at pkg/sentry/syscalls/linux/sys_socket.go:910
  ...

The issue is caused by that ErrClosedForReceive returned by
unix/transport.queue is turned into nil in
unix.(*EndpointReader).ReadToBlocks():

  err.ToError()

As a result, in unix.(*SocketOperations).RecvMsg():

  n == 0 and err == nil

We shall differentiate it from another case - no data to read where
ErrWouldBlock shall be returned; and return 0 immediately.

Fixes: #734

Reported-by: chenglang.hy <chenglang.hy@antfin.com>
Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>
2019-08-22 15:25:38 +00:00
Andrei Vagin 3e4102b2ea netstack: disconnect an unix socket only if the address family is AF_UNSPEC
Linux allows to call connect for ANY and the zero port.

PiperOrigin-RevId: 263892534
2019-08-16 19:32:14 -07:00
Tamir Duberstein d81d94ac4c Replace uinptr with int64 when returning lengths
This is in accordance with newer parts of the standard library.

PiperOrigin-RevId: 263449916
2019-08-14 16:05:56 -07:00
Rahat Mahmood 7bfad8ebb6 Return a well-defined socket address type from socket funtions.
Previously we were representing socket addresses as an interface{},
which allowed any type which could be binary.Marshal()ed to be used as
a socket address. This is fine when the address is passed to userspace
via the linux ABI, but is problematic when used from within the sentry
such as by networking procfs files.

PiperOrigin-RevId: 262460640
2019-08-08 16:50:33 -07:00
Kevin Krakauer 810cc07aab Plumbing for iptables sockopts.
PiperOrigin-RevId: 261413396
2019-08-02 16:26:48 -07:00
Andrei Vagin eefa817cfd net/tcp/setockopt: impelment setsockopt(fd, SOL_TCP, TCP_INQ)
PiperOrigin-RevId: 258859507
2019-07-18 15:41:04 -07:00
Kevin Krakauer 9f1189130e Add AF_UNIX, SOCK_RAW sockets, which exist for some reason.
tcpdump creates these.

PiperOrigin-RevId: 258611829
2019-07-17 11:49:16 -07:00
Andrei Vagin 116cac053e netstack/udp: connect with the AF_UNSPEC address family means disconnect
PiperOrigin-RevId: 256433283
2019-07-03 14:19:02 -07:00