gvisor

Author	SHA1	Message	Date
Ian Gudger	a2c51efe36	Add endpoint tracking to the stack. In the future this will replace DanglingEndpoints. DanglingEndpoints must be kept for now due to issues with save/restore. This is arguably a cleaner design and allows the stack to know which transport endpoints might still be using its link endpoints. Updates #837 PiperOrigin-RevId: 277386633	2019-10-29 16:14:51 -07:00
Ian Gudger	8f029b3f82	Convert DelayOption to the newer/faster SockOpt int type. DelayOption is set on all new endpoints in gVisor. PiperOrigin-RevId: 276746791	2019-10-25 13:15:34 -07:00
gVisor bot	6d4d9564e3	Merge pull request #641 from tanjianfeng:master PiperOrigin-RevId: 276380008	2019-10-23 16:55:15 -07:00
Kevin Krakauer	12235d533a	AF_PACKET support for netstack (aka epsocket). Like (AF_INET, SOCK_RAW) sockets, AF_PACKET sockets require CAP_NET_RAW. With runsc, you'll need to pass `--net-raw=true` to enable them. Binding isn't supported yet. PiperOrigin-RevId: 275909366	2019-10-21 13:23:18 -07:00
Kevin Krakauer	2a82d5ad68	Reorder BUILD license and load functions in gvisor. PiperOrigin-RevId: 275139066	2019-10-16 16:40:30 -07:00
gVisor bot	d22f0534c0	Merge pull request #736 from tanjianfeng:fix-unix PiperOrigin-RevId: 275114157	2019-10-16 14:41:43 -07:00
Jianfeng Tan	d277bfba27	epsocket: support /proc/net/snmp Netstack has its own stats, we use this to fill /proc/net/snmp. Note that some metrics are not recorded in Netstack, which will be shown as 0 in the proc file. Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: Ie0089184507d16f49bc0057b4b0482094417ebe1	2019-10-15 16:38:41 +00:00
Jianfeng Tan	aee2c93366	netstack: add counters for tcp CurrEstab and EstabResets Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>	2019-10-15 16:38:40 +00:00
Jianfeng Tan	dd7d1f825d	hostinet: support /proc/net/snmp and /proc/net/dev For hostinet, we inherit the data from host procfs. To do that, we cache the fds for these files for later reads. Fixes #506 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: I2f81215477455b9c59acf67e33f5b9af28ee0165	2019-10-15 16:38:40 +00:00
gVisor bot	bfa0bb24dd	Internal change. PiperOrigin-RevId: 274700093	2019-10-14 17:46:52 -07:00
Ian Lewis	470997ca99	Allow for zero byte iovec with MSG_PEEK | MSG_TRUNC in recvmsg. This allows for peeking at the length of the next message on a netlink socket without pulling it off the socket's buffer/queue, allowing tools like 'ip' to work. This CL also fixes an issue where dump_done_errno was not included in the NLMSG_DONE messages payload. Issue #769 PiperOrigin-RevId: 274068637	2019-10-10 16:55:48 -07:00
Bhasker Hariharan	c7e901f47a	Fix bugs in fragment handling. Strengthen the header.IPv4.IsValid check to correctly check for IHL/TotalLength fields. Also add a check to make sure fragmentOffsets + size of the fragment do not cause a wrap around for the end of the fragment. PiperOrigin-RevId: 274049313	2019-10-10 15:14:55 -07:00
gVisor bot	bf870c1a42	Internal change. PiperOrigin-RevId: 273861936	2019-10-09 17:56:05 -07:00
Ian Gudger	7c1587e340	Implement IP_TTL. Also change the default TTL to 64 to match Linux. PiperOrigin-RevId: 273430341	2019-10-07 19:29:51 -07:00
Kevin Krakauer	6a98237949	Rename epsocket to netstack. PiperOrigin-RevId: 273365058	2019-10-07 13:57:59 -07:00
gVisor bot	abbee5615f	Implement SO_BINDTODEVICE sockopt PiperOrigin-RevId: 271644926	2019-09-27 14:14:04 -07:00
Kevin Krakauer	543492650d	Make raw socket tests pass in environments with or without CAP_NET_RAW. PiperOrigin-RevId: 271442321	2019-09-26 15:09:20 -07:00
Andrei Vagin	03ee55cc62	netstack: convert more socket options to {Set,Get}SockOptInt PiperOrigin-RevId: 270763208	2019-09-23 14:39:14 -07:00
gVisor bot	4aeedd47bf	internal BUILD file cleanup. PiperOrigin-RevId: 270680704	2019-09-23 08:25:13 -07:00
Adin Scannell	75781ab3ef	Remove defer from hot path and ensure Atomic is applied consistently. PiperOrigin-RevId: 270114317	2019-09-19 13:39:32 -07:00
Adin Scannell	7c6ab6a219	Implement splice methods for pipes and sockets. This also allows the tee(2) implementation to be enabled, since dup can now be properly supported via WriteTo. Note that this change necessitated some minor restructuring with the fs.FileOperations splice methods. If the *fs.File is passed through directly, then only public API methods are accessible, which will deadlock immediately since the locking is already done by fs.Splice. Instead, we pass through an abstract io.Reader or io.Writer, which elide locks and use the underlying fs.FileOperations directly. PiperOrigin-RevId: 268805207	2019-09-12 17:43:27 -07:00
Michael Pratt	df5d377521	Remove go_test from go_stateify and go_marshal They are no-ops, so the standard rule works fine. PiperOrigin-RevId: 268776264	2019-09-12 15:10:17 -07:00
Fabricio Voznika	502c47f7a7	Return correct buffer size for ioctl(socket, FIONREAD) Ioctl was returning just the buffer size from epsocket.endpoint and it was not considering data from epsocket.SocketOperations that was read from the endpoint, but not yet sent to the caller. PiperOrigin-RevId: 266485461	2019-08-30 17:19:09 -07:00
Rahat Mahmood	863e11ac4d	Implement /proc/net/udp. PiperOrigin-RevId: 266229756	2019-08-29 14:30:41 -07:00
Jianfeng Tan	2c3e2ed2bf	unix: return ECONNRESET if peer closed with data not read For SOCK_STREAM type unix socket, we shall return ECONNRESET if peer is closed with data not read. We explicitly set a flag when closing one end, to differentiate from just shutdown (where zero shall be returned). Fixes: #735 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>	2019-08-22 15:25:38 +00:00
Jianfeng Tan	96f78e2466	unix: return zero if peer is closed Previously, recvmsg() on a unix stream socket with its peer closed will never return, with goroutine call trace like this: ... 2 in gvisor.dev/gvisor/pkg/sentry/kernel.(Task).block at pkg/sentry/kernel/task_block.go:124 3 in gvisor.dev/gvisor/pkg/sentry/kernel.(Task).BlockWithDeadline at pkg/sentry/kernel/task_block.go:69 4 in gvisor.dev/gvisor/pkg/sentry/socket/unix.(SocketOperations).RecvMsg at pkg/sentry/socket/unix/unix.go:612 5 in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.recvFrom at pkg/sentry/syscalls/linux/sys_socket.go:885 6 in gvisor.dev/gvisor/pkg/sentry/syscalls/linux.RecvFrom at pkg/sentry/syscalls/linux/sys_socket.go:910 ... The issue is caused by that ErrClosedForReceive returned by unix/transport.queue is turned into nil in unix.(EndpointReader).ReadToBlocks(): err.ToError() As a result, in unix.(*SocketOperations).RecvMsg(): n == 0 and err == nil We shall differentiate it from another case - no data to read where ErrWouldBlock shall be returned; and return 0 immediately. Fixes: #734 Reported-by: chenglang.hy <chenglang.hy@antfin.com> Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com>	2019-08-22 15:25:38 +00:00
Tamir Duberstein	573e6e4bba	Use tcpip.Subnet in tcpip.Route This is the first step in replacing some of the redundant types with the standard library equivalents. PiperOrigin-RevId: 264706552	2019-08-21 15:31:18 -07:00
Jianfeng Tan	a63f88855f	hostinet: fix parsing route netlink message We wrongly parsed the output interface as the gateway address. The fix is straightforward. Fixes #638 Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: Ia4bab31f3c238b0278ea57ab22590fad00eaf061 COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/684 from tanjianfeng:fix-638 b940e810367ad1273519bfa594f4371bdd293e83 PiperOrigin-RevId: 264211336	2019-08-19 12:10:21 -07:00
Kevin Krakauer	bd826092fe	Read iptables via sockopts. PiperOrigin-RevId: 264180125	2019-08-19 10:05:59 -07:00
Andrei Vagin	3e4102b2ea	netstack: disconnect an unix socket only if the address family is AF_UNSPEC Linux allows to call connect for ANY and the zero port. PiperOrigin-RevId: 263892534	2019-08-16 19:32:14 -07:00
Tamir Duberstein	d81d94ac4c	Replace uintptr with int64 when returning lengths This is in accordance with newer parts of the standard library. PiperOrigin-RevId: 263449916	2019-08-14 16:05:56 -07:00
Bhasker Hariharan	570fb1db6b	Improve SendMsg performance. SendMsg before this change would copy all the data over into a new slice even if the underlying socket could only accept a small amount of data. This is really inefficient with non-blocking sockets and under high throughput where large writes could get ErrWouldBlock or if there was say a timeout associated with the sendmsg() syscall. With this change we delay copying bytes in till they are needed and only copy what can be potentially sent/held in the socket buffer. Reducing the need to repeatedly copy data over. Also a minor fix to change state FIN-WAIT-1 when shutdown(..., SHUT_WR) is called instead of when we transmit the actual FIN. Otherwise the socket could remain in CONNECTED state even though the user has called shutdown() on the socket. Updates #627 PiperOrigin-RevId: 263430505	2019-08-14 14:34:27 -07:00
Andrei Vagin	af90e68623	netlink: return an error in nlmsgerr Now if a process sends an unsupported netlink request, an error is returned from the send system call. The linux kernel works differently in this case. It returns errors in the nlmsgerr netlink message. Reported-by: syzbot+571d99510c6f935202da@syzkaller.appspotmail.com PiperOrigin-RevId: 262690453	2019-08-09 22:34:54 -07:00
Rahat Mahmood	7bfad8ebb6	Return a well-defined socket address type from socket functions. Previously we were representing socket addresses as an interface{}, which allowed any type which could be binary.Marshal()ed to be used as a socket address. This is fine when the address is passed to userspace via the linux ABI, but is problematic when used from within the sentry such as by networking procfs files. PiperOrigin-RevId: 262460640	2019-08-08 16:50:33 -07:00
Rahat Mahmood	13a98df49e	netstack: Don't start endpoint goroutines too soon on restore. Endpoint protocol goroutines were previously started as part of loading the endpoint. This is potentially too soon, as resources used by these goroutine may not have been loaded. Protocol goroutines may perform meaningful work as soon as they're started (ex: incoming connect) which can cause them to indirectly access resources that haven't been loaded yet. This CL defers resuming all protocol goroutines until the end of restore. PiperOrigin-RevId: 262409429	2019-08-08 12:33:11 -07:00
gVisor bot	2e45d1696e	Merge pull request #653 from xiaobo55x:dev PiperOrigin-RevId: 262402929	2019-08-08 11:58:14 -07:00
Haibo Xu	83fdb7739e	Change syscall.EPOLLET to unix.EPOLLET syscall.EPOLLET has been defined with different values on amd64 and arm64(-0x80000000 on amd64, and 0x80000000 on arm64), while unix.EPOLLET has been unified this value to 0x80000000(golang/go#5328). ref #63 Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: Id97d075c4e79d86a2ea3227ffbef02d8b00ffbb8	2019-08-05 23:10:08 +00:00
Kevin Krakauer	810cc07aab	Plumbing for iptables sockopts. PiperOrigin-RevId: 261413396	2019-08-02 16:26:48 -07:00
Rahat Mahmood	2906dffcdb	Automated rollback of changelist 261191548 PiperOrigin-RevId: 261373749	2019-08-02 12:52:40 -07:00
Rahat Mahmood	79511e8a50	Implement getsockopt(TCP_INFO). Export some readily-available fields for TCP_INFO and stub out the rest. PiperOrigin-RevId: 261191548	2019-08-01 13:58:48 -07:00
Ian Lewis	0a246fab80	Basic support for 'ip route' Implements support for RTM_GETROUTE requests for netlink sockets. Fixes #507 PiperOrigin-RevId: 261051045	2019-07-31 20:30:09 -07:00
Chris Kuiper	40e682759f	Add support for a subnet prefix length on interface network addresses This allows the user code to add a network address with a subnet prefix length. The prefix length value is stored in the network endpoint and provided back to the user in the ProtocolAddress type. PiperOrigin-RevId: 259807693	2019-07-24 13:42:14 -07:00
Andrei Vagin	eefa817cfd	net/tcp/setsockopt: implement setsockopt(fd, SOL_TCP, TCP_INQ) PiperOrigin-RevId: 258859507	2019-07-18 15:41:04 -07:00
Kevin Krakauer	9f1189130e	Add AF_UNIX, SOCK_RAW sockets, which exist for some reason. tcpdump creates these. PiperOrigin-RevId: 258611829	2019-07-17 11:49:16 -07:00
Jianfeng Tan	cf4fc510fd	Support /proc/net/dev This proc file reports the stats of interfaces. We could use ifconfig command to check the result. Signed-off-by: Jianfeng Tan <henry.tjf@antfin.com> Change-Id: Ia7c1e637f5c76c30791ffda68ee61e861b6ef827 COPYBARA_INTEGRATE_REVIEW=https://gvisor-review.googlesource.com/c/gvisor/+/18282/ PiperOrigin-RevId: 258303936	2019-07-15 22:51:05 -07:00
Kevin Krakauer	9b4d3280e1	Add IPPROTO_RAW, which allows raw sockets to write IP headers. iptables also relies on IPPROTO_RAW in a way. It opens such a socket to manipulate the kernel's tables, but it doesn't actually use any of the functionality. Blegh. PiperOrigin-RevId: 257903078	2019-07-12 18:09:12 -07:00
Bhasker Hariharan	6116473b2f	Stub out support for TCP_MAXSEG. Adds support to set/get the TCP_MAXSEG value but does not really change the segment sizes emitted by netstack or alter the MSS advertised by the endpoint. This is currently being added only to unblock iperf3 on gVisor. Plumbing this correctly requires a bit more work which will come in separate CLs. PiperOrigin-RevId: 257859112	2019-07-12 13:35:17 -07:00
Andrei Vagin	116cac053e	netstack/udp: connect with the AF_UNSPEC address family means disconnect PiperOrigin-RevId: 256433283	2019-07-03 14:19:02 -07:00
Adin Scannell	753da9604e	Remove map from fd_map, change to fd_table. This renames FDMap to FDTable and drops the kernel.FD type, which had an entire package to itself and didn't serve much use (it was freely cast between types, and served as more of an annoyance than providing any protection.) Based on BenchmarkFDLookupAndDecRef-12, we can expect 5-10 ns per lookup operation, and 10-15 ns per concurrent lookup operation of savings. This also fixes two tangential usage issues with the FDMap. Namely, non-atomic use of NewFDFrom and associated calls to Remove (that are both racy and fail to drop the reference on the underlying file.) PiperOrigin-RevId: 256285890	2019-07-02 19:28:59 -07:00
Ian Gudger	0aa9418a77	Fix unix/transport.queue reference leaks. Fix two leaks for connectionless Unix sockets: * Double connect: Subsequent connects would leak a reference on the previously connected endpoint. * Close unconnected: Sockets which were not connected at the time of closure would leak a reference on their receiver. PiperOrigin-RevId: 256070451	2019-07-01 17:46:24 -07:00
Ian Gudger	45566fa4e4	Add finalizer on AtomicRefCount to check for leaks. PiperOrigin-RevId: 255711454	2019-06-28 20:07:52 -07:00
Fabricio Voznika	b2907595e5	Complete pipe support on overlayfs Get/Set pipe size and ioctl support were missing from overlayfs. It required moving the pipe.Sizer interface to fs so that overlay could get access. Fixes #318 PiperOrigin-RevId: 255511125	2019-06-27 17:22:53 -07:00
Michael Pratt	5b41ba5d0e	Fix various spelling issues in the documentation Addresses obvious typos, in the documentation only. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65 PiperOrigin-RevId: 255477779	2019-06-27 14:25:50 -07:00
Andrei Vagin	8ab0848c70	gvisor/fs: don't update file.offset for sockets, pipes, etc sockets, pipes and other non-seekable file descriptors don't use file.offset, so we don't need to update it. With this change, we will be able to call file operations without locking the file.mu mutex. This is already used for pipes in the splice system call. PiperOrigin-RevId: 253746644	2019-06-18 01:43:29 -07:00
Bhasker Hariharan	3d71c627fa	Add support for TCP receive buffer auto tuning. The implementation is similar to linux where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accommodate for the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the linux receive buffer auto-tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283	2019-06-13 22:28:01 -07:00
Ian Gudger	3e9b8ecbfe	Plumb context through more layers of filesystem. All functions which allocate objects containing AtomicRefCounts will soon need a context. PiperOrigin-RevId: 253147709	2019-06-13 18:40:38 -07:00
Rahat Mahmood	05ff1ffaad	Implement getsockopt() SO_DOMAIN, SO_PROTOCOL and SO_TYPE. SO_TYPE was already implemented for everything but netlink sockets. PiperOrigin-RevId: 253138157	2019-06-13 17:24:51 -07:00
Adin Scannell	add40fd6ad	Update canonical repository. This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620	2019-06-13 16:50:15 -07:00
Bhasker Hariharan	70578806e8	Add support for TCP_CONGESTION socket option. This CL also cleans up the error returned for setting congestion control which was incorrectly returning EINVAL instead of ENOENT. PiperOrigin-RevId: 252889093	2019-06-12 13:35:50 -07:00
Rahat Mahmood	a00157cc0e	Store more information in the kernel socket table. Store enough information in the kernel socket table to distinguish between different types of sockets. Previously we were only storing the socket family, but this isn't enough to classify sockets. For example, TCPv4 and UDPv4 sockets are both AF_INET, and ICMP sockets are SOCK_DGRAM sockets with a particular protocol. Instead of creating more sub-tables, flatten the socket table and provide a filtering mechanism based on the socket entry. Also generate and store a socket entry index ("sl" in linux) which allows us to output entries in a stable order from procfs. PiperOrigin-RevId: 252495895	2019-06-10 15:17:43 -07:00
Rahat Mahmood	315cf9a523	Use common definition of SockType. SockType isn't specific to unix domain sockets, and the current definition basically mirrors the linux ABI's definition. PiperOrigin-RevId: 251956740	2019-06-06 17:00:27 -07:00
Rahat Mahmood	2d2831e354	Track and export socket state. This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850	2019-06-06 15:04:47 -07:00
Andrei Vagin	90a116890f	gvisor/sock/unix: pass creds when a message is sent between unconnected sockets and don't report a sender address if it doesn't have one PiperOrigin-RevId: 251371284	2019-06-03 21:48:19 -07:00
Bhasker Hariharan	ae26b2c425	Fixes to TCP listen behavior. Netstack listen loop can get stuck if cookies are in-use and the app is slow to accept incoming connections. Further we continue to complete handshake for a connection even if the backlog is full. This creates a problem when a lot of connections come in rapidly and we end up with lots of completed connections just hanging around to be delivered. These fixes change netstack behaviour to mirror what linux does as described here in the following article http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html Now when cookies are not in-use Netstack will silently drop the ACK to a SYN-ACK and not complete the handshake if the backlog is full. This will result in the connection staying in a half-complete state. Eventually the sender will retransmit the ACK and if backlog has space we will transition to a connected state and deliver the endpoint. Similarly when cookies are in use we do not try and create an endpoint unless there is space in the accept queue to accept the newly created endpoint. If there is no space then we again silently drop the ACK as we can just recreate it when the ACK is retransmitted by the peer. We also now use the backlog to cap the size of the SYN-RCVD queue for a given endpoint. So at any time there can be N connections in the backlog and N in a SYN-RCVD state if the application is not accepting connections. Any new SYNs will be dropped. This CL also fixes another small bug where we mark a new endpoint which has not completed handshake as connected. We should wait till handshake successfully completes before marking it connected. Updates #236 PiperOrigin-RevId: 250717817	2019-05-30 12:08:41 -07:00
Andrei Vagin	4b9cb38157	gvisor: socket() returns EPROTONOSUPPORT if protocol is not supported PiperOrigin-RevId: 250426407	2019-05-30 12:06:15 -07:00
Kevin Krakauer	c1cdf18e7b	UDP and TCP raw socket support. PiperOrigin-RevId: 249511348 Change-Id: I34539092cc85032d9473ff4dd308fc29dc9bfd6b	2019-05-22 13:45:15 -07:00
Adin Scannell	9cdae51fec	Add basic plumbing for splice and stub implementation. This does not actually implement an efficient splice or sendfile. Rather, it adds a generic plumbing to the file internals so that this can be added. All file implementations use the stub fileutil.NoSplice implementation, which causes sendfile and splice to fall back to an internal copy. A basic splice system call interface is added, along with a test. PiperOrigin-RevId: 249335960 Change-Id: Ic5568be2af0a505c19e7aec66d5af2480ab0939b	2019-05-21 15:18:12 -07:00
Ian Gudger	81ecd8b6ea	Implement the MSG_CTRUNC msghdr flag for Unix sockets. Updates google/gvisor#206 PiperOrigin-RevId: 245880573 Change-Id: Ifa715e98d47f64b8a32b04ae9378d6cd6bd4025e	2019-04-29 21:21:08 -07:00
Michael Pratt	4d52a55201	Change copyright notice to "The gVisor Authors" Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9	2019-04-29 14:26:23 -07:00
Nicolas Lacasse	f4ce43e1f4	Allow and document bug ids in gVisor codebase. PiperOrigin-RevId: 245818639 Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789	2019-04-29 14:04:14 -07:00
Ian Gudger	358eb52a76	Add support for the MSG_TRUNC msghdr flag. The MSG_TRUNC flag is set in the msghdr when a message is truncated. Fixes google/gvisor#200 PiperOrigin-RevId: 244440486 Change-Id: I03c7d5e7f5935c0c6b8d69b012db1780ac5b8456	2019-04-19 16:17:01 -07:00
Ian Gudger	133700007a	Only emit unimplemented syscall events for unsupported values. Only emit unimplemented syscall events for setting SO_OOBINLINE and SO_LINGER when attempting to set unsupported values. PiperOrigin-RevId: 244229675 Change-Id: Icc4562af8f733dd75a90404621711f01a32a9fc1	2019-04-18 11:51:41 -07:00
Michael Pratt	08d99c5fbe	Convert poll/select to operate more directly on linux.PollFD Currently, doPoll copies the user struct pollfd array into a []syscalls.PollFD, which contains internal kdefs.FD and waiter.EventMask types. While these are currently binary-compatible with the Linux versions, we generally discourage copying directly to internal types (someone may inadvertently change kdefs.FD to uint64). Instead, copy directly to a []linux.PollFD, which will certainly be binary compatible. Most of syscalls/polling.go is included directly into syscalls/linux/sys_poll.go, as it can then operate directly on linux.PollFD. The additional syscalls.PollFD type is providing little value. I've also added explicit conversion functions for waiter.EventMask, which creates the possibility of a different binary format. PiperOrigin-RevId: 244042947 Change-Id: I24e5b642002a32b3afb95a9dcb80d4acd1288abf	2019-04-17 12:15:01 -07:00
Jamie Liu	4209edafb6	Use open fids when fstat()ing gofer files. PiperOrigin-RevId: 243018347 Change-Id: I1e5b80607c1df0747482abea61db7fcf24536d37	2019-04-11 00:43:04 -07:00
Bhasker Hariharan	eaac2806ff	Add TCP checksum verification. PiperOrigin-RevId: 242704699 Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f	2019-04-09 11:23:47 -07:00
Bert Muthalaly	f2e5dcf21c	Add ICMP stats PiperOrigin-RevId: 240848882 Change-Id: I23dd4599f073263437aeab357c3f767e1a432b82	2019-03-28 14:09:20 -07:00
Rahat Mahmood	81f4829d11	Record sockets created during accept(2) for all families. Track new sockets created during accept(2) in the socket table for all families. Previously we were only doing this for unix domain sockets. PiperOrigin-RevId: 239475550 Change-Id: I16f009f24a06245bfd1d72ffd2175200f837c6ac	2019-03-20 14:31:16 -07:00
Fabricio Voznika	7b33df6845	Fix data race in netlink send buffer size PiperOrigin-RevId: 239221041 Change-Id: Icc19e32a00fa89167447ab2f45e90dcfd61bea04	2019-03-19 10:38:50 -07:00
Ian Gudger	71d53382bf	Fix getsockopt(IP_MULTICAST_IF). getsockopt(IP_MULTICAST_IF) only supports struct in_addr. Also adds support for setsockopt(IP_MULTICAST_IF) with struct in_addr. PiperOrigin-RevId: 237620230 Change-Id: I75e7b5b3e08972164eb1906f43ddd67aedffc27c	2019-03-09 11:40:51 -08:00
Ian Gudger	281092e842	Make IP_MULTICAST_LOOP and IP_MULTICAST_TTL allow setting int or char. This is the correct Linux behavior, and at least PHP depends on it. PiperOrigin-RevId: 237565639 Change-Id: I931af09c8ed99a842cf70d22bfe0b65e330c4137	2019-03-08 20:27:58 -08:00
Ian Gudger	56a6128295	Implement IP_MULTICAST_LOOP. IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default route are looped back. In order to implement this switch, support for sending and looping back multicast packets on the default route had to be implemented. For now we only support IPv4 multicast. PiperOrigin-RevId: 237534603 Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c	2019-03-08 15:49:17 -08:00
Bhasker Hariharan	1718fdd1a8	Add new retransmissions and recovery related metrics. PiperOrigin-RevId: 236945145 Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b	2019-03-05 16:41:44 -08:00
Kevin Krakauer	23e66ee96d	Remove unused commit() function argument to Bind. PiperOrigin-RevId: 236926132 Change-Id: I5cf103f22766e6e65a581de780c7bb9ca0fa3181	2019-03-05 14:53:34 -08:00
Kevin Krakauer	121db29a93	Ping support via IPv4 raw sockets. Broadly, this change: * Enables sockets to be created via `socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)`. * Passes the network-layer (IP) header up the stack to the transport endpoint, which can pass it up to the socket layer. This allows a raw socket to return the entire IP packet to users. * Adds functions to stack.TransportProtocol, stack.Stack, stack.transportDemuxer that enable incoming packets to be delivered to raw endpoints. New raw sockets of other protocols (not ICMP) just need to register with the stack. * Enables ping.endpoint to return IP headers when created via SOCK_RAW. PiperOrigin-RevId: 235993280 Change-Id: I60ed994f5ff18b2cbd79f063a7fdf15d093d845a	2019-02-27 14:31:21 -08:00
Fabricio Voznika	cff2c57192	Fix bad merge PiperOrigin-RevId: 235818534 Change-Id: I99f7e3fd1dc808b35f7a08b96b7c3226603ab808	2019-02-26 16:42:06 -08:00
Amanda Tait	ea070b9d5f	Implement Broadcast support This change adds support for the SO_BROADCAST socket option in gVisor Netstack. This support includes getsockopt()/setsockopt() functionality for both UDP and TCP endpoints (the latter being a NOOP), dispatching broadcast messages up and down the stack, and route finding/creation for broadcast packets. Finally, a suite of tests have been implemented, exercising this functionality through the Linux syscall API. PiperOrigin-RevId: 234850781 Change-Id: If3e666666917d39f55083741c78314a06defb26c	2019-02-20 12:54:13 -08:00
Kevin Krakauer	ec2460b189	netstack: Add SIOCGSTAMP support. Ping sometimes uses this instead of SO_TIMESTAMP. PiperOrigin-RevId: 234699590 Change-Id: Ibec9c34fa0d443a931557a2b1b1ecd83effe7765	2019-02-19 16:41:32 -08:00
Ian Gudger	c611dbc5a7	Implement IP_MULTICAST_IF. This allows setting a default send interface for IPv4 multicast. IPv6 support will come later. PiperOrigin-RevId: 234251379 Change-Id: I65922341cd8b8880f690fae3eeb7ddfa47c8c173	2019-02-15 18:40:15 -08:00
Kevin Krakauer	a9cb3dcd9d	Move SO_TIMESTAMP from different transport endpoints to epsocket. SO_TIMESTAMP is reimplemented in ping and UDP sockets (and needs to be added for TCP), but can just be implemented in epsocket for simplicity. This will also make SIOCGSTAMP easier to implement. PiperOrigin-RevId: 234179300 Change-Id: Ib5ea0b1261dc218c1a8b15a65775de0050fe3230	2019-02-15 11:18:44 -08:00
Fabricio Voznika	e34d27e8b6	Redirect FIXME to more appropriate bug PiperOrigin-RevId: 234147487 Change-Id: I779a6012832bb94a6b89f5bcc7d821b40ae969cc	2019-02-15 08:23:27 -08:00
Ian Gudger	80f901b16b	Plumb IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP to netstack. Also includes a few fixes for IPv4 multicast support. IPv6 support is coming in a followup CL. PiperOrigin-RevId: 233008638 Change-Id: If7dae6222fef43fda48033f0292af77832d95e82	2019-02-07 23:15:23 -08:00
Rahat Mahmood	2ba74f84be	Implement /proc/net/unix. PiperOrigin-RevId: 232948478 Change-Id: Ib830121e5e79afaf5d38d17aeef5a1ef97913d23	2019-02-07 14:44:21 -08:00
Michael Pratt	2a0c69b19f	Remove license comments Nothing reads them and they can simply get stale. Generated with: $ sed -i "s/licenses($.$)./licenses(\1)/" **/BUILD PiperOrigin-RevId: 231818945 Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0	2019-01-31 11:12:53 -08:00
Nicolas Lacasse	dc8450b567	Remove fs.Handle, ramfs.Entry, and all the DeprecatedFileOperations. More helper structs have been added to the fsutil package to make it easier to implement fs.InodeOperations and fs.FileOperations. PiperOrigin-RevId: 229305982 Change-Id: Ib6f8d3862f4216745116857913dbfa351530223b	2019-01-14 20:34:28 -08:00
Andrei Vagin	652d068119	Implement SO_REUSEPORT for TCP and UDP sockets This option allows multiple sockets to be bound to the same port. Incoming packets are distributed to sockets using a hash based on source and destination addresses. This means that all packets from one sender will be received by the same server socket. PiperOrigin-RevId: 227153413 Change-Id: I59b6edda9c2209d5b8968671e9129adb675920cf	2018-12-28 11:27:14 -08:00
Ian Gudger	bce2f9751f	Plumb IP_MULTICAST_TTL to netstack. PiperOrigin-RevId: 226993086 Change-Id: I71757f231436538081d494da32ca69f709bc71c7	2018-12-26 23:52:12 -08:00
Ian Gudger	0df0df35fc	Stub out SO_OOBINLINE. We don't explicitly support out-of-band data and treat it like normal in-band data. This is equivalent to SO_OOBINLINE being enabled, so always report that it is enabled. PiperOrigin-RevId: 226572742 Change-Id: I4c30ccb83265e76c30dea631cbf86822e6ee1c1b	2018-12-21 19:46:55 -08:00
Ian Gudger	b515556519	Implement SO_KEEPALIVE, TCP_KEEPIDLE, and TCP_KEEPINTVL. Within gVisor, plumb new socket options to netstack. Within netstack, fix GetSockOpt and SetSockOpt return value logic. PiperOrigin-RevId: 226532229 Change-Id: If40734e119eed633335f40b4c26facbebc791c74	2018-12-21 13:13:45 -08:00
Ian Gudger	12c7430a01	Fix recv blocking for connectionless Unix sockets. Connectionless Unix sockets (DGRAM Unix sockets created with the socket system call) inherently only have a read queue. They do not establish bidirectional connections, instead, the connect system call only sets a default send location. Writes give the data to the other endpoint which has its own read queue. To simplify the code, connectionless Unix sockets still get read and write queues, but the write queue is a dummy and never waited on. The read queue is the connectionless endpoint's queue. This change fixes a bug where the dummy queue was incorrectly set as the read queue and the endpoint's queue was incorrectly set as the write queue. This meant that read notifications went to the dummy queue and were black holed. PiperOrigin-RevId: 225921042 Change-Id: I8d9059def787a2c3c305185b92d05093fbd2be2a	2018-12-17 17:53:22 -08:00
Adin Scannell	5d8cf31346	Move fdnotifier package to reduce internal confusion. PiperOrigin-RevId: 225632398 Change-Id: I909e7e2925aa369adc28e844c284d9a6108e85ce	2018-12-14 18:05:01 -08:00
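
Commit c7e901f47a above describes a check that a fragment's offset plus its size must not wrap around. The Go sketch below illustrates that kind of check in isolation. It is not the code from that commit: the function name fragmentEndOK and its parameters are hypothetical, and it only assumes that both values are held as 16-bit unsigned integers, so an overflowing sum wraps to a value smaller than the offset.

```go
package main

import "fmt"

// fragmentEndOK is a hypothetical helper, not a gVisor API. It reports
// whether offset+size stays within uint16 range. Unsigned addition in Go
// wraps on overflow, so a wrapped sum is smaller than the original offset.
func fragmentEndOK(offset, size uint16) bool {
	end := offset + size
	return end >= offset
}

func main() {
	// A typical first fragment: starts at 0 and carries 1480 bytes.
	fmt.Println(fragmentEndOK(0, 1480))

	// A fragment at the largest possible offset (8191 * 8 = 65528) that
	// claims 16 bytes would end past 65535, so the sum wraps around.
	fmt.Println(fragmentEndOK(65528, 16))
}
```

A receiver applying a check of this shape would drop the second fragment rather than reassemble it.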

212 Commits