gvisor

Commit Graph

Author	SHA1	Message	Date
Kevin Krakauer	2302afb53d	Reorder BUILD license and load functions in netstack. PiperOrigin-RevId: 274672346	2019-10-14 15:21:59 -07:00
Bhasker Hariharan	a296425970	Use a different fanoutID for each new fdbased endpoint. PiperOrigin-RevId: 274638272	2019-10-14 13:10:16 -07:00
Bhasker Hariharan	c7e901f47a	Fix bugs in fragment handling. Strengthen the header.IPv4.IsValid check to correctly check for IHL/TotalLength fields. Also add a check to make sure fragmentOffsets + size of the fragment do not cause a wrap around for the end of the fragment. PiperOrigin-RevId: 274049313	2019-10-10 15:14:55 -07:00
gVisor bot	bf870c1a42	Internal change. PiperOrigin-RevId: 273861936	2019-10-09 17:56:05 -07:00
Ian Gudger	7c1587e340	Implement IP_TTL. Also change the default TTL to 64 to match Linux. PiperOrigin-RevId: 273430341	2019-10-07 19:29:51 -07:00
Chris Kuiper	4874525161	Implement proper local broadcast behavior The behavior for sending and receiving local broadcast (255.255.255.255) traffic is as follows: Outgoing -------- * A broadcast packet sent on a socket that is bound to an interface goes out that interface * A broadcast packet sent on an unbound socket follows the route table to select the outgoing interface + if an explicit route entry exists for 255.255.255.255/32, use that one + else use the default route * Broadcast packets are looped back and delivered following the rules for incoming packets (see next). This is the same behavior as for multicast packets, except that it cannot be disabled via sockopt. Incoming -------- * Sockets wishing to receive broadcast packets must bind to either INADDR_ANY (0.0.0.0) or INADDR_BROADCAST (255.255.255.255). No other socket receives broadcast packets. * Broadcast packets are multiplexed to all sockets matching it. This is the same behavior as for multicast packets. * A socket can bind to 255.255.255.255:<port> and then receive its own broadcast packets sent to 255.255.255.255:<port> In addition, this change implicitly fixes an issue with multicast reception. If two sockets want to receive a given multicast stream and one is bound to ANY while the other is bound to the multicast address, only one of them will receive the traffic. PiperOrigin-RevId: 272792377	2019-10-03 19:31:35 -07:00
Bhasker Hariharan	bcbb3ef317	Add a Stringer implementation to PacketDispatchMode PiperOrigin-RevId: 272083936	2019-09-30 15:52:55 -07:00
Bhasker Hariharan	61f6fbd0ce	Fix bugs in PickEphemeralPort for TCP. Netstack always picks a random start point every time PickEphemeralPort is called. While this is required for UDP so that DNS requests go out through a randomized set of ports, it is not required for TCP. In fact, Linux explicitly hashes the (srcip, dstip, dstport) and a one time secret initialized at start of the application to get a random offset. But to ensure it doesn't start from the same point on every scan it uses a static hint that is incremented by 2 in every call to pick ephemeral ports. The reason for 2 is Linux seems to split the port ranges where active connects seem to use even ones while odd ones are used by listening sockets. This CL implements a similar strategy where we use a hash + hint to generate the offset to start the search for a free Ephemeral port. This ensures that we cycle through the available port space in order for repeated connects to the same destination and significantly reduces the chance of picking a recently released port. PiperOrigin-RevId: 272058370	2019-09-30 13:55:22 -07:00
gVisor bot	abbee5615f	Implement SO_BINDTODEVICE sockopt PiperOrigin-RevId: 271644926	2019-09-27 14:14:04 -07:00
Kevin Krakauer	59ccbb1044	Remove centralized registration of protocols. Also removes the need for protocol names. PiperOrigin-RevId: 271186030	2019-09-25 12:57:05 -07:00
Chris Kuiper	6704d625ef	Return only primary addresses in Stack.NICInfo() Non-primary addresses are used for endpoints created to accept multicast and broadcast packets, as well as "helper" endpoints (0.0.0.0) that allow sending packets when no proper address has been assigned yet (e.g., for DHCP). These addresses are not real addresses from a user point of view and should not be part of the NICInfo() value. Also see b/127321246 for more info. This switches NICInfo() to call a new NIC.PrimaryAddresses() function. To still allow an option to get all addresses (mostly for testing) I added Stack.GetAllAddresses() and NIC.AllAddresses(). In addition, the return value for GetMainNICAddress() was changed for the case where the NIC has no primary address. Instead of returning an error here, it now returns an empty AddressWithPrefix() value. The rationale for this change is that it is a valid case for a NIC to have no primary addresses. Lastly, I refactored the code based on the new additions. PiperOrigin-RevId: 270971764	2019-09-24 13:21:20 -07:00
Tamir Duberstein	bbaaa1fcc2	Simplify ICMPRateLimiter https://github.com/golang/time/commit/c4c64ca added SetBurst upstream. PiperOrigin-RevId: 270925077	2019-09-24 09:50:51 -07:00
Andrei Vagin	03ee55cc62	netstack: convert more socket options to {Set,Get}SockOptInt PiperOrigin-RevId: 270763208	2019-09-23 14:39:14 -07:00
Ian Gudger	002f1d4aae	Allow waiting for LinkEndpoint worker goroutines to finish. Previously, the only safe way to use an fdbased endpoint was to leak the FD. This change makes it possible to safely close the FD. This is the first step towards having stoppable stacks. Updates #837 PiperOrigin-RevId: 270346582	2019-09-20 14:10:02 -07:00
Ghanan Gowripalan	60fe8719e1	Automated rollback of changelist 268047073 PiperOrigin-RevId: 269658971	2019-09-17 14:47:09 -07:00
Ian Gudger	747320a7aa	Update remaining users of LinkEndpoints to not refer to them as an ID. PiperOrigin-RevId: 269614517	2019-09-17 11:31:00 -07:00
Adin Scannell	7c6ab6a219	Implement splice methods for pipes and sockets. This also allows the tee(2) implementation to be enabled, since dup can now be properly supported via WriteTo. Note that this change necessitated some minor restructuring with the fs.FileOperations splice methods. If the *fs.File is passed through directly, then only public API methods are accessible, which will deadlock immediately since the locking is already done by fs.Splice. Instead, we pass through an abstract io.Reader or io.Writer, which elide locks and use the underlying fs.FileOperations directly. PiperOrigin-RevId: 268805207	2019-09-12 17:43:27 -07:00
Michael Pratt	df5d377521	Remove go_test from go_stateify and go_marshal They are no-ops, so the standard rule works fine. PiperOrigin-RevId: 268776264	2019-09-12 15:10:17 -07:00
Ghanan Gowripalan	857940d30d	Automated rollback of changelist 268047073 PiperOrigin-RevId: 268757842	2019-09-12 13:52:25 -07:00
Ian Gudger	9dfcd8b09f	Fix ephemeral port leak. Fix a bug where udp.(endpoint).Disconnect [accessible in gVisor via epsocket.(SocketOperations).Connect with AF_UNSPEC] would leak a port reservation if the socket/endpoint had an ephemeral port assigned to it. glibc's getaddrinfo uses connect with AF_UNSPEC, causing each call of getaddrinfo to leak a port. Call getaddrinfo too many times and you run out of ports (shows up as connect returning EAGAIN and getaddrinfo returning EAI_NONAME "Name or service not known"). PiperOrigin-RevId: 268071160	2019-09-09 14:02:00 -07:00
Ghanan Gowripalan	a8943325db	Join IPv6 all-nodes and solicited-node multicast addresses where appropriate. The IPv6 all-nodes multicast address will be joined on NIC enable, and the appropriate IPv6 solicited-node multicast address will be joined when IPv6 addresses are added. Tests: Test receiving packets destined to the IPv6 link-local all-nodes multicast address and the IPv6 solicited-node address of an added IPv6 address. PiperOrigin-RevId: 268047073	2019-09-09 12:06:06 -07:00
Ian Gudger	fe1f521077	Remove redundant global tcpip.LinkEndpointID. PiperOrigin-RevId: 267709597	2019-09-06 18:01:14 -07:00
Bhasker Hariharan	3dc3cffb2d	Fix RST generation bugs. There are a few cases addressed by this change - We no longer generate a RST in response to a RST packet. - When we receive a RST we cleanup and release all reservations immediately as the connection is now aborted. - An ACK received by a listening socket generates a RST when SYN cookies are not in-use. The only reason an ACK should land at the listening socket is if we are using SYN cookies otherwise the goroutine for the handshake in progress should have gotten the packet and it should never have arrived at the listening endpoint. - Also fixes the error returned when a connection times out due to a Keepalive timer expiration from ECONNRESET to a ETIMEDOUT. PiperOrigin-RevId: 267238427	2019-09-04 14:59:53 -07:00
Chris Kuiper	7bf1d426d5	Handle subnet and broadcast addresses correctly with NIC.subnets This also renames "subnet" to "addressRange" to avoid any more confusion with an interface IP's subnet. Lastly, this also removes the Stack.ContainsSubnet(..) API since it isn't used by anyone. Plus the same information can be obtained from Stack.NICAddressRanges(). PiperOrigin-RevId: 267229843	2019-09-04 14:19:32 -07:00
Ghanan Gowripalan	144127e5e1	Validate IPv6 Hop Limit field for received NDP packets Make sure that NDP packets are only received if their IP header's hop limit field is set to 255, as per RFC 4861. PiperOrigin-RevId: 267061457	2019-09-03 18:43:12 -07:00
Bhasker Hariharan	3789c34b22	Make UDP traceroute work. Adds support to generate Port Unreachable messages for UDP datagrams received on a port for which there is no valid endpoint. Fixes #703 PiperOrigin-RevId: 267034418	2019-09-03 16:01:17 -07:00
Haibo Xu	fa151e3971	Remove duplicated file in pkg/tcpip/link/rawfile. The blockingpoll_unsafe.go was copied to blockingpoll_noyield_unsafe.go during merging commit `7206202bb9`. If it stays there, it causes build errors on non-amd64 platforms. ERROR: pkg/tcpip/link/rawfile/BUILD:5:1: GoCompilePkg pkg/tcpip/link/rawfile.a failed (Exit 1) builder failed: error executing command bazel-out/host/bin/external/go_sdk/builder compilepkg -sdk external/go_sdk -installsuffix linux_arm64 -src pkg/tcpip/link/rawfile/blockingpoll_noyield_unsafe.go -src ... (remaining 33 argument(s) skipped) Use --sandbox_debug to see verbose messages from the sandbox compilepkg: error running subcommand: exit status 2 pkg/tcpip/link/rawfile/blockingpoll_yield_unsafe.go:35:6: BlockingPoll redeclared in this block previous declaration at pkg/tcpip/link/rawfile/blockingpoll_unsafe.go:26:78 Target //pkg/tcpip/link/rawfile:rawfile failed to build Use --verbose_failures to see the command lines of failed build steps. INFO: Elapsed time: 25.531s, Critical Path: 21.08s INFO: 262 processes: 262 linux-sandbox. FAILED: Build did NOT complete successfully Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: I4e21f82984225d0aa173de456f7a7c66053a053e	2019-09-02 02:49:41 +00:00
Chris Kuiper	afbdf2f212	Fix data race accessing referencedNetworkEndpoint.kind Wrapping "kind" into atomic access functions. Fixes #789 PiperOrigin-RevId: 266485501	2019-08-30 17:23:53 -07:00
Rahat Mahmood	863e11ac4d	Implement /proc/net/udp. PiperOrigin-RevId: 266229756	2019-08-29 14:30:41 -07:00
Tamir Duberstein	24ecce5dbf	Export generated linkAddrEntryEntry PiperOrigin-RevId: 266000128	2019-08-28 14:56:33 -07:00
Tamir Duberstein	313c767b00	Populate link address cache at dispatch This allows the stack to learn remote link addresses on incoming packets, reducing the need to ARP to send responses. This also reduces the number of round trips to the system clock, since that may also prove to be performance-sensitive. Fixes #739. PiperOrigin-RevId: 265815816	2019-08-27 18:54:56 -07:00
Rahat Mahmood	1fdefd41c5	netstack/tcp: Add LastAck transition. Add missing state transition to LastAck, which should happen when the endpoint has already received a FIN from the remote side, and is sending its own FIN. PiperOrigin-RevId: 265568314	2019-08-26 16:39:13 -07:00
gVisor bot	7206202bb9	Merge pull request #696 from xiaobo55x:tcpip_link PiperOrigin-RevId: 265534854	2019-08-26 14:03:30 -07:00
Chris Kuiper	ac2200b8a9	Prevent a network endpoint to send/rcv if its address was removed This addresses the problem where an endpoint has its address removed but still has outstanding references held by routes used in connected TCP/UDP sockets which prevent the removal of the endpoint. The fix adds a new "expired" flag to the referenced network endpoint, which is set when an endpoint has its address removed. Incoming packets are not delivered to an expired endpoint (unless in promiscuous mode), while sending outgoing packets triggers an error to the caller (unless in spoofing mode). In addition, a few helper functions were added to stack_test.go to reduce code duplications. PiperOrigin-RevId: 265514326	2019-08-26 12:29:47 -07:00
Tamir Duberstein	e75a12e89d	Implement fmt.Stringer on Route by value This is more convenient, since it implements the interface for both value and pointer. PiperOrigin-RevId: 265086510	2019-08-23 10:44:11 -07:00
Chris Kuiper	8d9276ed56	Support binding to multicast and broadcast addresses This fixes the issue of not being able to bind to either a multicast or broadcast address as well as to send and receive data from it. The way to solve this is to treat these addresses similar to the ANY address and register their transport endpoint ID with the global stack's demuxer rather than the NIC's. That way there is no need to require an endpoint with that multicast or broadcast address. The stack's demuxer is in fact the only correct one to use, because neither broadcast- nor multicast-bound sockets care which NIC a packet was received on (for multicast a join is still needed to receive packets on a NIC). I also took the liberty of refactoring udp_test.go to consolidate a lot of duplicate code and make it easier to create repetitive tests that test the same feature for a variety of packet and socket types. For this purpose I created a "flowType" that represents two things: 1) the type of packet being sent or received and 2) the type of socket used for the test. E.g., a "multicastV4in6" flow represents a V4-mapped multicast packet run through a V6-dual socket. This allows writing significantly simpler tests. A nice example is testTTL(). PiperOrigin-RevId: 264766909	2019-08-21 22:54:25 -07:00
Tamir Duberstein	573e6e4bba	Use tcpip.Subnet in tcpip.Route This is the first step in replacing some of the redundant types with the standard library equivalents. PiperOrigin-RevId: 264706552	2019-08-21 15:31:18 -07:00
Chris Kuiper	7e79ca0225	Add tcpip.Route.String and tcpip.AddressMask.Prefix PiperOrigin-RevId: 264544163	2019-08-20 23:28:52 -07:00
gVisor bot	3ffbdffd7e	Internal change. PiperOrigin-RevId: 264218306	2019-08-19 12:43:22 -07:00
Andrei Vagin	3e4102b2ea	netstack: disconnect a unix socket only if the address family is AF_UNSPEC Linux allows calling connect for ANY and the zero port. PiperOrigin-RevId: 263892534	2019-08-16 19:32:14 -07:00
Chris Kuiper	f7114e0a27	Add subnet checking to NIC.findEndpoint and consolidate with NIC.getRef This adds the same logic to NIC.findEndpoint that is already done in NIC.getRef. Since this makes the two functions very similar they were combined into one with the originals being wrappers. PiperOrigin-RevId: 263864708	2019-08-16 15:58:58 -07:00
Tamir Duberstein	fe74bba2bd	Don't dereference errors passed to panic() These errors are always pointers; there's no sense in dereferencing them in the panic call. Changed one false positive for clarity. PiperOrigin-RevId: 263611579	2019-08-15 11:58:16 -07:00
Tamir Duberstein	816a9211e9	netstack: move resumption logic into _state.go `13a98df` rearranged some of this code in a way that broke compilation of the netstack-only export at github.com/google/netstack because _state.go files are not included in that export. This commit moves resumption logic back into *_state.go, fixing the compilation breakage. PiperOrigin-RevId: 263601629	2019-08-15 11:13:46 -07:00
Haibo Xu	1b1e39d7a1	Enabling pkg/tcpip/link support on arm64. Signed-off-by: Haibo Xu haibo.xu@arm.com Change-Id: Ib6b4aa2db19032e58bf0395f714e6883caee460a	2019-08-15 03:19:30 +00:00
Haibo Xu	52843719ca	Rename fdbased/mmap.go to fdbased/mmap_stub.go. Signed-off-by: Haibo Xu haibo.xu@arm.com Change-Id: Id4489554b9caa332695df8793d361f8332f6a13b	2019-08-15 03:19:22 +00:00
Haibo Xu	0624858593	Rename rawfile/blockingpoll_unsafe.go to rawfile/blockingpoll_stub_unsafe.go. Signed-off-by: Haibo Xu haibo.xu@arm.com Change-Id: I2376e502c1a860d5e624c8a8e3afab5da4c53022	2019-08-15 03:19:14 +00:00
Tamir Duberstein	d81d94ac4c	Replace uintptr with int64 when returning lengths This is in accordance with newer parts of the standard library. PiperOrigin-RevId: 263449916	2019-08-14 16:05:56 -07:00
Tamir Duberstein	69d1414a32	Add tcpip.AddressWithPrefix.String PiperOrigin-RevId: 263436592	2019-08-14 15:02:14 -07:00
Bhasker Hariharan	570fb1db6b	Improve SendMsg performance. SendMsg before this change would copy all the data over into a new slice even if the underlying socket could only accept a small amount of data. This is really inefficient with non-blocking sockets and under high throughput where large writes could get ErrWouldBlock or if there was say a timeout associated with the sendmsg() syscall. With this change we delay copying bytes in till they are needed and only copy what can be potentially sent/held in the socket buffer. Reducing the need to repeatedly copy data over. Also a minor fix to change state FIN-WAIT-1 when shutdown(..., SHUT_WR) is called instead of when we transmit the actual FIN. Otherwise the socket could remain in CONNECTED state even though the user has called shutdown() on the socket. Updates #627 PiperOrigin-RevId: 263430505	2019-08-14 14:34:27 -07:00
Ian Gudger	99bf75a6dc	gonet: Replace NewPacketConn with DialUDP. This better matches the standard library and allows creating connected PacketConns. PiperOrigin-RevId: 263187462	2019-08-13 12:11:09 -07:00
Ian Gudger	eac690e358	Fix netstack build error on non-AMD64. This stub had the wrong function signature. PiperOrigin-RevId: 262992682	2019-08-12 13:31:16 -07:00
Bhasker Hariharan	5a38eb120a	Add congestion control states to sender. This change just introduces different congestion control states and ensures the sender.state is updated to reflect the current state of the connection. It is not used for any decisions yet but this is required before algorithms like Eiffel/PRR can be implemented. Fixes #394 PiperOrigin-RevId: 262638292	2019-08-09 14:50:30 -07:00
Rahat Mahmood	13a98df49e	netstack: Don't start endpoint goroutines too soon on restore. Endpoint protocol goroutines were previously started as part of loading the endpoint. This is potentially too soon, as resources used by these goroutine may not have been loaded. Protocol goroutines may perform meaningful work as soon as they're started (ex: incoming connect) which can cause them to indirectly access resources that haven't been loaded yet. This CL defers resuming all protocol goroutines until the end of restore. PiperOrigin-RevId: 262409429	2019-08-08 12:33:11 -07:00
Tamir Duberstein	67a3f4039d	Set target address in ARP Reply PiperOrigin-RevId: 262163794	2019-08-07 10:27:43 -07:00
Bhasker Hariharan	dfbc0b0a4c	Fix for a panic due to writing to a closed accept channel. This can happen because endpoint.Close() closes the accept channel first and then drains/resets any accepted but not delivered connections. But there can be connections that are connected but not delivered to the channel as the channel was full. But closing the channel can cause these writes to fail with a write to a closed channel. The correct solution is to abort any connections in SYN-RCVD state and drain/abort all completed connections before closing the accept channel. PiperOrigin-RevId: 261951132	2019-08-06 11:01:27 -07:00
Kevin Krakauer	810cc07aab	Plumbing for iptables sockopts. PiperOrigin-RevId: 261413396	2019-08-02 16:26:48 -07:00
Rahat Mahmood	2906dffcdb	Automated rollback of changelist 261191548 PiperOrigin-RevId: 261373749	2019-08-02 12:52:40 -07:00
Rahat Mahmood	79511e8a50	Implement getsockopt(TCP_INFO). Export some readily-available fields for TCP_INFO and stub out the rest. PiperOrigin-RevId: 261191548	2019-08-01 13:58:48 -07:00
Austin Kiekintveld	12c4eb294a	Fix ICMPv4 EchoReply packet checksum The checksum was not being reset before being re-calculated and sent out. This caused the sent checksum to always be `0x0800`. Fixes #605. PiperOrigin-RevId: 260965059	2019-07-31 11:26:41 -07:00
Tamir Duberstein	c6e6d92cb1	Test connecting UDP sockets to the ANY address This doesn't currently pass on gVisor. While I'm here, fix a bug where connecting to the v6-mapped v4 address doesn't work in gVisor. PiperOrigin-RevId: 260923961	2019-07-31 07:41:20 -07:00
Tamir Duberstein	7369c63e42	Pass ProtocolAddress instead of its fields PiperOrigin-RevId: 260803517	2019-07-30 15:06:39 -07:00
Haibo Xu	1decf76471	Change syscall.POLL to syscall.PPOLL. syscall.POLL is not supported on arm64, using syscall.PPOLL to support both the x86 and arm64. refs #63 Signed-off-by: Haibo Xu <haibo.xu@arm.com> Change-Id: I2c81a063d3ec4e7e6b38fe62f17a0924977f505e COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/543 from xiaobo55x:master ba598263fd3748d1addd48e4194080aa12085164 PiperOrigin-RevId: 260752049	2019-07-30 11:01:29 -07:00
Chris Kuiper	40e682759f	Add support for a subnet prefix length on interface network addresses This allows the user code to add a network address with a subnet prefix length. The prefix length value is stored in the network endpoint and provided back to the user in the ProtocolAddress type. PiperOrigin-RevId: 259807693	2019-07-24 13:42:14 -07:00
Tamir Duberstein	12c256568b	Deduplicate EndpointState.connected some This fixes a bug introduced in cl/251934850 that caused connect-accept-close-connect races to result in the second connect call failing when it should have succeeded. PiperOrigin-RevId: 259584525	2019-07-23 12:10:18 -07:00
Chris Kuiper	0e040ba6e8	Handle interfaceAddr and NIC options separately for IP_MULTICAST_IF This tweaks the handling code for IP_MULTICAST_IF to ignore the InterfaceAddr if a NICID is given. PiperOrigin-RevId: 258982541	2019-07-19 09:29:04 -07:00
Andrei Vagin	eefa817cfd	net/tcp/setsockopt: implement setsockopt(fd, SOL_TCP, TCP_INQ) PiperOrigin-RevId: 258859507	2019-07-18 15:41:04 -07:00
gVisor bot	74dc663bbb	Internal change. PiperOrigin-RevId: 258424489	2019-07-16 13:03:37 -07:00
Kevin Krakauer	9b4d3280e1	Add IPPROTO_RAW, which allows raw sockets to write IP headers. iptables also relies on IPPROTO_RAW in a way. It opens such a socket to manipulate the kernel's tables, but it doesn't actually use any of the functionality. Blegh. PiperOrigin-RevId: 257903078	2019-07-12 18:09:12 -07:00
Tamir Duberstein	17bab652af	Check that IP headers contain correct version PiperOrigin-RevId: 257888338	2019-07-12 16:19:18 -07:00
Bhasker Hariharan	6116473b2f	Stub out support for TCP_MAXSEG. Adds support to set/get the TCP_MAXSEG value but does not really change the segment sizes emitted by netstack or alter the MSS advertised by the endpoint. This is currently being added only to unblock iperf3 on gVisor. Plumbing this correctly requires a bit more work which will come in separate CLs. PiperOrigin-RevId: 257859112	2019-07-12 13:35:17 -07:00
Andrei Vagin	116cac053e	netstack/udp: connect with the AF_UNSPEC address family means disconnect PiperOrigin-RevId: 256433283	2019-07-03 14:19:02 -07:00
gVisor bot	d60ae0ddee	Merge pull request #279 from kevinGC:iptables-1-pkg PiperOrigin-RevId: 256231055	2019-07-02 13:48:06 -07:00
Michael Pratt	5b41ba5d0e	Fix various spelling issues in the documentation Addresses obvious typos, in the documentation only. COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65 PiperOrigin-RevId: 255477779	2019-06-27 14:25:50 -07:00
Bhasker Hariharan	c1761378a9	Fix the logic for sending zero window updates. Today we have the logic split in two places between endpoint Read() and the worker goroutine which actually sends a zero window. This change makes it so that when a zero window ACK is sent we set a flag in the endpoint which can be read by the endpoint to decide if it should notify the worker to send a nonZeroWindow update. The worker now does not do the check again but instead sends an ACK and flips the flag right away. Similarly today when SO_RECVBUF is set the SetSockOpt call has logic to decide if a zero window update is required. Rather than do that we move the logic to the worker goroutine and it can check the zeroWindow flag and send an update if required. PiperOrigin-RevId: 254505447	2019-06-21 18:31:31 -07:00
Brad Burlage	ae4ef32b8c	Deflake TestSimpleReceive failures due to timeouts This test will occasionally fail waiting to read a packet. From repeated runs, I've seen it up to 1.5s for waitForPackets to complete. PiperOrigin-RevId: 254484627	2019-06-21 15:56:12 -07:00
Bhasker Hariharan	3d71c627fa	Add support for TCP receive buffer auto tuning. The implementation is similar to linux where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accommodate the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the linux receive buffer auto-tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283	2019-06-13 22:28:01 -07:00
Adin Scannell	add40fd6ad	Update canonical repository. This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620	2019-06-13 16:50:15 -07:00
Adin Scannell	e352f46478	Minor BUILD file cleanup. PiperOrigin-RevId: 252918338	2019-06-12 15:59:46 -07:00
Kevin Krakauer	0bbbcafd68	Merge branch 'master' into iptables-1-pkg Change-Id: I7457a11de4725e1bf3811420c505d225b1cb6943	2019-06-12 15:21:22 -07:00
Bhasker Hariharan	70578806e8	Add support for TCP_CONGESTION socket option. This CL also cleans up the error returned for setting congestion control which was incorrectly returning EINVAL instead of ENOENT. PiperOrigin-RevId: 252889093	2019-06-12 13:35:50 -07:00
Bhasker Hariharan	3933dd5c04	Fixes to listen backlog handling. Changes netstack to conform to current linux behaviour where if the backlog is full then we drop the SYN and do not send a SYN-ACK. Similarly we allow up to backlog connections to be in SYN-RCVD state as long as the backlog is not full. We also now drop a SYN if syn cookies are in use and the backlog for the listening endpoint is full. Added new tests to confirm the behaviour. Also reverted the change to increase the backlog in TcpPortReuseMultiThread syscall test. Fixes #236 PiperOrigin-RevId: 252500462	2019-06-10 15:40:44 -07:00
Kevin Krakauer	06a83df533	Address more comments. Change-Id: I83ae1079f3dcba6b018f59ab7898decab5c211d2	2019-06-10 12:43:54 -07:00
Kevin Krakauer	8afbd974da	Address Ian's comments. Change-Id: I7445033b1970cbba3f2ed0682fe520dce02d8fad	2019-06-07 12:54:53 -07:00
Rahat Mahmood	2d2831e354	Track and export socket state. This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850	2019-06-06 15:04:47 -07:00
Bhasker Hariharan	85be01b42d	Add multi-fd support to fdbased endpoint. This allows an fdbased endpoint to have multiple underlying fd's from which packets can be read and dispatched/written to. This should allow for higher throughput as well as better scalability of the network stack as number of connections increases. Updates #231 PiperOrigin-RevId: 251852825	2019-06-06 08:07:02 -07:00
Andrei Vagin	79f7cb6c1c	netstack/sniffer: log GSO attributes PiperOrigin-RevId: 251788534	2019-06-05 22:51:53 -07:00
Andrei Vagin	a12848ffeb	netstack/tcp: fix calculating a number of outstanding packets In case of GSO, a segment can contain more than one packet and we need to use the pCount() helper to get a number of packets. PiperOrigin-RevId: 251743020	2019-06-05 16:30:45 -07:00
Chris Kuiper	d18bb4f38a	Adjust route when looping multicast packets Multicast packets are special in that their destination address does not identify a specific interface. When sending out such a packet the multicast address is the remote address, but for incoming packets it is the local address. Hence, when looping a multicast packet, the route needs to be tweaked to reflect this. PiperOrigin-RevId: 251739298	2019-06-05 16:08:29 -07:00
Bhasker Hariharan	e0fb921205	Fix data race in synRcvdState. When checking the length of the acceptedChan we should hold the endpoint mutex otherwise a syn received while the listening socket is being closed can result in a data race where the cleanupLocked routine sets acceptedChan to nil while a handshake goroutine in progress could try and check it at the same time. PiperOrigin-RevId: 251537697	2019-06-04 16:17:24 -07:00
Bhasker Hariharan	bfe3220992	Delete debug log lines left by mistake. Updates #236 PiperOrigin-RevId: 251337915	2019-06-03 17:00:18 -07:00
Bhasker Hariharan	3577a4f691	Disable certain tests that are flaky under race detector. PiperOrigin-RevId: 250976665	2019-05-31 16:19:49 -07:00
Bhasker Hariharan	033f96cc93	Change segment queue limit to be of fixed size. Netstack sets the unprocessed segment queue size to match the receive buffer size. This is not required as this queue only needs to hold enough for a short duration before the endpoint goroutine can process it. Updates #230 PiperOrigin-RevId: 250976323	2019-05-31 16:17:33 -07:00
Kevin Krakauer	d58eb9ce82	Add basic iptables structures to netstack. Change-Id: Ib589906175a59dae315405a28f2d7f525ff8877f	2019-05-31 16:14:04 -07:00
Fabricio Voznika	38de91b028	Add build guard to files using go:linkname Function signatures are not validated during compilation. Since they are not exported, they can change at any time. The guard ensures that they are verified at least on every version upgrade. PiperOrigin-RevId: 250733742	2019-05-30 12:09:39 -07:00
Bhasker Hariharan	ae26b2c425	Fixes to TCP listen behavior. Netstack listen loop can get stuck if cookies are in-use and the app is slow to accept incoming connections. Further we continue to complete handshake for a connection even if the backlog is full. This creates a problem when a lot of connections come in rapidly and we end up with lots of completed connections just hanging around to be delivered. These fixes change netstack behaviour to mirror what linux does as described here in the following article http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html Now when cookies are not in-use Netstack will silently drop the ACK to a SYN-ACK and not complete the handshake if the backlog is full. This will result in the connection staying in a half-complete state. Eventually the sender will retransmit the ACK and if backlog has space we will transition to a connected state and deliver the endpoint. Similarly when cookies are in use we do not try and create an endpoint unless there is space in the accept queue to accept the newly created endpoint. If there is no space then we again silently drop the ACK as we can just recreate it when the ACK is retransmitted by the peer. We also now use the backlog to cap the size of the SYN-RCVD queue for a given endpoint. So at any time there can be N connections in the backlog and N in a SYN-RCVD state if the application is not accepting connections. Any new SYNs will be dropped. This CL also fixes another small bug where we mark a new endpoint which has not completed handshake as connected. We should wait till handshake successfully completes before marking it connected. Updates #236 PiperOrigin-RevId: 250717817	2019-05-30 12:08:41 -07:00
Tamir Duberstein	e4b395db49	Remove unused wakers These wakers are uselessly allocated and passed around; nothing ever listens for notifications on them. The code here appears to be vestigial, so removing it and allowing a nil waker to be passed seems appropriate. PiperOrigin-RevId: 249879320 Change-Id: Icd209fb77cc0dd4e5c49d7a9f2adc32bf88b4b71	2019-05-24 12:29:14 -07:00
Kevin Krakauer	c1cdf18e7b	UDP and TCP raw socket support. PiperOrigin-RevId: 249511348 Change-Id: I34539092cc85032d9473ff4dd308fc29dc9bfd6b	2019-05-22 13:45:15 -07:00
Bhasker Hariharan	2ac0aeeb42	Refactor fdbased endpoint dispatcher code. This is in preparation to support an fdbased endpoint that can read/dispatch packets from multiple underlying fds. Updates #231 PiperOrigin-RevId: 249337074 Change-Id: Id7d375186cffcf55ae5e38986e7d605a96916d35	2019-05-21 15:24:25 -07:00
Nicolas Lacasse	bfd9f75ba4	Set the FilesystemType in MountSource from the Filesystem. And stop storing the Filesystem in the MountSource. This allows us to decouple the MountSource filesystem type from the name of the filesystem. PiperOrigin-RevId: 247292982 Change-Id: I49cbcce3c17883b7aa918ba76203dfd6d1b03cc8	2019-05-08 14:35:06 -07:00
Googler	cbf6ab9697	Check GSO for nil in WritePacket Testing: Unit tests added PiperOrigin-RevId: 247096269 Change-Id: I849c010eadcb53caf45896a15ef38162d66a9568	2019-05-07 14:57:03 -07:00
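
Commit 61f6fbd0ce in the list above describes choosing TCP ephemeral ports from a hash of the connection tuple plus a hint that advances by 2 on each call. The following is a minimal sketch of that idea and is not the netstack code: the name `pickEphemeralPort`, the `isFree` callback, the use of FNV as the hash, and the range starting at 16000 are all assumptions made for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// Hypothetical ephemeral range; the commit log does not state the real bounds.
const (
	firstEphemeral = 16000
	numEphemeral   = 65535 - firstEphemeral + 1
)

// portHint stands in for the static hint from the commit message.
// It advances by 2 on every call. Not goroutine-safe; this is only a sketch.
var portHint uint32

// pickEphemeralPort hashes (srcIP, dstIP, dstPort, secret), adds the hint,
// and scans the whole range once from that offset for a free port.
func pickEphemeralPort(srcIP, dstIP []byte, dstPort uint16, secret uint32, isFree func(uint16) bool) (uint16, bool) {
	h := fnv.New32a() // assumed hash; the commit does not name one
	h.Write(srcIP)
	h.Write(dstIP)
	var tail [6]byte
	binary.BigEndian.PutUint16(tail[:2], dstPort)
	binary.BigEndian.PutUint32(tail[2:], secret)
	h.Write(tail[:])

	// Reduce the offset into the range first so start+i cannot overflow.
	start := (h.Sum32() + portHint) % numEphemeral
	portHint += 2

	for i := uint32(0); i < numEphemeral; i++ {
		port := uint16(firstEphemeral + (start+i)%numEphemeral)
		if isFree(port) {
			return port, true
		}
	}
	return 0, false
}

func main() {
	src := []byte{10, 0, 0, 1}
	dst := []byte{10, 0, 0, 2}
	allFree := func(uint16) bool { return true }
	// Repeated connects to the same destination walk the port space in order.
	for i := 0; i < 3; i++ {
		port, ok := pickEphemeralPort(src, dst, 443, 0x5eed, allFree)
		fmt.Println(port, ok)
	}
}
```

Because the hash is stable for a given destination and only the hint changes, successive calls start their search a little further along the range, which is what reduces the chance of reusing a recently released port.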


348 Commits