gvisor

Commit Graph

Author	SHA1	Message	Date
Bhasker Hariharan	3d71c627fa	Add support for TCP receive buffer auto tuning. The implementation is similar to linux where we track the number of bytes consumed by the application to grow the receive buffer of a given TCP endpoint. This ensures that the advertised window grows at a reasonable rate to accomodate for the sender's rate and prevents large amounts of data being held in stack buffers if the application is not actively reading or not reading fast enough. The original paper that was used to implement the linux receive buffer auto- tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf NOTE: Linux does not implement DRS as defined in that paper, it's just a good reference to understand the solution space. Updates #230 PiperOrigin-RevId: 253168283	2019-06-13 22:28:01 -07:00
Adin Scannell	add40fd6ad	Update canonical repository. This can be merged after: https://github.com/google/gvisor-website/pull/77 or https://github.com/google/gvisor-website/pull/78 PiperOrigin-RevId: 253132620	2019-06-13 16:50:15 -07:00
Bhasker Hariharan	3933dd5c04	Fixes to listen backlog handling. Changes netstack to confirm to current linux behaviour where if the backlog is full then we drop the SYN and do not send a SYN-ACK. Similarly we allow upto backlog connections to be in SYN-RCVD state as long as the backlog is not full. We also now drop a SYN if syn cookies are in use and the backlog for the listening endpoint is full. Added new tests to confirm the behaviour. Also reverted the change to increase the backlog in TcpPortReuseMultiThread syscall test. Fixes #236 PiperOrigin-RevId: 252500462	2019-06-10 15:40:44 -07:00
Rahat Mahmood	2d2831e354	Track and export socket state. This is necessary for implementing network diagnostic interfaces like /proc/net/{tcp,udp,unix} and sock_diag(7). For pass-through endpoints such as hostinet, we obtain the socket state from the backend. For netstack, we add explicit tracking of TCP states. PiperOrigin-RevId: 251934850	2019-06-06 15:04:47 -07:00
Bhasker Hariharan	e0fb921205	Fix data race in synRcvdState. When checking the length of the acceptedChan we should hold the endpoint mutex otherwise a syn received while the listening socket is being closed can result in a data race where the cleanupLocked routine sets acceptedChan to nil while a handshake goroutine in progress could try and check it at the same time. PiperOrigin-RevId: 251537697	2019-06-04 16:17:24 -07:00
Bhasker Hariharan	ae26b2c425	Fixes to TCP listen behavior. Netstack listen loop can get stuck if cookies are in-use and the app is slow to accept incoming connections. Further we continue to complete handshake for a connection even if the backlog is full. This creates a problem when a lots of connections come in rapidly and we end up with lots of completed connections just hanging around to be delivered. These fixes change netstack behaviour to mirror what linux does as described here in the following article http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html Now when cookies are not in-use Netstack will silently drop the ACK to a SYN-ACK and not complete the handshake if the backlog is full. This will result in the connection staying in a half-complete state. Eventually the sender will retransmit the ACK and if backlog has space we will transition to a connected state and deliver the endpoint. Similarly when cookies are in use we do not try and create an endpoint unless there is space in the accept queue to accept the newly created endpoint. If there is no space then we again silently drop the ACK as we can just recreate it when the ACK is retransmitted by the peer. We also now use the backlog to cap the size of the SYN-RCVD queue for a given endpoint. So at any time there can be N connections in the backlog and N in a SYN-RCVD state if the application is not accepting connections. Any new SYNs will be dropped. This CL also fixes another small bug where we mark a new endpoint which has not completed handshake as connected. We should wait till handshake successfully completes before marking it connected. Updates #236 PiperOrigin-RevId: 250717817	2019-05-30 12:08:41 -07:00
Bhasker Hariharan	458fe955a7	Implement support for SACK based recovery(RFC 6675). PiperOrigin-RevId: 246536003 Change-Id: I118b745f45040be9c70cb6a1028acdb06c78d8c9	2019-05-03 10:51:18 -07:00
Michael Pratt	4d52a55201	Change copyright notice to "The gVisor Authors" Based on the guidelines at https://opensource.google.com/docs/releasing/authors/. 1. $ rg -l "Google LLC" \| xargs sed -i 's/Google LLC.*/The gVisor Authors./' 2. Manual fixup of "Google Inc" references. 3. Add AUTHORS file. Authors may request to be added to this file. 4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS. Fixes #209 PiperOrigin-RevId: 245823212 Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9	2019-04-29 14:26:23 -07:00
Bhasker Hariharan	eaac2806ff	Add TCP checksum verification. PiperOrigin-RevId: 242704699 Change-Id: I87db368ca343b3b4bf4f969b17d3aa4ce2f8bd4f	2019-04-09 11:23:47 -07:00
Andrei Vagin	f4105ac21a	netstack/fdbased: add generic segmentation offload (GSO) support The linux packet socket can handle GSO packets, so we can segment packets to 64K instead of the MTU which is usually 1500. Here are numbers for the nginx-1m test: runsc: 579330.01 [Kbytes/sec] received runsc-gso: 1794121.66 [Kbytes/sec] received runc: 2122139.06 [Kbytes/sec] received and for tcp_benchmark: $ tcp_benchmark --duration 15 --ideal [ 4] 0.0-15.0 sec 86647 MBytes 48456 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal [ 4] 0.0-15.0 sec 2173 MBytes 1214 Mbits/sec $ tcp_benchmark --client --duration 15 --ideal --gso 65536 [ 4] 0.0-15.0 sec 19357 MBytes 10825 Mbits/sec PiperOrigin-RevId: 240809103 Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a	2019-03-28 11:03:41 -07:00
Andrei Vagin	654e878abb	netstack: Don't exclude length when a pseudo-header checksum is calculated This is a preparation for GSO changes (cl/234508902). RELNOTES[gofers]: Refactor checksum code to include length, which it already did, but in a convoluted way. Should be a no-op. PiperOrigin-RevId: 240460794 Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2	2019-03-26 17:15:13 -07:00
Tamir Duberstein	5496be7c5d	Remove duplicate TCP flag definitions PiperOrigin-RevId: 238467634 Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9	2019-03-14 10:19:21 -07:00
Bhasker Hariharan	efe5e737d7	Do not drop packets w/ missing TCP timestamps. RFC7323 recommends that if the timestamp option was negotiated then all packets should carry a TCP Timestamp and any packets that do not should be dropped. Netstack implemented this behaviour. Linux OTOH does not and will accept such packets. This change makes Netstack behaviour compatible with Linux. Also now that we allow such packets, we do need to update RTO calculations based on these packets even if timestamp option is enabled. PiperOrigin-RevId: 233432268 Change-Id: I9f4742ae6b63930ac3b5e37d8c238761e6a4b29f	2019-02-11 10:23:43 -08:00
Ian Gudger	6253d32cc9	transport/tcp: remove unused error return values PiperOrigin-RevId: 225421480 Change-Id: I1e9259b0b7e8490164e830b73338a615129c7f0e	2018-12-13 13:02:49 -08:00
Zhaozhong Ni	fda4557e3d	sentry: skip waiting for undrain for netstack TCP endpoints in error state. PiperOrigin-RevId: 224214981 Change-Id: I4c1dd5b1c856f7a4f9866a5dda44a5297e92486a	2018-12-05 13:51:16 -08:00
Ian Gudger	37cbce1f91	Merge segments in sender's writeList PiperOrigin-RevId: 220185891 Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b	2018-11-05 15:39:30 -08:00
Ian Gudger	59b7766af7	Fix a race where keepalives could be sent while there is pending data PiperOrigin-RevId: 219571556 Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6	2018-10-31 18:42:44 -07:00
Ian Gudger	8fce67af24	Use correct company name in copyright header PiperOrigin-RevId: 217951017 Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035	2018-10-19 16:35:11 -07:00
Sepehr Raissian	c17ea8c6e2	Block for link address resolution Previously, if address resolution for UDP or Ping sockets required sending packets using Write in Transport layer, Resolve would return ErrWouldBlock and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution would run in background. Further calls to Write using same address would also return ErrNoLinkAddress until resolution has been completed successfully. Since Write is not allowed to block and System Calls need to be interruptible in System Call layer, the caller to Write is responsible for blocking upon return of ErrWouldBlock. Now, when startAddressResolution is called a notification channel for the completion of the address resolution is returned. The channel will traverse up to the calling function of Write as well as ErrNoLinkAddress. Once address resolution is complete (success or not) the channel is closed. The caller would call Write again to send packets and check if address resolution was compeleted successfully or not. Fixes google/gvisor#5 Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93 PiperOrigin-RevId: 214962200	2018-09-28 11:00:16 -07:00
Tamir Duberstein	d7a05b4e63	Pass buffer.Prependable by value PiperOrigin-RevId: 213053370 Change-Id: I60ea89572b4fca53fd126c870fcbde74fcf52562	2018-09-14 15:23:58 -07:00
Tamir Duberstein	5adb3468d4	Add multicast support PiperOrigin-RevId: 212750821 Change-Id: I822fd63e48c684b45fd91f9ce057867b7eceb792	2018-09-12 20:39:24 -07:00
Bert Muthalaly	5685d6b5ad	Update {LinkEndpoint,NetworkEndpoint}#WritePacket to take a VectorisedView Makes it possible to avoid copying or allocating in cases where DeliverNetworkPacket (rx) needs to turn around and call WritePacket (tx) with its VectorisedView. Also removes the restriction on having VectorisedViews with multiple views in the write path. PiperOrigin-RevId: 211728717 Change-Id: Ie03a65ecb4e28bd15ebdb9c69f05eced18fdfcff	2018-09-05 17:34:25 -07:00
Tamir Duberstein	bc5e18c9d1	Implement TCP keepalives PiperOrigin-RevId: 211670620 Change-Id: Ia8a3d8ae53a7fece1dee08ee9c74964bd7f71bb7	2018-09-05 11:48:23 -07:00
Tamir Duberstein	0923bcf06b	Add various statistics PiperOrigin-RevId: 210442599 Change-Id: I9498351f461dc69c77b7f815d526c5693bec8e4a	2018-08-27 15:29:55 -07:00
Ian Gudger	abe7764928	Encapsulate netstack metrics PiperOrigin-RevId: 209943212 Change-Id: I96dcbc7c2ab2426e510b94a564436505256c5c79	2018-08-23 08:55:23 -07:00
Zhaozhong Ni	0a55f8c1c1	netstack: support disconnect-on-save option per fdbased link. PiperOrigin-RevId: 206659972 Change-Id: I5e0e035f97743b6525ad36bed2c802791609beaf	2018-07-30 15:43:25 -07:00
Zhaozhong Ni	cc34a90fb4	netstack: do not defer panicable logic in tcp main loop. PiperOrigin-RevId: 204355026 Change-Id: I1a8229879ea3b58aa861a4eb4456fd7aff99863d	2018-07-12 13:39:28 -07:00
Zhaozhong Ni	b1683df90b	netstack: tcp socket connected state S/R support. PiperOrigin-RevId: 203958972 Change-Id: Ia6fe16547539296d48e2c6731edacdd96bd6e93c	2018-07-10 09:23:35 -07:00
Brian Geffon	da9b5153f2	Fix two race conditions in tcp stack. PiperOrigin-RevId: 203880278 Change-Id: I66b790a616de59142859cc12db4781b57ea626d3	2018-07-09 20:48:27 -07:00
Nicolas Lacasse	bf0fa09537	Switch netstack licenses to Apache 2.0. Fixes #27 PiperOrigin-RevId: 203825288 Change-Id: Ie9f3a2b2c1e296b026b024f75c07da1a7e118633	2018-07-09 14:04:40 -07:00
Brian Geffon	23f49097c7	Panic in netstack during cleanup where a FIN becomes a RST. There is a subtle bug where during cleanup with unread data a FIN can be converted to a RST, at that point the entire connection should be aborted as we're not expecting any ACKs to the RST. PiperOrigin-RevId: 202691271 Change-Id: Idae70800208ca26e07a379bc6b2b8090805d0a22	2018-06-29 12:40:26 -07:00
Brian Geffon	51c1e510ab	Automated rollback of changelist 201596247 PiperOrigin-RevId: 202151720 Change-Id: I0491172c436bbb32b977f557953ba0bc41cfe299	2018-06-26 10:33:24 -07:00
Zhaozhong Ni	0e434b66a6	netstack: tcp socket connected state S/R support. PiperOrigin-RevId: 201596247 Change-Id: Id22f47b2cdcbe14aa0d930f7807ba75f91a56724	2018-06-21 15:19:45 -07:00
Michael Pratt	bd2d1aaa16	Replace crypto/rand with internal rand package PiperOrigin-RevId: 200784607 Change-Id: I39aa6ee632936dcbb00fc298adccffa606e9f4c0	2018-06-15 15:36:00 -07:00
Zhaozhong Ni	343020ca27	netstack: make TCP endpoint closed and error state cleanup work synchronous. So that when saving TCP endpoint in these states, there is no pending or background activities. Also lift tcp network save rejection error to tcpip package. PiperOrigin-RevId: 199370748 Change-Id: Ief7b45c2a7338d12414cd7c23db95de6a9c22700	2018-06-05 15:44:38 -07:00
Fabricio Voznika	c5dc873e44	Automated rollback of changelist 196886839 PiperOrigin-RevId: 198457660 Change-Id: I6ea5cf0b4cfe2b5ba455325a7e5299880e5a088a	2018-05-29 14:24:07 -07:00
Brian Geffon	a8b90a7158	Poll should wake up on ECONNREFUSED with no mask. Today poll will not wake up on a ECONNREFUSED if no poll mask is specified, which is equivalent to POLLHUP \| POLLERR which are implicitly added during the poll syscall. PiperOrigin-RevId: 197967183 Change-Id: I668d0730c33701228913f2d0843b48491b642efb	2018-05-24 15:46:50 -07:00
Ian Gudger	02ad0dc3d9	Fix typo in TCP transport PiperOrigin-RevId: 197789418 Change-Id: I86b1574c8d3b8b321348d9b101ffaef7aa15f722	2018-05-23 14:28:43 -07:00
Zhaozhong Ni	5b4c20e1b8	netstack: make TCP endpoint closed and error state cleanup work synchronous. So that when saving TCP endpoint in these states, there is no pending or background activities. Also lift tcp network save rejection error to tcpip package. PiperOrigin-RevId: 196886839 Change-Id: I0fe73750f2743ec7e62d139eb2cec758c5dd6698	2018-05-16 14:15:24 -07:00
Zhaozhong Ni	987f7841a6	netstack: TCP connecting state endpoint save / restore support. PiperOrigin-RevId: 196325647 Change-Id: I850eb4a29b9c679da4db10eb164bbdf967690663	2018-05-11 16:28:39 -07:00
Googler	d02b74a5dc	Check in gVisor. PiperOrigin-RevId: 194583126 Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463	2018-04-28 01:44:26 -04:00

41 Commits