Commit Graph

31 Commits

Author SHA1 Message Date
Andrei Vagin f4105ac21a netstack/fdbased: add generic segmentation offload (GSO) support
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-28 11:03:41 -07:00
Andrei Vagin 9f4e1cb797 netstack: adjust the sequence number after trimming the packet
PiperOrigin-RevId: 239417224
Change-Id: I14a9adc31a6330a79a6156c105969cd5f1f63d20
2019-03-20 09:58:10 -07:00
Andrei Vagin 87cce0ec08 netstack: reduce MSS from SYN to account tcp options
See: https://tools.ietf.org/html/rfc6691#section-2
PiperOrigin-RevId: 239305632
Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-19 17:33:20 -07:00
Tamir Duberstein 5496be7c5d Remove duplicate TCP flag definitions
PiperOrigin-RevId: 238467634
Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9
2019-03-14 10:19:21 -07:00
Bhasker Hariharan 1718fdd1a8 Add new retransmissions and recovery related metrics.
PiperOrigin-RevId: 236945145
Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b
2019-03-05 16:41:44 -08:00
Bhasker Hariharan 26be25e4ec Add a SACK scoreboard to TCP endpoints.
This change does not make use of SACK information but adds support to track
SACK information and store it in the endpoint.

The actual SACK based recovery will be in a separate CL.

Part of commits to add RFC 6675 support to Netstack.

PiperOrigin-RevId: 235612264
Change-Id: I261f94844d7bad5abda803152ce6cc6125a467ff
2019-02-25 15:20:04 -08:00
Bhasker Hariharan efe5e737d7 Do not drop packets w/ missing TCP timestamps.
RFC7323 recommends that if the timestamp option was negotiated
then all packets should carry a TCP Timestamp and any packets that
do not should be dropped.

Netstack implemented this behaviour. Linux OTOH does not and will
accept such packets. This change makes Netstack behaviour compatible
with Linux.

Also now that we allow such packets, we do need to update RTO calculations
based on these packets even if timestamp option is enabled.

PiperOrigin-RevId: 233432268
Change-Id: I9f4742ae6b63930ac3b5e37d8c238761e6a4b29f
2019-02-11 10:23:43 -08:00
Ian Gudger 8cbd6153a6 Fix available calculation when merging TCP segments
PiperOrigin-RevId: 224033418
Change-Id: I780be973e8be68ac93e8c9e7a100002e912f40d2
2018-12-04 13:15:25 -08:00
Ian Gudger b5e91eaa52 Clean up tcp.sendData
PiperOrigin-RevId: 221484739
Change-Id: I44c71f79f99d0d00a2e70a7f06d7024a62a5de0a
2018-11-14 11:58:41 -08:00
Ian Gudger 7f60294a73 Implement TCP_NODELAY and TCP_CORK
Previously, TCP_NODELAY was always enabled and we would lie about it being
configurable. TCP_NODELAY is now disabled by default (to match Linux) in the
socket layer so that non-gVisor users don't automatically start using this
questionable optimization.

PiperOrigin-RevId: 221368472
Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-11-13 18:02:43 -08:00
Ian Gudger c22da3e705 Remove obsolete TODO
PiperOrigin-RevId: 221117846
Change-Id: I2a43fd8135b1d1194ff81e98644ce6b6182ece50
2018-11-12 10:45:19 -08:00
Ian Gudger 37cbce1f91 Merge segments in sender's writeList
PiperOrigin-RevId: 220185891
Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
2018-11-05 15:39:30 -08:00
Ian Gudger 59b7766af7 Fix a race where keepalives could be sent while there is pending data
PiperOrigin-RevId: 219571556
Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6
2018-10-31 18:42:44 -07:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Bhasker Hariharan bd12e95247 Fix RTT estimation when timestamp option is enabled.
From RFC7323#Section-4

The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an
implicit assumption that at most one RTTM will be sampled per RTT.  When
multiple RTTMs per RTT are available to update the RTT estimator, an
implementation SHOULD try to adhere to the spirit of the history specified in
[RFC6298].  An implementation suggestion is detailed in Appendix G.

From RFC7323#appendix-G
Appendix G.  RTO Calculation Modification

   Taking multiple RTT samples per window would shorten the history calculated
   by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a
   similar history as originally intended by [RFC6298].

   It is roughly known how many samples a congestion window worth of data will
   yield, not accounting for ACK compression, and ACK losses.  Such events will
   result in more history of the path being reflected in the final value for
   RTO, and are uncritical.  This modification will ensure that a similar amount
   of time is taken into account for the RTO estimation, regardless of how many
   samples are taken per window:

      ExpectedSamples = ceiling(FlightSize / (SMSS * 2))

      alpha' = alpha / ExpectedSamples

      beta' = beta / ExpectedSamples

   Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs".

   Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and
   beta' instead:

      RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'|

      SRTT <- (1 - alpha') * SRTT + alpha' * R'

      (for each sample R')

PiperOrigin-RevId: 213644795
Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e
2018-09-19 09:59:12 -07:00
Bert Muthalaly 5685d6b5ad Update {LinkEndpoint,NetworkEndpoint}#WritePacket to take a VectorisedView
Makes it possible to avoid copying or allocating in cases where DeliverNetworkPacket (rx)
needs to turn around and call WritePacket (tx) with its VectorisedView.

Also removes the restriction on having VectorisedViews with multiple views in the write path.

PiperOrigin-RevId: 211728717
Change-Id: Ie03a65ecb4e28bd15ebdb9c69f05eced18fdfcff
2018-09-05 17:34:25 -07:00
Tamir Duberstein bc5e18c9d1 Implement TCP keepalives
PiperOrigin-RevId: 211670620
Change-Id: Ia8a3d8ae53a7fece1dee08ee9c74964bd7f71bb7
2018-09-05 11:48:23 -07:00
Tamir Duberstein 3794cb6bff Expose TCP RTT
PiperOrigin-RevId: 211504634
Change-Id: I9a7bcbbdd40e5036894930f709278725ef477293
2018-09-04 12:39:47 -07:00
Bhasker Hariharan 56fa562dda Cubic implementation for Netstack.
This CL implements CUBIC as described in https://tools.ietf.org/html/rfc8312.

PiperOrigin-RevId: 207353142
Change-Id: I329cbf3277f91127e99e488f07d906f6779c6603
2018-08-03 17:54:42 -07:00
Zhaozhong Ni 57d0fcbdbf Automated rollback of changelist 207037226
PiperOrigin-RevId: 207125440
Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
2018-08-02 10:42:48 -07:00
Michael Pratt 60add78980 Automated rollback of changelist 207007153
PiperOrigin-RevId: 207037226
Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
2018-08-01 19:57:32 -07:00
Zhaozhong Ni b9e1cf8404 stateify: convert all packages to use explicit mode.
PiperOrigin-RevId: 207007153
Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9
2018-08-01 15:43:24 -07:00
Bhasker Hariharan da48c04d0d Refactor new reno congestion control logic out of sender.
This CL also puts the congestion control logic behind an
interface so that we can easily swap it out for say CUBIC
in the future.

PiperOrigin-RevId: 205732848
Change-Id: I891cdfd17d4d126b658b5faa0c6bd6083187944b
2018-07-23 15:15:07 -07:00
Zhaozhong Ni b1683df90b netstack: tcp socket connected state S/R support.
PiperOrigin-RevId: 203958972
Change-Id: Ia6fe16547539296d48e2c6731edacdd96bd6e93c
2018-07-10 09:23:35 -07:00
Nicolas Lacasse bf0fa09537 Switch netstack licenses to Apache 2.0.
Fixes #27

PiperOrigin-RevId: 203825288
Change-Id: Ie9f3a2b2c1e296b026b024f75c07da1a7e118633
2018-07-09 14:04:40 -07:00
Brian Geffon 23f49097c7 Panic in netstack during cleanup where a FIN becomes a RST.
There is a subtle bug where during cleanup with unread data a FIN can
be converted to a RST, at that point the entire connection should be
aborted as we're not expecting any ACKs to the RST.

PiperOrigin-RevId: 202691271
Change-Id: Idae70800208ca26e07a379bc6b2b8090805d0a22
2018-06-29 12:40:26 -07:00
Brian Geffon 51c1e510ab Automated rollback of changelist 201596247
PiperOrigin-RevId: 202151720
Change-Id: I0491172c436bbb32b977f557953ba0bc41cfe299
2018-06-26 10:33:24 -07:00
Zhaozhong Ni 0e434b66a6 netstack: tcp socket connected state S/R support.
PiperOrigin-RevId: 201596247
Change-Id: Id22f47b2cdcbe14aa0d930f7807ba75f91a56724
2018-06-21 15:19:45 -07:00
Brian Geffon 257ab8de93 When sending a RST the acceptable ACK window shouldn't change.
Today when we transmit a RST it's happening during the time-wait
flow. Because a FIN is allowed to advance the acceptable ACK window
we're incorrectly doing that for a RST.

PiperOrigin-RevId: 197637565
Change-Id: I080190b06bd0225326cd68c1fbf37bd3fdbd414e
2018-05-22 15:52:41 -07:00
Cyrille Hemidy 04b79137ba Fix misspellings.
PiperOrigin-RevId: 195307689
Change-Id: I499f19af49875a43214797d63376f20ae788d2f4
2018-05-03 14:06:13 -07:00
Googler d02b74a5dc Check in gVisor.
PiperOrigin-RevId: 194583126
Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463
2018-04-28 01:44:26 -04:00