Commit Graph

118 Commits

Author SHA1 Message Date
Kevin Krakauer 52a51a8e20 Add a raw socket transport endpoint and use it for raw ICMP sockets.
Having raw socket code together will make it easier to add support for other raw
network protocols. Currently, only ICMP uses the raw endpoint. However, adding
support for other protocols such as UDP shouldn't be much more difficult than
adding a few switch cases.

PiperOrigin-RevId: 241564875
Change-Id: I77e03adafe4ce0fd29ba2d5dfdc547d2ae8f25bf
2019-04-02 11:13:49 -07:00
Bhasker Hariharan 45c54b1f4e Fix incorrect checksums in TCP and UDP tests.
PiperOrigin-RevId: 241025361
Change-Id: I292e7aea9a4b294b11e4f736e107010d9524586b
2019-03-29 12:05:43 -07:00
Bhasker Hariharan cc0e96a4bd Fix Panic in SACKScoreboard.Delete.
The panic was caused by modifying the tree while iterating which invalidated the
iterator.

Also fixes another bug in SACKScoreboard.Insert() which was causing blocks to be
merged incorrectly.

PiperOrigin-RevId: 240895053
Change-Id: Ia72b8244297962df5c04283346da5226434740af
2019-03-28 18:18:39 -07:00
Andrei Vagin f4105ac21a netstack/fdbased: add generic segmentation offload (GSO) support
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-28 11:03:41 -07:00
Andrei Vagin 654e878abb netstack: Don't exclude length when a pseudo-header checksum is calculated
This is a preparation for GSO changes (cl/234508902).

RELNOTES[gofers]: Refactor checksum code to include length, which
it already did, but in a convoluted way. Should be a no-op.

PiperOrigin-RevId: 240460794
Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2
2019-03-26 17:15:13 -07:00
Andrei Vagin 9f4e1cb797 netstack: adjust the sequence number after trimming the packet
PiperOrigin-RevId: 239417224
Change-Id: I14a9adc31a6330a79a6156c105969cd5f1f63d20
2019-03-20 09:58:10 -07:00
Andrei Vagin 87cce0ec08 netstack: reduce MSS from SYN to account tcp options
See: https://tools.ietf.org/html/rfc6691#section-2
PiperOrigin-RevId: 239305632
Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-19 17:33:20 -07:00
Tamir Duberstein 5496be7c5d Remove duplicate TCP flag definitions
PiperOrigin-RevId: 238467634
Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9
2019-03-14 10:19:21 -07:00
Kevin Krakauer f97c4f1b7a Remove unused function.
PiperOrigin-RevId: 238336475
Change-Id: I8131e04699028246ebc233953ebb3feca5673940
2019-03-13 16:40:10 -07:00
Ian Gudger 86036f979b Validate multicast addresses in multicast group operations.
PiperOrigin-RevId: 237559843
Change-Id: I93a9d83a08cd3d49d5fc7fcad5b0710d0aa04aaa
2019-03-08 19:05:26 -08:00
Ian Gudger 56a6128295 Implement IP_MULTICAST_LOOP.
IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default
route are looped back. In order to implement this switch, support for sending
and looping back multicast packets on the default route had to be implemented.

For now we only support IPv4 multicast.

PiperOrigin-RevId: 237534603
Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c
2019-03-08 15:49:17 -08:00
Bhasker Hariharan 1718fdd1a8 Add new retransmissions and recovery related metrics.
PiperOrigin-RevId: 236945145
Change-Id: I051760d95154ea5574c8bb6aea526f488af5e07b
2019-03-05 16:41:44 -08:00
Kevin Krakauer 23e66ee96d Remove unused commit() function argument to Bind.
PiperOrigin-RevId: 236926132
Change-Id: I5cf103f22766e6e65a581de780c7bb9ca0fa3181
2019-03-05 14:53:34 -08:00
Kevin Krakauer 121db29a93 Ping support via IPv4 raw sockets.
Broadly, this change:
* Enables sockets to be created via `socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)`.
* Passes the network-layer (IP) header up the stack to the transport endpoint,
  which can pass it up to the socket layer. This allows a raw socket to return
  the entire IP packet to users.
* Adds functions to stack.TransportProtocol, stack.Stack, stack.transportDemuxer
  that enable incoming packets to be delivered to raw endpoints. New raw sockets
  of other protocols (not ICMP) just need to register with the stack.
* Enables ping.endpoint to return IP headers when created via SOCK_RAW.

PiperOrigin-RevId: 235993280
Change-Id: I60ed994f5ff18b2cbd79f063a7fdf15d093d845a
2019-02-27 14:31:21 -08:00
Bhasker Hariharan 26be25e4ec Add a SACK scoreboard to TCP endpoints.
This change does not make use of SACK information but adds support to track
SACK information and store it in the endpoint.

The actual SACK based recovery will be in a separate CL.

Part of commits to add RFC 6675 support to Netstack.

PiperOrigin-RevId: 235612264
Change-Id: I261f94844d7bad5abda803152ce6cc6125a467ff
2019-02-25 15:20:04 -08:00
Kevin Krakauer b75aa51504 Rename ping endpoints to icmp endpoints.
PiperOrigin-RevId: 235248572
Change-Id: I5b0538b6feb365a98712c2a2d56d856fe80a8a09
2019-02-22 13:34:47 -08:00
Amanda Tait ea070b9d5f Implement Broadcast support
This change adds support for the SO_BROADCAST socket option in gVisor Netstack.
This support includes getsockopt()/setsockopt() functionality for both UDP and
TCP endpoints (the latter being a NOOP), dispatching broadcast messages up and
down the stack, and route finding/creation for broadcast packets. Finally, a
suite of tests have been implemented, exercising this functionality through the
Linux syscall API.

PiperOrigin-RevId: 234850781
Change-Id: If3e666666917d39f55083741c78314a06defb26c
2019-02-20 12:54:13 -08:00
Ian Gudger c611dbc5a7 Implement IP_MULTICAST_IF.
This allows setting a default send interface for IPv4 multicast. IPv6 support
will come later.

PiperOrigin-RevId: 234251379
Change-Id: I65922341cd8b8880f690fae3eeb7ddfa47c8c173
2019-02-15 18:40:15 -08:00
Kevin Krakauer a9cb3dcd9d Move SO_TIMESTAMP from different transport endpoints to epsocket.
SO_TIMESTAMP is reimplemented in ping and UDP sockets (and needs to be added for
TCP), but can just be implemented in epsocket for simplicity. This will also
make SIOCGSTAMP easier to implement.

PiperOrigin-RevId: 234179300
Change-Id: Ib5ea0b1261dc218c1a8b15a65775de0050fe3230
2019-02-15 11:18:44 -08:00
Googler d60ce17a21 Internal change.
PiperOrigin-RevId: 234011346
Change-Id: Ic69375ddb3794dd0d3d6e62ee4dc60fdf4baf2c7
2019-02-14 12:54:27 -08:00
Bhasker Hariharan efe5e737d7 Do not drop packets w/ missing TCP timestamps.
RFC7323 recommends that if the timestamp option was negotiated
then all packets should carry a TCP Timestamp and any packets that
do not should be dropped.

Netstack implemented this behaviour. Linux OTOH does not and will
accept such packets. This change makes Netstack behaviour compatible
with Linux.

Also now that we allow such packets, we do need to update RTO calculations
based on these packets even if timestamp option is enabled.

PiperOrigin-RevId: 233432268
Change-Id: I9f4742ae6b63930ac3b5e37d8c238761e6a4b29f
2019-02-11 10:23:43 -08:00
Ian Gudger 80f901b16b Plumb IP_ADD_MEMBERSHIP and IP_DROP_MEMBERSHIP to netstack.
Also includes a few fixes for IPv4 multicast support. IPv6 support is coming in
a followup CL.

PiperOrigin-RevId: 233008638
Change-Id: If7dae6222fef43fda48033f0292af77832d95e82
2019-02-07 23:15:23 -08:00
Michael Pratt 2a0c69b19f Remove license comments
Nothing reads them and they can simply get stale.

Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD

PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-31 11:12:53 -08:00
Bhasker Hariharan f03c7e48e7 Fix IsLost check to match the description in RFC6675.
quoting what "rscheff@gmx.at" pointed out over email.
"IsLost in RFC3517 is defined as  >=  (DupThresh * SMSS) while
RFC6675 improves upon this, and defines IsLost as  >
((DupThresh - 1) * SMSS + 1).

The latter addresses situations where partial segments (size < MSS)
are sent (eg. last segment of a http protocol message sent with PSH
being less than MSS is common)."

PiperOrigin-RevId: 231512331
Change-Id: I1addd4a92e3e7baeb0bdda46463ebfae435da958
2019-01-29 18:13:48 -08:00
Kevin Krakauer 9a01287d23 test: Tag tcp_test as flaky.
PiperOrigin-RevId: 229427852
Change-Id: I9de8ed63f4a7672dacd3b282c863c599d00acd52
2019-01-15 13:21:00 -08:00
Zhaozhong Ni 7182b9cf52 netstack: release port inline for listening sockets only.
PiperOrigin-RevId: 229243918
Change-Id: Ie14ef34e66ae851ed080f57b7d26a369a66f7664
2019-01-14 13:33:47 -08:00
Andrei Vagin 652d068119 Implement SO_REUSEPORT for TCP and UDP sockets
This option allows multiple sockets to be bound to the same port.

Incoming packets are distributed to sockets using a hash based on source and
destination addresses. This means that all packets from one sender will be
received by the same server socket.

PiperOrigin-RevId: 227153413
Change-Id: I59b6edda9c2209d5b8968671e9129adb675920cf
2018-12-28 11:27:14 -08:00
Ian Gudger 0df0df35fc Stub out SO_OOBINLINE.
We don't explicitly support out-of-band data and treat it like normal in-band
data. This is equilivent to SO_OOBINLINE being enabled, so always report that
it is enabled.

PiperOrigin-RevId: 226572742
Change-Id: I4c30ccb83265e76c30dea631cbf86822e6ee1c1b
2018-12-21 19:46:55 -08:00
Michael Pratt 71f0d5108b Internal Change
PiperOrigin-RevId: 226542979
Change-Id: Ife11ebd0a85b8a63078e6daa71b4a99a82080ac9
2018-12-21 14:29:35 -08:00
Ian Gudger b515556519 Implement SO_KEEPALIVE, TCP_KEEPIDLE, and TCP_KEEPINTVL.
Within gVisor, plumb new socket options to netstack.

Within netstack, fix GetSockOpt and SetSockOpt return value logic.

PiperOrigin-RevId: 226532229
Change-Id: If40734e119eed633335f40b4c26facbebc791c74
2018-12-21 13:13:45 -08:00
Ian Gudger 6253d32cc9 transport/tcp: remove unused error return values
PiperOrigin-RevId: 225421480
Change-Id: I1e9259b0b7e8490164e830b73338a615129c7f0e
2018-12-13 13:02:49 -08:00
Ian Gudger 25b8424d75 Stub out TCP_QUICKACK
PiperOrigin-RevId: 224696233
Change-Id: I45c425d9e32adee5dcce29ca7439a06567b26014
2018-12-09 00:50:33 -08:00
Ian Gudger 000fa84a3b Fix tcpip.Endpoint.Write contract regarding short writes
* Clarify tcpip.Endpoint.Write contract regarding short writes.
* Enforce tcpip.Endpoint.Write contract regarding short writes.
* Update relevant users of tcpip.Endpoint.Write.

PiperOrigin-RevId: 224377586
Change-Id: I24299ecce902eb11317ee13dae3b8d8a7c5b097d
2018-12-06 11:41:33 -08:00
Zhaozhong Ni 7f35daddd2 sentry: support save / restore of TCP bind socket after shutdown.
PiperOrigin-RevId: 224227677
Change-Id: I08b0e0c0574170556269900653e5bcf9e9e5c9c9
2018-12-05 15:02:40 -08:00
Zhaozhong Ni fda4557e3d sentry: skip waiting for undrain for netstack TCP endpoints in error state.
PiperOrigin-RevId: 224214981
Change-Id: I4c1dd5b1c856f7a4f9866a5dda44a5297e92486a
2018-12-05 13:51:16 -08:00
Ian Gudger 8cbd6153a6 Fix available calculation when merging TCP segments
PiperOrigin-RevId: 224033418
Change-Id: I780be973e8be68ac93e8c9e7a100002e912f40d2
2018-12-04 13:15:25 -08:00
Zhaozhong Ni ad8f293e1a sentry: save copy of tcp segment's delivered views to avoid in-struct pointers.
PiperOrigin-RevId: 224033238
Change-Id: Ie5b1854b29340843b02c123766d290a8738d7631
2018-12-04 13:14:24 -08:00
Ian Gudger 99fb113869 Test that full segments will be sent when delay/cork is enabled
PiperOrigin-RevId: 223425575
Change-Id: Idd777e04c69e6ffcbfb0bdbea828a8b8b42d7672
2018-11-29 15:46:38 -08:00
Ian Gudger 9d8e49d950 Process delayed packets when delay is disabled
Moving the wakeup logic into the disable blocks is an optimization.

PiperOrigin-RevId: 221677028
Change-Id: Ib5a5a6d52cc77b4bbc5dedcad9ee1dbb3da98deb
2018-11-15 13:17:06 -08:00
Ian Gudger b5e91eaa52 Clean up tcp.sendData
PiperOrigin-RevId: 221484739
Change-Id: I44c71f79f99d0d00a2e70a7f06d7024a62a5de0a
2018-11-14 11:58:41 -08:00
Ian Gudger 7f60294a73 Implement TCP_NODELAY and TCP_CORK
Previously, TCP_NODELAY was always enabled and we would lie about it being
configurable. TCP_NODELAY is now disabled by default (to match Linux) in the
socket layer so that non-gVisor users don't automatically start using this
questionable optimization.

PiperOrigin-RevId: 221368472
Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-11-13 18:02:43 -08:00
Ian Gudger c22da3e705 Remove obsolete TODO
PiperOrigin-RevId: 221117846
Change-Id: I2a43fd8135b1d1194ff81e98644ce6b6182ece50
2018-11-12 10:45:19 -08:00
Bhasker Hariharan 33089561b1 Add an implementation of a SACK scoreboard as per RFC6675.
PiperOrigin-RevId: 220866996
Change-Id: I89d48215df57c00d6a6ec512fc18712a2ea9080b
2018-11-09 14:38:46 -08:00
Ian Gudger 37cbce1f91 Merge segments in sender's writeList
PiperOrigin-RevId: 220185891
Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
2018-11-05 15:39:30 -08:00
Ian Gudger 59b7766af7 Fix a race where keepalives could be sent while there is pending data
PiperOrigin-RevId: 219571556
Change-Id: I5a1042c1cb05eb2711eb01627fd298bad6c543a6
2018-10-31 18:42:44 -07:00
Fabricio Voznika c99006a240 Mark netstack/tcpip/transport/tcp:tcp_test flaky
PiperOrigin-RevId: 218537640
Change-Id: I1c5f55a46390174e1f5caeff74b1a364fa3268d9
2018-10-24 10:46:25 -07:00
Adin Scannell 1369e17504 Remove blanket TODO, as it is self-evident.
PiperOrigin-RevId: 218390517
Change-Id: Ic891c1626e62a6c4ed57f8180740872bcd1be177
2018-10-23 12:52:27 -07:00
Adin Scannell 75cd70ecc9 Track paths and provide a rename hook.
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.

PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-23 00:20:15 -07:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Ian Gudger 6cba410df0 Move Unix transport out of netstack
PiperOrigin-RevId: 217557656
Change-Id: I63d27635b1a6c12877279995d2d9847b6a19da9b
2018-10-17 11:37:51 -07:00