Commit Graph

47 Commits

Author SHA1 Message Date
Kevin Krakauer 52a51a8e20 Add a raw socket transport endpoint and use it for raw ICMP sockets.
Having raw socket code together will make it easier to add support for other raw
network protocols. Currently, only ICMP uses the raw endpoint. However, adding
support for other protocols such as UDP shouldn't be much more difficult than
adding a few switch cases.

PiperOrigin-RevId: 241564875
Change-Id: I77e03adafe4ce0fd29ba2d5dfdc547d2ae8f25bf
2019-04-02 11:13:49 -07:00
Andrei Vagin f4105ac21a netstack/fdbased: add generic segmentation offload (GSO) support
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-28 11:03:41 -07:00
Andrei Vagin 87cce0ec08 netstack: reduce MSS from SYN to account tcp options
See: https://tools.ietf.org/html/rfc6691#section-2
PiperOrigin-RevId: 239305632
Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-19 17:33:20 -07:00
Tamir Duberstein 5496be7c5d Remove duplicate TCP flag definitions
PiperOrigin-RevId: 238467634
Change-Id: If4cd8efff7386fbee1195f051d15549b495910a9
2019-03-14 10:19:21 -07:00
Ian Gudger 56a6128295 Implement IP_MULTICAST_LOOP.
IP_MULTICAST_LOOP controls whether or not multicast packets sent on the default
route are looped back. In order to implement this switch, support for sending
and looping back multicast packets on the default route had to be implemented.

For now we only support IPv4 multicast.

PiperOrigin-RevId: 237534603
Change-Id: I490ac7ff8e8ebef417c7eb049a919c29d156ac1c
2019-03-08 15:49:17 -08:00
Kevin Krakauer 23e66ee96d Remove unused commit() function argument to Bind.
PiperOrigin-RevId: 236926132
Change-Id: I5cf103f22766e6e65a581de780c7bb9ca0fa3181
2019-03-05 14:53:34 -08:00
Kevin Krakauer 121db29a93 Ping support via IPv4 raw sockets.
Broadly, this change:
* Enables sockets to be created via `socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)`.
* Passes the network-layer (IP) header up the stack to the transport endpoint,
  which can pass it up to the socket layer. This allows a raw socket to return
  the entire IP packet to users.
* Adds functions to stack.TransportProtocol, stack.Stack, stack.transportDemuxer
  that enable incoming packets to be delivered to raw endpoints. New raw sockets
  of other protocols (not ICMP) just need to register with the stack.
* Enables ping.endpoint to return IP headers when created via SOCK_RAW.

PiperOrigin-RevId: 235993280
Change-Id: I60ed994f5ff18b2cbd79f063a7fdf15d093d845a
2019-02-27 14:31:21 -08:00
Bhasker Hariharan 26be25e4ec Add a SACK scoreboard to TCP endpoints.
This change does not make use of SACK information but adds support to track
SACK information and store it in the endpoint.

The actual SACK based recovery will be in a separate CL.

Part of commits to add RFC 6675 support to Netstack.

PiperOrigin-RevId: 235612264
Change-Id: I261f94844d7bad5abda803152ce6cc6125a467ff
2019-02-25 15:20:04 -08:00
Amanda Tait ea070b9d5f Implement Broadcast support
This change adds support for the SO_BROADCAST socket option in gVisor Netstack.
This support includes getsockopt()/setsockopt() functionality for both UDP and
TCP endpoints (the latter being a NOOP), dispatching broadcast messages up and
down the stack, and route finding/creation for broadcast packets. Finally, a
suite of tests have been implemented, exercising this functionality through the
Linux syscall API.

PiperOrigin-RevId: 234850781
Change-Id: If3e666666917d39f55083741c78314a06defb26c
2019-02-20 12:54:13 -08:00
Zhaozhong Ni 7182b9cf52 netstack: release port inline for listening sockets only.
PiperOrigin-RevId: 229243918
Change-Id: Ie14ef34e66ae851ed080f57b7d26a369a66f7664
2019-01-14 13:33:47 -08:00
Andrei Vagin 652d068119 Implement SO_REUSEPORT for TCP and UDP sockets
This option allows multiple sockets to be bound to the same port.

Incoming packets are distributed to sockets using a hash based on source and
destination addresses. This means that all packets from one sender will be
received by the same server socket.

PiperOrigin-RevId: 227153413
Change-Id: I59b6edda9c2209d5b8968671e9129adb675920cf
2018-12-28 11:27:14 -08:00
Ian Gudger 0df0df35fc Stub out SO_OOBINLINE.
We don't explicitly support out-of-band data and treat it like normal in-band
data. This is equilivent to SO_OOBINLINE being enabled, so always report that
it is enabled.

PiperOrigin-RevId: 226572742
Change-Id: I4c30ccb83265e76c30dea631cbf86822e6ee1c1b
2018-12-21 19:46:55 -08:00
Ian Gudger b515556519 Implement SO_KEEPALIVE, TCP_KEEPIDLE, and TCP_KEEPINTVL.
Within gVisor, plumb new socket options to netstack.

Within netstack, fix GetSockOpt and SetSockOpt return value logic.

PiperOrigin-RevId: 226532229
Change-Id: If40734e119eed633335f40b4c26facbebc791c74
2018-12-21 13:13:45 -08:00
Ian Gudger 25b8424d75 Stub out TCP_QUICKACK
PiperOrigin-RevId: 224696233
Change-Id: I45c425d9e32adee5dcce29ca7439a06567b26014
2018-12-09 00:50:33 -08:00
Ian Gudger 000fa84a3b Fix tcpip.Endpoint.Write contract regarding short writes
* Clarify tcpip.Endpoint.Write contract regarding short writes.
* Enforce tcpip.Endpoint.Write contract regarding short writes.
* Update relevant users of tcpip.Endpoint.Write.

PiperOrigin-RevId: 224377586
Change-Id: I24299ecce902eb11317ee13dae3b8d8a7c5b097d
2018-12-06 11:41:33 -08:00
Ian Gudger 9d8e49d950 Process delayed packets when delay is disabled
Moving the wakeup logic into the disable blocks is an optimization.

PiperOrigin-RevId: 221677028
Change-Id: Ib5a5a6d52cc77b4bbc5dedcad9ee1dbb3da98deb
2018-11-15 13:17:06 -08:00
Ian Gudger 7f60294a73 Implement TCP_NODELAY and TCP_CORK
Previously, TCP_NODELAY was always enabled and we would lie about it being
configurable. TCP_NODELAY is now disabled by default (to match Linux) in the
socket layer so that non-gVisor users don't automatically start using this
questionable optimization.

PiperOrigin-RevId: 221368472
Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
2018-11-13 18:02:43 -08:00
Ian Gudger 37cbce1f91 Merge segments in sender's writeList
PiperOrigin-RevId: 220185891
Change-Id: Iaea73fd7b2fa8c399b989cdcaabf4885f370df4b
2018-11-05 15:39:30 -08:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Sepehr Raissian c17ea8c6e2 Block for link address resolution
Previously, if address resolution for UDP or Ping sockets required sending
packets using Write in Transport layer, Resolve would return ErrWouldBlock
and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution
would run in background. Further calls to Write using same address would also
return ErrNoLinkAddress until resolution has been completed successfully.

Since Write is not allowed to block and System Calls need to be
interruptible in System Call layer, the caller to Write is responsible for
blocking upon return of ErrWouldBlock.

Now, when startAddressResolution is called a notification channel for
the completion of the address resolution is returned.
The channel will traverse up to the calling function of Write as well as
ErrNoLinkAddress. Once address resolution is complete (success or not) the
channel is closed. The caller would call Write again to send packets and
check if address resolution was compeleted successfully or not.

Fixes google/gvisor#5

Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93
PiperOrigin-RevId: 214962200
2018-09-28 11:00:16 -07:00
Ian Gudger 117ac8bc5b Fix data race on tcp.endpoint.hardError in tcp.(*endpoint).Read
tcp.endpoint.hardError is protected by tcp.endpoint.mu.

PiperOrigin-RevId: 213730698
Change-Id: I4e4f322ac272b145b500b1a652fbee0c7b985be2
2018-09-19 17:49:18 -07:00
Tamir Duberstein d6409b6564 Prevent TCP connect from picking bound ports
PiperOrigin-RevId: 213387851
Change-Id: Icc6850761bc11afd0525f34863acd77584155140
2018-09-17 20:44:04 -07:00
Tamir Duberstein d689f8422f Always pass buffer.VectorisedView by value
PiperOrigin-RevId: 212757571
Change-Id: I04200df9e45c21eb64951cd2802532fa84afcb1a
2018-09-12 21:57:55 -07:00
Tamir Duberstein bc5e18c9d1 Implement TCP keepalives
PiperOrigin-RevId: 211670620
Change-Id: Ia8a3d8ae53a7fece1dee08ee9c74964bd7f71bb7
2018-09-05 11:48:23 -07:00
Tamir Duberstein 3794cb6bff Expose TCP RTT
PiperOrigin-RevId: 211504634
Change-Id: I9a7bcbbdd40e5036894930f709278725ef477293
2018-09-04 12:39:47 -07:00
Tamir Duberstein 0923bcf06b Add various statistics
PiperOrigin-RevId: 210442599
Change-Id: I9498351f461dc69c77b7f815d526c5693bec8e4a
2018-08-27 15:29:55 -07:00
Ian Gudger abe7764928 Encapsulate netstack metrics
PiperOrigin-RevId: 209943212
Change-Id: I96dcbc7c2ab2426e510b94a564436505256c5c79
2018-08-23 08:55:23 -07:00
Bhasker Hariharan 7d3684aadf Adds support to dump out cubic internal state.
PiperOrigin-RevId: 207754087
Change-Id: I83abce64348ea93f8692da81a881b364dae2158b
2018-08-07 11:49:57 -07:00
Brian Geffon d839dc13c6 Netstack doesn't handle sending after SHUT_WR correctly.
PiperOrigin-RevId: 207715032
Change-Id: I7b6690074c5be283145192895d706a92e921b22c
2018-08-07 07:57:20 -07:00
Bhasker Hariharan 56fa562dda Cubic implementation for Netstack.
This CL implements CUBIC as described in https://tools.ietf.org/html/rfc8312.

PiperOrigin-RevId: 207353142
Change-Id: I329cbf3277f91127e99e488f07d906f6779c6603
2018-08-03 17:54:42 -07:00
Zhaozhong Ni 57d0fcbdbf Automated rollback of changelist 207037226
PiperOrigin-RevId: 207125440
Change-Id: I6c572afb4d693ee72a0c458a988b0e96d191cd49
2018-08-02 10:42:48 -07:00
Michael Pratt 60add78980 Automated rollback of changelist 207007153
PiperOrigin-RevId: 207037226
Change-Id: I8b5f1a056d4f3eab17846f2e0193bb737ecb5428
2018-08-01 19:57:32 -07:00
Zhaozhong Ni b9e1cf8404 stateify: convert all packages to use explicit mode.
PiperOrigin-RevId: 207007153
Change-Id: Ifedf1cc3758dc18be16647a4ece9c840c1c636c9
2018-08-01 15:43:24 -07:00
Zhaozhong Ni 45c50eb124 netstack: save tcp endpoint accepted channel directly.
PiperOrigin-RevId: 204356873
Change-Id: I5e2f885f58678e693aae1a69e8bf8084a685af28
2018-07-12 13:49:21 -07:00
Zhaozhong Ni b1683df90b netstack: tcp socket connected state S/R support.
PiperOrigin-RevId: 203958972
Change-Id: Ia6fe16547539296d48e2c6731edacdd96bd6e93c
2018-07-10 09:23:35 -07:00
Brian Geffon da9b5153f2 Fix two race conditions in tcp stack.
PiperOrigin-RevId: 203880278
Change-Id: I66b790a616de59142859cc12db4781b57ea626d3
2018-07-09 20:48:27 -07:00
Nicolas Lacasse bf0fa09537 Switch netstack licenses to Apache 2.0.
Fixes #27

PiperOrigin-RevId: 203825288
Change-Id: Ie9f3a2b2c1e296b026b024f75c07da1a7e118633
2018-07-09 14:04:40 -07:00
Brian Geffon 23f49097c7 Panic in netstack during cleanup where a FIN becomes a RST.
There is a subtle bug where during cleanup with unread data a FIN can
be converted to a RST, at that point the entire connection should be
aborted as we're not expecting any ACKs to the RST.

PiperOrigin-RevId: 202691271
Change-Id: Idae70800208ca26e07a379bc6b2b8090805d0a22
2018-06-29 12:40:26 -07:00
Brian Geffon 51c1e510ab Automated rollback of changelist 201596247
PiperOrigin-RevId: 202151720
Change-Id: I0491172c436bbb32b977f557953ba0bc41cfe299
2018-06-26 10:33:24 -07:00
Zhaozhong Ni 0e434b66a6 netstack: tcp socket connected state S/R support.
PiperOrigin-RevId: 201596247
Change-Id: Id22f47b2cdcbe14aa0d930f7807ba75f91a56724
2018-06-21 15:19:45 -07:00
Michael Pratt bd2d1aaa16 Replace crypto/rand with internal rand package
PiperOrigin-RevId: 200784607
Change-Id: I39aa6ee632936dcbb00fc298adccffa606e9f4c0
2018-06-15 15:36:00 -07:00
Zhaozhong Ni 343020ca27 netstack: make TCP endpoint closed and error state cleanup work synchronous.
So that when saving TCP endpoint in these states, there is no pending or
background activities.

Also lift tcp network save rejection error to tcpip package.

PiperOrigin-RevId: 199370748
Change-Id: Ief7b45c2a7338d12414cd7c23db95de6a9c22700
2018-06-05 15:44:38 -07:00
Fabricio Voznika c5dc873e44 Automated rollback of changelist 196886839
PiperOrigin-RevId: 198457660
Change-Id: I6ea5cf0b4cfe2b5ba455325a7e5299880e5a088a
2018-05-29 14:24:07 -07:00
Zhaozhong Ni 5b4c20e1b8 netstack: make TCP endpoint closed and error state cleanup work synchronous.
So that when saving TCP endpoint in these states, there is no pending or
background activities.

Also lift tcp network save rejection error to tcpip package.

PiperOrigin-RevId: 196886839
Change-Id: I0fe73750f2743ec7e62d139eb2cec758c5dd6698
2018-05-16 14:15:24 -07:00
Zhaozhong Ni 987f7841a6 netstack: TCP connecting state endpoint save / restore support.
PiperOrigin-RevId: 196325647
Change-Id: I850eb4a29b9c679da4db10eb164bbdf967690663
2018-05-11 16:28:39 -07:00
Ian Gudger 3d3deef573 Implement SO_TIMESTAMP
PiperOrigin-RevId: 195047018
Change-Id: I6d99528a00a2125f414e1e51e067205289ec9d3d
2018-05-01 22:11:49 -07:00
Googler d02b74a5dc Check in gVisor.
PiperOrigin-RevId: 194583126
Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463
2018-04-28 01:44:26 -04:00