Commit Graph

246 Commits

Author SHA1 Message Date
Rahat Mahmood 13a98df49e netstack: Don't start endpoint goroutines too soon on restore.
Endpoint protocol goroutines were previously started as part of
loading the endpoint. This is potentially too soon, as resources used
by these goroutine may not have been loaded. Protocol goroutines may
perform meaningful work as soon as they're started (ex: incoming
connect) which can cause them to indirectly access resources that
haven't been loaded yet.

This CL defers resuming all protocol goroutines until the end of
restore.

PiperOrigin-RevId: 262409429
2019-08-08 12:33:11 -07:00
Tamir Duberstein 67a3f4039d Set target address in ARP Reply
PiperOrigin-RevId: 262163794
2019-08-07 10:27:43 -07:00
Bhasker Hariharan dfbc0b0a4c Fix for a panic due to writing to a closed accept channel.
This can happen because endpoint.Close() closes the accept channel first and
then drains/resets any accepted but not delivered connections. But there can be
connections that are connected but not delivered to the channel as the channel
was full. But closing the channel can cause these writes to fail with a write to
a closed channel.

The correct solution is to abort any connections in SYN-RCVD state and
drain/abort all completed connections before closing the accept channel.

PiperOrigin-RevId: 261951132
2019-08-06 11:01:27 -07:00
Kevin Krakauer 810cc07aab Plumbing for iptables sockopts.
PiperOrigin-RevId: 261413396
2019-08-02 16:26:48 -07:00
Rahat Mahmood 2906dffcdb Automated rollback of changelist 261191548
PiperOrigin-RevId: 261373749
2019-08-02 12:52:40 -07:00
Rahat Mahmood 79511e8a50 Implement getsockopt(TCP_INFO).
Export some readily-available fields for TCP_INFO and stub out the rest.

PiperOrigin-RevId: 261191548
2019-08-01 13:58:48 -07:00
Austin Kiekintveld 12c4eb294a Fix ICMPv4 EchoReply packet checksum
The checksum was not being reset before being re-calculated and sent out.
This caused the sent checksum to always be `0x0800`.

Fixes #605.

PiperOrigin-RevId: 260965059
2019-07-31 11:26:41 -07:00
Tamir Duberstein c6e6d92cb1 Test connecting UDP sockets to the ANY address
This doesn't currently pass on gVisor.

While I'm here, fix a bug where connecting to the v6-mapped v4 address doesn't
work in gVisor.

PiperOrigin-RevId: 260923961
2019-07-31 07:41:20 -07:00
Tamir Duberstein 7369c63e42 Pass ProtocolAddress instead of its fields
PiperOrigin-RevId: 260803517
2019-07-30 15:06:39 -07:00
Haibo Xu 1decf76471 Change syscall.POLL to syscall.PPOLL.
syscall.POLL is not supported on arm64, using syscall.PPOLL
to support both the x86 and arm64. refs #63

Signed-off-by: Haibo Xu <haibo.xu@arm.com>
Change-Id: I2c81a063d3ec4e7e6b38fe62f17a0924977f505e
COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/543 from xiaobo55x:master ba598263fd3748d1addd48e4194080aa12085164
PiperOrigin-RevId: 260752049
2019-07-30 11:01:29 -07:00
Chris Kuiper 40e682759f Add support for a subnet prefix length on interface network addresses
This allows the user code to add a network address with a subnet prefix length.
The prefix length value is stored in the network endpoint and provided back to
the user in the ProtocolAddress type.

PiperOrigin-RevId: 259807693
2019-07-24 13:42:14 -07:00
Tamir Duberstein 12c256568b Deduplicate EndpointState.connected some
This fixes a bug introduced in cl/251934850 that caused
connect-accept-close-connect races to result in the second connect call
failiing when it should have succeeded.

PiperOrigin-RevId: 259584525
2019-07-23 12:10:18 -07:00
Chris Kuiper 0e040ba6e8 Handle interfaceAddr and NIC options separately for IP_MULTICAST_IF
This tweaks the handling code for IP_MULTICAST_IF to ignore the InterfaceAddr
if a NICID is given.

PiperOrigin-RevId: 258982541
2019-07-19 09:29:04 -07:00
Andrei Vagin eefa817cfd net/tcp/setockopt: impelment setsockopt(fd, SOL_TCP, TCP_INQ)
PiperOrigin-RevId: 258859507
2019-07-18 15:41:04 -07:00
gVisor bot 74dc663bbb Internal change.
PiperOrigin-RevId: 258424489
2019-07-16 13:03:37 -07:00
Kevin Krakauer 9b4d3280e1 Add IPPROTO_RAW, which allows raw sockets to write IP headers.
iptables also relies on IPPROTO_RAW in a way. It opens such a socket to
manipulate the kernel's tables, but it doesn't actually use any of the
functionality. Blegh.

PiperOrigin-RevId: 257903078
2019-07-12 18:09:12 -07:00
Tamir Duberstein 17bab652af Check that IP headers contain correct version
PiperOrigin-RevId: 257888338
2019-07-12 16:19:18 -07:00
Bhasker Hariharan 6116473b2f Stub out support for TCP_MAXSEG.
Adds support to set/get the TCP_MAXSEG value but does not
really change the segment sizes emitted by netstack or
alter the MSS advertised by the endpoint. This is currently
being added only to unblock iperf3 on gVisor. Plumbing
this correctly requires a bit more work which will come
in separate CLs.

PiperOrigin-RevId: 257859112
2019-07-12 13:35:17 -07:00
Andrei Vagin 116cac053e netstack/udp: connect with the AF_UNSPEC address family means disconnect
PiperOrigin-RevId: 256433283
2019-07-03 14:19:02 -07:00
gVisor bot d60ae0ddee Merge pull request #279 from kevinGC:iptables-1-pkg
PiperOrigin-RevId: 256231055
2019-07-02 13:48:06 -07:00
Michael Pratt 5b41ba5d0e Fix various spelling issues in the documentation
Addresses obvious typos, in the documentation only.

COPYBARA_INTEGRATE_REVIEW=https://github.com/google/gvisor/pull/443 from Pixep:fix/documentation-spelling 4d0688164eafaf0b3010e5f4824b35d1e7176d65
PiperOrigin-RevId: 255477779
2019-06-27 14:25:50 -07:00
Bhasker Hariharan c1761378a9 Fix the logic for sending zero window updates.
Today we have the logic split in two places between endpoint Read() and the
worker goroutine which actually sends a zero window. This change makes it so
that when a zero window ACK is sent we set a flag in the endpoint which can be
read by the endpoint to decide if it should notify the worker to send a
nonZeroWindow update.

The worker now does not do the check again but instead sends an ACK and flips
the flag right away.

Similarly today when SO_RECVBUF is set the SetSockOpt call has logic
to decide if a zero window update is required. Rather than do that we move
the logic to the worker goroutine and it can check the zeroWindow flag
and send an update if required.

PiperOrigin-RevId: 254505447
2019-06-21 18:31:31 -07:00
Brad Burlage ae4ef32b8c Deflake TestSimpleReceive failures due to timeouts
This test will occasionally fail waiting to read a packet. From repeated runs,
I've seen it up to 1.5s for waitForPackets to complete.

PiperOrigin-RevId: 254484627
2019-06-21 15:56:12 -07:00
Bhasker Hariharan 3d71c627fa Add support for TCP receive buffer auto tuning.
The implementation is similar to linux where we track the number of bytes
consumed by the application to grow the receive buffer of a given TCP endpoint.

This ensures that the advertised window grows at a reasonable rate to accomodate
for the sender's rate and prevents large amounts of data being held in stack
buffers if the application is not actively reading or not reading fast enough.

The original paper that was used to implement the linux receive buffer auto-
tuning is available @ https://public.lanl.gov/radiant/pubs/drs/lacsi2001.pdf

NOTE: Linux does not implement DRS as defined in that paper, it's just a good
reference to understand the solution space.

Updates #230

PiperOrigin-RevId: 253168283
2019-06-13 22:28:01 -07:00
Adin Scannell add40fd6ad Update canonical repository.
This can be merged after:
https://github.com/google/gvisor-website/pull/77
  or
https://github.com/google/gvisor-website/pull/78

PiperOrigin-RevId: 253132620
2019-06-13 16:50:15 -07:00
Adin Scannell e352f46478 Minor BUILD file cleanup.
PiperOrigin-RevId: 252918338
2019-06-12 15:59:46 -07:00
Kevin Krakauer 0bbbcafd68 Merge branch 'master' into iptables-1-pkg
Change-Id: I7457a11de4725e1bf3811420c505d225b1cb6943
2019-06-12 15:21:22 -07:00
Bhasker Hariharan 70578806e8 Add support for TCP_CONGESTION socket option.
This CL also cleans up the error returned for setting congestion
control which was incorrectly returning EINVAL instead of ENOENT.

PiperOrigin-RevId: 252889093
2019-06-12 13:35:50 -07:00
Bhasker Hariharan 3933dd5c04 Fixes to listen backlog handling.
Changes netstack to confirm to current linux behaviour where if the backlog is
full then we drop the SYN and do not send a SYN-ACK. Similarly we allow upto
backlog connections to be in SYN-RCVD state as long as the backlog is not full.

We also now drop a SYN if syn cookies are in use and the backlog for the
listening endpoint is full.

Added new tests to confirm the behaviour.

Also reverted the change to increase the backlog in TcpPortReuseMultiThread
syscall test.

Fixes #236

PiperOrigin-RevId: 252500462
2019-06-10 15:40:44 -07:00
Kevin Krakauer 06a83df533 Address more comments.
Change-Id: I83ae1079f3dcba6b018f59ab7898decab5c211d2
2019-06-10 12:43:54 -07:00
Kevin Krakauer 8afbd974da Address Ian's comments.
Change-Id: I7445033b1970cbba3f2ed0682fe520dce02d8fad
2019-06-07 12:54:53 -07:00
Rahat Mahmood 2d2831e354 Track and export socket state.
This is necessary for implementing network diagnostic interfaces like
/proc/net/{tcp,udp,unix} and sock_diag(7).

For pass-through endpoints such as hostinet, we obtain the socket
state from the backend. For netstack, we add explicit tracking of TCP
states.

PiperOrigin-RevId: 251934850
2019-06-06 15:04:47 -07:00
Bhasker Hariharan 85be01b42d Add multi-fd support to fdbased endpoint.
This allows an fdbased endpoint to have multiple underlying fd's from which
packets can be read and dispatched/written to.

This should allow for higher throughput as well as better scalability of the
network stack as number of connections increases.

Updates #231

PiperOrigin-RevId: 251852825
2019-06-06 08:07:02 -07:00
Andrei Vagin 79f7cb6c1c netstack/sniffer: log GSO attributes
PiperOrigin-RevId: 251788534
2019-06-05 22:51:53 -07:00
Andrei Vagin a12848ffeb netstack/tcp: fix calculating a number of outstanding packets
In case of GSO, a segment can container more than one packet
and we need to use the pCount() helper to get a number of packets.

PiperOrigin-RevId: 251743020
2019-06-05 16:30:45 -07:00
Chris Kuiper d18bb4f38a Adjust route when looping multicast packets
Multicast packets are special in that their destination address does not
identify a specific interface. When sending out such a packet the multicast
address is the remote address, but for incoming packets it is the local
address. Hence, when looping a multicast packet, the route needs to be
tweaked to reflect this.

PiperOrigin-RevId: 251739298
2019-06-05 16:08:29 -07:00
Bhasker Hariharan e0fb921205 Fix data race in synRcvdState.
When checking the length of the acceptedChan we should hold the
endpoint mutex otherwise a syn received while the listening socket
is being closed can result in a data race where the cleanupLocked
routine sets acceptedChan to nil while a handshake goroutine
in progress could try and check it at the same time.

PiperOrigin-RevId: 251537697
2019-06-04 16:17:24 -07:00
Bhasker Hariharan bfe3220992 Delete debug log lines left by mistake.
Updates #236

PiperOrigin-RevId: 251337915
2019-06-03 17:00:18 -07:00
Bhasker Hariharan 3577a4f691 Disable certain tests that are flaky under race detector.
PiperOrigin-RevId: 250976665
2019-05-31 16:19:49 -07:00
Bhasker Hariharan 033f96cc93 Change segment queue limit to be of fixed size.
Netstack sets the unprocessed segment queue size to match the receive
buffer size. This is not required as this queue only needs to hold enough
for a short duration before the endpoint goroutine can process it.

Updates #230

PiperOrigin-RevId: 250976323
2019-05-31 16:17:33 -07:00
Kevin Krakauer d58eb9ce82 Add basic iptables structures to netstack.
Change-Id: Ib589906175a59dae315405a28f2d7f525ff8877f
2019-05-31 16:14:04 -07:00
Fabricio Voznika 38de91b028 Add build guard to files using go:linkname
Funcion signatures are not validated during compilation. Since
they are not exported, they can change at any time. The guard
ensures that they are verified at least on every version upgrade.

PiperOrigin-RevId: 250733742
2019-05-30 12:09:39 -07:00
Bhasker Hariharan ae26b2c425 Fixes to TCP listen behavior.
Netstack listen loop can get stuck if cookies are in-use and the app is slow to
accept incoming connections. Further we continue to complete handshake for a
connection even if the backlog is full. This creates a problem when a lots of
connections come in rapidly and we end up with lots of completed connections
just hanging around to be delivered.

These fixes change netstack behaviour to mirror what linux does as described
here in the following article

http://veithen.io/2014/01/01/how-tcp-backlog-works-in-linux.html

Now when cookies are not in-use Netstack will silently drop the ACK to a SYN-ACK
and not complete the handshake if the backlog is full.  This will result in the
connection staying in a half-complete state. Eventually the sender will
retransmit the ACK and if backlog has space we will transition to a connected
state and deliver the endpoint.

Similarly when cookies are in use we do not try and create an endpoint unless
there is space in the accept queue to accept the newly created endpoint. If
there is no space then we again silently drop the ACK as we can just recreate it
when the ACK is retransmitted by the peer.

We also now use the backlog to cap the size of the SYN-RCVD queue for a given
endpoint. So at any time there can be N connections in the backlog and N in a
SYN-RCVD state if the application is not accepting connections. Any new SYNs
will be dropped.

This CL also fixes another small bug where we mark a new endpoint which has not
completed handshake as connected. We should wait till handshake successfully
completes before marking it connected.

Updates #236

PiperOrigin-RevId: 250717817
2019-05-30 12:08:41 -07:00
Tamir Duberstein e4b395db49 Remove unused wakers
These wakers are uselessly allocated and passed around; nothing ever
listens for notifications on them. The code here appears to be
vestigial, so removing it and allowing a nil waker to be passed seems
appropriate.

PiperOrigin-RevId: 249879320
Change-Id: Icd209fb77cc0dd4e5c49d7a9f2adc32bf88b4b71
2019-05-24 12:29:14 -07:00
Kevin Krakauer c1cdf18e7b UDP and TCP raw socket support.
PiperOrigin-RevId: 249511348
Change-Id: I34539092cc85032d9473ff4dd308fc29dc9bfd6b
2019-05-22 13:45:15 -07:00
Bhasker Hariharan 2ac0aeeb42 Refactor fdbased endpoint dispatcher code.
This is in preparation to support an fdbased endpoint that can read/dispatch
packets from multiple underlying fds.

Updates #231

PiperOrigin-RevId: 249337074
Change-Id: Id7d375186cffcf55ae5e38986e7d605a96916d35
2019-05-21 15:24:25 -07:00
Nicolas Lacasse bfd9f75ba4 Set the FilesytemType in MountSource from the Filesystem.
And stop storing the Filesystem in the MountSource.

This allows us to decouple the MountSource filesystem type from the name of the
filesystem.

PiperOrigin-RevId: 247292982
Change-Id: I49cbcce3c17883b7aa918ba76203dfd6d1b03cc8
2019-05-08 14:35:06 -07:00
Googler cbf6ab9697 Check GSO for nil in WritePacket
Testing:
Unit tests added
PiperOrigin-RevId: 247096269
Change-Id: I849c010eadcb53caf45896a15ef38162d66a9568
2019-05-07 14:57:03 -07:00
Ian Gudger 20862f0db2 Add gonet.DialContextTCP.
Allows cancellation and timeouts.

PiperOrigin-RevId: 247090428
Change-Id: I91907f12e218677dcd0e0b6d72819deedbd9f20c
2019-05-07 14:27:36 -07:00
Kevin Krakauer ff8ed5e6a5 Fix raw socket behavior and tests.
Some behavior was broken due to the difficulty of running automated raw
socket tests.

Change-Id: I152ca53916bb24a0208f2dc1c4f5bc87f4724ff6
PiperOrigin-RevId: 246747067
2019-05-05 16:07:25 -07:00