Commit Graph

285 Commits

Author SHA1 Message Date
Tamir Duberstein 0348edc3cb Remove unnecessary code
Remove useless casts and duplicate return statements.

PiperOrigin-RevId: 306627916
2020-04-15 06:05:38 -07:00
Mithun Iyer 9c918340e4 Reset pending connections on listener close
Attempt to redeliver TCP segments that are enqueued into a closing
TCP endpoint. This was being done for Established endpoints but not
for those that are listening or performing connection handshake.

Fixes #2417

PiperOrigin-RevId: 306598155
2020-04-15 01:11:44 -07:00
Bhasker Hariharan 28212b3f17 Reduce flakiness in tcp_test.
Tests now use a MinRTO of 3s instead of default 200ms. This reduced flakiness in
a lot of the congestion control/recovery tests which were flaky due to
retransmit timer firing too early in case the test executors were overloaded.

This change also bumps some of the timeouts in tests which were too sensitive to
timer variations and reduces the number of slow start iterations which can
make the tests run for too long and also trigger retansmit timeouts etc if
the executor is overloaded.

PiperOrigin-RevId: 306562645
2020-04-14 19:33:35 -07:00
gVisor bot 78126611e6 Merge pull request #2253 from amscanne:nogo
PiperOrigin-RevId: 305807868
2020-04-09 19:16:46 -07:00
Andrei Vagin 7928aa345e Convert int and bool socket options to use GetSockOptInt and GetSockOptBool
PiperOrigin-RevId: 305699233
2020-04-09 09:31:48 -07:00
Adin Scannell 867eeb18d8 Remove lostcancel warnings.
Updates #2243
2020-04-08 10:14:34 -07:00
Adin Scannell 928a7c60b8 Fix all printf formatting errors.
Updates #2243
2020-04-08 10:14:34 -07:00
Bhasker Hariharan fc99a7ebf0 Refactor software GSO code.
Software GSO implementation currently has a complicated code path with
implicit assumptions that all packets to WritePackets carry same Data
and it does this to avoid allocations on the path etc. But this makes it
hard to reuse the WritePackets API.

This change breaks all such assumptions by introducing a new Vectorised
View API ReadToVV which can be used to cleanly split a VV into multiple
independent VVs. Further this change also makes packet buffers linkable
to form an intrusive list. This allows us to get rid of the array of
packet buffers that are passed in the WritePackets API call and replace
it with a list of packet buffers.

While this code does introduce some more allocations in the benchmarks
it doesn't cause any degradation.

Updates #231

PiperOrigin-RevId: 304731742
2020-04-03 18:35:55 -07:00
Nayana Bidari 92b9069b67 Support owner matching for iptables.
This feature will match UID and GID of the packet creator, for locally
generated packets. This match is only valid in the OUTPUT and POSTROUTING
chains. Forwarded packets do not have any socket associated with them.
Packets from kernel threads do have a socket, but usually no owner.
2020-03-26 12:21:24 -07:00
Bhasker Hariharan d04adebaab Fix data-race in endpoint.Readiness
PiperOrigin-RevId: 302924789
2020-03-25 10:55:22 -07:00
Bhasker Hariharan c8eeedcc1d Add support for setting TCP segment hash.
This allows the link layer endpoints to consistenly hash a TCP
segment to a single underlying queue in case a link layer endpoint
does support multiple underlying queues.

Updates #231

PiperOrigin-RevId: 302760664
2020-03-24 15:34:43 -07:00
Bhasker Hariharan 7e4073af12 Move tcpip.PacketBuffer and IPTables to stack package.
This is a precursor to be being able to build an intrusive list
of PacketBuffers for use in queuing disciplines being implemented.

Updates #2214

PiperOrigin-RevId: 302677662
2020-03-24 09:06:26 -07:00
Ting-Yu Wang 49aef9cee7 Remove unused variable `sndNxtList`.
PiperOrigin-RevId: 302110328
2020-03-20 15:25:15 -07:00
Jay Zhuang 8b461aa36b Remove redundant dep in BUILD
PiperOrigin-RevId: 301859066
2020-03-19 11:34:49 -07:00
Bhasker Hariharan fd27a917ef Address comments on workMu removal change.
Updates #231, #357

PiperOrigin-RevId: 301833669
2020-03-19 09:43:23 -07:00
Bhasker Hariharan e9e399c25d Remove workMu from tcpip.Endpoint.
workMu is removed and e.mu is now a mutex that supports TryLock.  The packet
processing path tries to lock the mutex and if its locked it will just queue the
packet and move on. The endpoint.UnlockUser() will process any backlog of
packets before unlocking the socket.

This simplifies the locking inside tcp endpoints a lot. Further the
endpoint.LockUser() implements spinning as long as the lock is not held by
another syscall goroutine. This ensures low latency as not spinning leads to the
task thread being put to sleep if the lock is held by the packet dispatch
path. This is suboptimal as the lower layer rarely holds the lock for long so
implementing spinning here helps.

If the lock is held by another task goroutine then we just proceed to call
LockUser() and the task could be put to sleep.

The protocol goroutines themselves just call e.mu.Lock() and block if the
lock is currently not available.

Updates #231, #357

PiperOrigin-RevId: 301808349
2020-03-19 07:19:58 -07:00
Ian Gudger 92a00ca91a Store segment transmit count.
This will aid in segment reordering detection.

Updates #691

PiperOrigin-RevId: 301692638
2020-03-18 16:26:36 -07:00
Tamir Duberstein ac05043525 Implement heap.Interface on pointer receiver
PiperOrigin-RevId: 300467253
2020-03-11 20:38:05 -07:00
Tamir Duberstein 538e35f61b Fix race condition (*tcp.endpoint).Close
Atomically close the endpoint. Before this change, it was possible for
multiple callers to perform duplicate work.

PiperOrigin-RevId: 300462110
2020-03-11 19:57:25 -07:00
Bhasker Hariharan 81675b850e Fix memory leak in danglingEndpoints.
Endpoints which were being terminated in an ERROR state or were moved to CLOSED
by the worker goroutine do not run cleanupLocked() as that should already be run
by the worker termination. But when making that change we made the mistake of
not removing the endpoint from the danglingEndpoints which is normally done in
cleanupLocked().

As a result these endpoints are leaked since a reference is held to them in the
danglingEndpoints array forever till Stack is torn down.

PiperOrigin-RevId: 300438426
2020-03-11 17:03:57 -07:00
Eyal Soha d5dbe366bf shutdown(s, SHUT_WR) in TIME-WAIT returns ENOTCONN
From RFC 793 s3.9 p61 Event Processing:

CLOSE Call during TIME-WAIT: return with "error: connection closing"

Fixes #1603

PiperOrigin-RevId: 299401353
2020-03-06 11:42:34 -08:00
Ian Gudger 9b3aad33c4 Use a pool of arrays to avoid slice headers from escaping in TCP options pool.
By putting slices into the pool, the slice header escapes. This can be avoided
by not putting the slice header into the pool.

This removes an allocation from the TCP segment send path.

PiperOrigin-RevId: 299215480
2020-03-05 15:56:42 -08:00
Tamir Duberstein 371abe00f0 Avoid memory leaks
Properly discard segments from the segment heap.

PiperOrigin-RevId: 298704074
2020-03-03 15:07:09 -08:00
Ian Gudger c15b8515eb Fix datarace on TransportEndpointInfo.ID and clean up semantics.
Ensures that all access to TransportEndpointInfo.ID is either:
* In a function ending in a Locked suffix.
* While holding the appropriate mutex.

This primary affects the checkV4Mapped method on affected endpoints, which has
been renamed to checkV4MappedLocked. Also document the method and change its
argument to be a value instead of a pointer which had caused some awkwardness.

This race was possible in the udp and icmp endpoints between Connect and uses
of TransportEndpointInfo.ID including in both itself and Bind.

The tcp endpoint did not suffer from this bug, but benefited from better
documentation.

Updates #357

PiperOrigin-RevId: 298682913
2020-03-03 13:42:13 -08:00
Bhasker Hariharan 3310175250 Fix data-race when reading/writing e.amss.
PiperOrigin-RevId: 298451319
2020-03-02 14:45:03 -08:00
Ian Gudger c6bdc6b05b Fix a race in TCP endpoint teardown and teardown the stack in tcp_test.
Call stack.Close on stacks when we are done with them in tcp_test. This avoids
leaking resources and reduces the test's flakiness when race/gotsan is enabled.
It also provides test coverage for the race also fixed in this change, which
can be reliably triggered with the stack.Close change (and without the other
changes) when race/gotsan is enabled.

The race was possible when calling Abort (via stack.Close) on an endpoint
processing a SYN segment as part of a passive connect.

Updates #1564

PiperOrigin-RevId: 297685432
2020-02-27 14:15:44 -08:00
Nayana Bidari abf7ebcd38 Internal change.
PiperOrigin-RevId: 297638665
2020-02-27 11:00:41 -08:00
Bhasker Hariharan d7b7379251 Deflake TestCurrentConnectedIncrement.
TestCurrentConnectedIncrement fails consistently under gotsan due to the sleep
to check metrics is exactly the same as the TIME-WAIT duration. Under gotsan
things can be slow enough that the increment test is done before the protocol
goroutine is run after the TIME-WAIT timer expires and does its cleanup.

Increasing the sleep from 1s to 1.2s makes the test pass consistently.

PiperOrigin-RevId: 297160181
2020-02-25 11:19:34 -08:00
Ian Gudger c37b196455 Add support for tearing down protocol dispatchers and TIME_WAIT endpoints.
Protocol dispatchers were previously leaked. Bypassing TIME_WAIT is required to
test this change.

Also fix a race when a socket in SYN-RCVD is closed. This is also required to
test this change.

PiperOrigin-RevId: 296922548
2020-02-24 10:32:17 -08:00
gVisor bot 56fd9504aa Enable IPV6_RECVTCLASS socket option for datagram sockets
Added the ability to get/set the IP_RECVTCLASS socket option on UDP endpoints.
If enabled, traffic class from the incoming Network Header passed as ancillary
data in the ControlMessages.

Adding Get/SetSockOptBool to decrease the overhead of getting/setting simple
options. (This was absorbed in a CL that will be landing before this one).

Test:
* Added unit test to udp_test.go that tests getting/setting as well as
verifying that we receive expected TOS from incoming packet.
* Added a syscall test for verifying getting/setting
* Removed test skip for existing syscall test to enable end to end test.
PiperOrigin-RevId: 295840218
2020-02-18 15:45:36 -08:00
gVisor bot 69bf39e8a4 Internal change.
PiperOrigin-RevId: 294952610
2020-02-13 10:59:52 -08:00
Adin Scannell 1b6a12a768 Add notes to relevant tests.
These were out-of-band notes that can help provide additional context
and simplify automated imports.

PiperOrigin-RevId: 293525915
2020-02-05 22:46:35 -08:00
Eyal Soha f3d9560703 recv() on a closed TCP socket returns ENOTCONN
From RFC 793 s3.9 p58 Event Processing:

If RECEIVE Call arrives in CLOSED state and the user has access to such a
connection, the return should be "error: connection does not exist"

Fixes #1598

PiperOrigin-RevId: 293494287
2020-02-05 17:56:42 -08:00
Ian Gudger a26a954946 Add socket connection stress test.
Tests 65k connection attempts on common types of sockets to check for port
leaks.

Also fixes a bug where dual-stack sockets wouldn't properly re-queue
segments received while closing.

PiperOrigin-RevId: 293241166
2020-02-04 15:54:49 -08:00
Ian Gudger 02997af5ab Fix method comment to match method name.
PiperOrigin-RevId: 292624867
2020-01-31 15:09:13 -08:00
Ghanan Gowripalan 77bf586db7 Use multicast Ethernet address for multicast NDP
As per RFC 2464 section 7, an IPv6 packet with a multicast destination
address is transmitted to the mapped Ethernet multicast address.

Test:
- ipv6.TestLinkResolution
- stack_test.TestDADResolve
- stack_test.TestRouterSolicitation
PiperOrigin-RevId: 292610529
2020-01-31 13:55:46 -08:00
Bhasker Hariharan 4ee64a248e Fix for panic in endpoint.Close().
When sending a RST on shutdown we need to double check the
state after acquiring the work mutex as the endpoint could
have transitioned out of a connected state from the time
we checked it and we acquired the workMutex.

I added two tests but sadly neither reproduce the panic. I am
going to leave the tests in as they are good to have anyway.

PiperOrigin-RevId: 292393800
2020-01-30 12:00:35 -08:00
Bhasker Hariharan 51b783505b Add support for TCP_DEFER_ACCEPT.
PiperOrigin-RevId: 292233574
2020-01-29 15:53:45 -08:00
Ting-Yu Wang 6b14be4246 Refactor to hide C from channel.Endpoint.
This is to aid later implementation for /dev/net/tun device.

PiperOrigin-RevId: 291746025
2020-01-27 12:31:47 -08:00
Adin Scannell d29e59af9f Standardize on tools directory.
PiperOrigin-RevId: 291745021
2020-01-27 12:21:00 -08:00
Mithun Iyer 7e6fbc6afe Add a new TCP stat for current open connections.
Such a stat accounts for all connections that are currently
established and not yet transitioned to close state.
Also fix bug in double increment of CurrentEstablished stat.

Fixes #1579

PiperOrigin-RevId: 290827365
2020-01-21 15:43:39 -08:00
gVisor bot 5f82f092e7 Merge pull request #1558 from kevinGC:iptables-write-input-drop
PiperOrigin-RevId: 290793754
2020-01-21 12:08:52 -08:00
Eyal Soha 47d85257d3 Filter out received packets with a local source IP address.
CERT Advisory CA-96.21 III. Solution advises that devices drop packets which
could not have correctly arrived on the wire, such as receiving a packet where
the source IP address is owned by the device that sent it.

Fixes #1507

PiperOrigin-RevId: 290378240
2020-01-17 18:26:20 -08:00
Bhasker Hariharan 275ac8ce1d Bugfix to terminate the protocol loop on StateError.
The change to introduce worker goroutines can cause the endpoint
to transition to StateError and we should terminate the loop rather
than let the endpoint transition to a CLOSED state as we do
in case the endpoint enters TIME-WAIT/CLOSED. Moving to a closed
state would cause the actual error to not be propagated to
any read() calls etc.

PiperOrigin-RevId: 289923568
2020-01-15 13:21:50 -08:00
Bhasker Hariharan a611fdaee3 Changes TCP packet dispatch to use a pool of goroutines.
All inbound segments for connections in ESTABLISHED state are delivered to the
endpoint's queue but for every segment delivered we also queue the endpoint for
processing to a selected processor. This ensures that when there are a large
number of connections in ESTABLISHED state the inbound packets are all handled
by a small number of goroutines and significantly reduces the amount of work the
goscheduler has to perform.

We let connections in other states follow the current path where the
endpoint's goroutine directly handles the segments.

Updates #231

PiperOrigin-RevId: 289728325
2020-01-14 14:15:50 -08:00
Tamir Duberstein 50625cee59 Implement {g,s}etsockopt(IP_RECVTOS) for UDP sockets
PiperOrigin-RevId: 289718534
2020-01-14 13:33:23 -08:00
Kevin Krakauer 1c3d3c70b9 Fix test building. 2020-01-13 14:54:32 -08:00
Tamir Duberstein debd213da6 Allow dual stack sockets to operate on AF_INET
Fixes #1490
Fixes #1495

PiperOrigin-RevId: 289523250
2020-01-13 14:47:22 -08:00
Bhasker Hariharan dacd349d6f panic fix in retransmitTimerExpired.
This is a band-aid fix for now to prevent panics.

PiperOrigin-RevId: 289078453
2020-01-10 06:03:02 -08:00
Ian Gudger 27500d529f New sync package.
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.

This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.

Updates #1472

PiperOrigin-RevId: 289033387
2020-01-09 22:02:24 -08:00