Commit Graph

406 Commits

Author SHA1 Message Date
Ghanan Gowripalan a7a1f00425 Support upgrading expired/removed IPv6 addresses to permanent SLAAC addresses
If a previously added IPv6 address (statically or via SLAAC) was removed, it
would be left in an expired state waiting to be cleaned up if any references to
it were still held. During this time, the same address could be regenerated via
SLAAC, which should be allowed. This change supports this scenario.

When upgrading an endpoint from temporary or permanentExpired to permanent,
respect the new configuration type (static or SLAAC) and deprecated status,
along with the new PrimaryEndpointBehavior (which was already supported).

Test: stack.TestAutoGenAddrAfterRemoval
PiperOrigin-RevId: 289990168
2020-01-15 20:23:06 -08:00
Ghanan Gowripalan 815df2959a Solicit IPv6 routers when a NIC becomes enabled as a host
This change adds support to send NDP Router Solicitation messages when a NIC
becomes enabled as a host, as per RFC 4861 section 6.3.7.

Note, Router Solicitations will only be sent when the stack has forwarding
disabled.

Tests: Unittests to make sure that the initial Router Solicitations are sent
as configured. The tests also validate the sent Router Solicitations' fields.
PiperOrigin-RevId: 289964095
2020-01-15 17:10:48 -08:00
Bhasker Hariharan 275ac8ce1d Bugfix to terminate the protocol loop on StateError.
The change to introduce worker goroutines can cause the endpoint
to transition to StateError and we should terminate the loop rather
than let the endpoint transition to a CLOSED state as we do
in case the endpoint enters TIME-WAIT/CLOSED. Moving to a closed
state would cause the actual error to not be propagated to
any read() calls etc.

PiperOrigin-RevId: 289923568
2020-01-15 13:21:50 -08:00
Bhasker Hariharan a611fdaee3 Changes TCP packet dispatch to use a pool of goroutines.
All inbound segments for connections in ESTABLISHED state are delivered to the
endpoint's queue but for every segment delivered we also queue the endpoint for
processing to a selected processor. This ensures that when there are a large
number of connections in ESTABLISHED state the inbound packets are all handled
by a small number of goroutines and significantly reduces the amount of work the
goscheduler has to perform.

We let connections in other states follow the current path where the
endpoint's goroutine directly handles the segments.

Updates #231

PiperOrigin-RevId: 289728325
2020-01-14 14:15:50 -08:00
Tamir Duberstein 50625cee59 Implement {g,s}etsockopt(IP_RECVTOS) for UDP sockets
PiperOrigin-RevId: 289718534
2020-01-14 13:33:23 -08:00
Ghanan Gowripalan 1ad8381eac Do Source Address Selection when choosing an IPv6 source address
Do Source Address Selection when choosing an IPv6 source address as per RFC 6724
section 5 rules 1-3:
1) Prefer same address
2) Prefer appropriate scope
3) Avoid deprecated addresses.

A later change will update Source Address Selection to follow rules 4-8.

Tests:
Rule 1 & 2: stack.TestIPv6SourceAddressSelectionScopeAndSameAddress,
Rule 3:     stack.TestAutoGenAddrTimerDeprecation,
            stack.TestAutoGenAddrDeprecateFromPI
PiperOrigin-RevId: 289559373
2020-01-13 17:58:28 -08:00
Tamir Duberstein debd213da6 Allow dual stack sockets to operate on AF_INET
Fixes #1490
Fixes #1495

PiperOrigin-RevId: 289523250
2020-01-13 14:47:22 -08:00
gVisor bot b30cfb1df7 Merge pull request #1528 from kevinGC:iptables-write
PiperOrigin-RevId: 289479774
2020-01-13 11:26:26 -08:00
Ghanan Gowripalan d27208463e Automated rollback of changelist 288990597
PiperOrigin-RevId: 289169518
2020-01-10 14:58:47 -08:00
Ghanan Gowripalan bcedf6a8e4 Put CancellableTimer tests in the tcpip_test package
CancellableTimer tests were in a timer_test package but lived within the
tcpip directory. This caused issues with go tools.

PiperOrigin-RevId: 289166345
2020-01-10 14:32:17 -08:00
Bhasker Hariharan dacd349d6f panic fix in retransmitTimerExpired.
This is a band-aid fix for now to prevent panics.

PiperOrigin-RevId: 289078453
2020-01-10 06:03:02 -08:00
Ian Gudger 27500d529f New sync package.
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.

This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.

Updates #1472

PiperOrigin-RevId: 289033387
2020-01-09 22:02:24 -08:00
gVisor bot b08da42285 Merge pull request #1523 from majek:fix-1522-silly-window-rx
PiperOrigin-RevId: 289019953
2020-01-09 19:35:27 -08:00
Ghanan Gowripalan 26c5653bb5 Inform NDPDispatcher when Stack learns about available configurations via DHCPv6
Inform the Stack's NDPDispatcher when it receives an NDP Router Advertisement
that updates the available configurations via DHCPv6. The Stack makes sure that
its NDPDispatcher isn't informed unless the avaiable configurations via DHCPv6
for a NIC is updated.

Tests: Test that a Stack's NDPDispatcher is informed when it receives an NDP
Router Advertisement that informs it of new configurations available via DHCPv6.
PiperOrigin-RevId: 289001283
2020-01-09 16:56:28 -08:00
Ghanan Gowripalan 8fafd3142e Separate NDP tests into its own package
Internal tools timeout after 60s during tests that are required to pass before
changes can be submitted. Separate out NDP tests into its own package to help
prevent timeouts when testing.

PiperOrigin-RevId: 288990597
2020-01-09 15:56:44 -08:00
Eyal Soha 8643933d6e Change BindToDeviceOption to store NICID
This makes it possible to call the sockopt from go even when the NIC has no
name.

PiperOrigin-RevId: 288955236
2020-01-09 13:07:53 -08:00
Bert Muthalaly e752ddbb72 Allow clients to store an opaque NICContext with NICs
...retrievable later via stack.NICInfo().

Clients of this library can use it to add metadata that should be tracked
alongside a NIC, to avoid having to keep a map[tcpip.NICID]metadata mirroring
stack.Stack's nic map.

PiperOrigin-RevId: 288924900
2020-01-09 10:46:01 -08:00
Ghanan Gowripalan d057871f41 CancellableTimer to encapsulate the work of safely stopping timers
Add a new CancellableTimer type to encapsulate the work of safely stopping
timers when it fires at the same time some "related work" is being handled. The
term "related work" is some work that needs to be done while having obtained
some common lock (L).

Example: Say we have an invalidation timer that may be extended or cancelled by
some event. Creating a normal timer and simply cancelling may not be sufficient
as the timer may have already fired when the event handler attemps to cancel it.
Even if the timer and event handler obtains L before doing work, once the event
handler releases L, the timer will eventually obtain L and do some unwanted
work.

To prevent the timer from doing unwanted work, it checks if it should early
return instead of doing the normal work after obtaining L. When stopping the
timer callers must have L locked so the timer can be safely informed that it
should early return.

Test: Tests that CancellableTimer fires and resets properly. Test to make sure
the timer fn is not called after being stopped within the lock L.
PiperOrigin-RevId: 288806984
2020-01-08 17:50:54 -08:00
Kevin Krakauer ae060a63d9 More GH comments. 2020-01-08 17:30:08 -08:00
Tamir Duberstein d530df2f95 Introduce tcpip.SockOptBool
...and port V6OnlyOption to it.

PiperOrigin-RevId: 288789451
2020-01-08 15:40:48 -08:00
Bert Muthalaly e21c584056 Combine various Create*NIC methods into CreateNICWithOptions.
PiperOrigin-RevId: 288779416
2020-01-08 14:50:49 -08:00
Tamir Duberstein a271bccfc6 Rename tcpip.SockOpt{,Int}
PiperOrigin-RevId: 288772878
2020-01-08 14:20:07 -08:00
Kevin Krakauer 446a250996 Comment cleanup. 2020-01-08 11:20:48 -08:00
Kevin Krakauer 1e1921e2ac Minor fixes to comments and logging 2020-01-08 11:15:46 -08:00
Tamir Duberstein 9df018767c Remove redundant function argument
PacketLooping is already a member on the passed Route.

PiperOrigin-RevId: 288721500
2020-01-08 10:22:51 -08:00
Kevin Krakauer 8cc1c35bbd Write simple ACCEPT rules to the filter table.
This gets us closer to passing the iptables tests and opens up iptables
so it can be worked on by multiple people.

A few restrictions are enforced for security (i.e. we don't want to let
users write a bunch of iptables rules and then just not enforce them):

- Only the filter table is writable.
- Only ACCEPT rules with no matching criteria can be added.
2020-01-08 10:08:14 -08:00
Bert Muthalaly 0cc1e74b57 Add NIC.isLoopback()
...enabling us to remove the "CreateNamedLoopbackNIC" variant of
CreateNIC and all the plumbing to connect it through to where the value
is read in FindRoute.

PiperOrigin-RevId: 288713093
2020-01-08 09:30:20 -08:00
Marek Majkowski c276e4740f Fix #1522 - implement silly window sydrome protection on rx side
Before, each of small read()'s that raises window either from zero
or above threshold of aMSS, would generate an ACK. In a classic
silly-window-syndrome scenario, we can imagine a pessimistic case
when small read()'s generate a stream of ACKs.

This PR fixes that, essentially treating window size < aMSS as zero.
We send ACK exactly in a moment when window increases to >= aMSS
or half of receive buffer size (whichever smaller).
2020-01-08 12:56:39 +00:00
Ghanan Gowripalan 4e19d165cc Support deprecating SLAAC addresses after the preferred lifetime
Support deprecating network endpoints on a NIC. If an endpoint is deprecated, it
should not be used for new connections unless a more preferred endpoint is not
available, or unless the deprecated endpoint was explicitly requested.

Test: Test that deprecated endpoints are only returned when more preferred
endpoints are not available and SLAAC addresses are deprecated after its
preferred lifetime
PiperOrigin-RevId: 288562705
2020-01-07 13:42:08 -08:00
Marek Majkowski 08a97a6d19 #1398 - send ACK when available buffer space gets larger than 1 MSS
When receiving data, netstack avoids sending spurious acks. When
user does recv() should netstack send ack telling the sender that
the window was increased? It depends. Before this patch, netstack
_will_ send the ack in the case when window was zero or window >>
scale was zero. Basically - when recv space increased from zero.

This is not working right with silly-window-avoidance on the sender
side. Some network stacks refuse to transmit segments, that will fill
the window but are below MSS. Before this patch, this confuses
netstack. On one hand if the window was like 3 bytes, netstack
will _not_ send ack if the window increases. On the other hand
sending party will refuse to transmit 3-byte packet.

This patch changes that, making netstack will send an ACK when
the available buffer size increases to or above 1*MSS. This will
inform other party buffer is large enough, and hopefully uncork it.


Signed-off-by: Marek Majkowski <marek@cloudflare.com>
2020-01-07 20:35:39 +00:00
Ghanan Gowripalan 2031cc4701 Disable auto-generation of IPv6 link-local addresses for loopback NICs
Test: Test that an IPv6 link-local address is not auto-generated for loopback
NICs, even when it is enabled for non-loopback NICS.
PiperOrigin-RevId: 288519591
2020-01-07 10:09:39 -08:00
Ghanan Gowripalan 8dfd922840 Pass the NIC-internal name to the NIC name function when generating opaque IIDs
Pass the NIC-internal name to the NIC name function when generating opaque IIDs
so implementations can use the name that was provided when the NIC was created.
Previously, explicit NICID to NIC name resolution was required from the netstack
integrator.

Tests: Test that the name provided when creating a NIC is passed to the NIC name
function when generating opaque IIDs.
PiperOrigin-RevId: 288395359
2020-01-06 16:21:31 -08:00
Ghanan Gowripalan 83ab47e87b Use opaque interface identifiers when generating IPv6 addresses via SLAAC
Support using opaque interface identifiers when generating IPv6 addresses via
SLAAC when configured to do so.

Note, this change does not handle retries in response to DAD conflicts yet.
That will also come in a later change.

Test: Test that when SLAAC addresses are generated, they use opaque interface
identifiers when configured to do so.
PiperOrigin-RevId: 288078605
2020-01-03 18:28:22 -08:00
Ghanan Gowripalan d1d878a801 Support generating opaque interface identifiers as defined by RFC 7217
Support generating opaque interface identifiers as defined by RFC 7217 for
auto-generated IPv6 link-local addresses. Opaque interface identifiers will also
be used for IPv6 addresses auto-generated via SLAAC in a later change.

Note, this change does not handle retries in response to DAD conflicts yet.
That will also come in a later change.

Tests: Test that when configured to generated opaque IIDs, they are properly
generated as outlined by RFC 7217.
PiperOrigin-RevId: 288035349
2020-01-03 13:00:19 -08:00
gVisor bot 87e4d03fdf Automated rollback of changelist 287029703
PiperOrigin-RevId: 287217899
2019-12-26 13:05:52 -08:00
Ryan Heacock e013c48c78 Enable IP_RECVTOS socket option for datagram sockets
Added the ability to get/set the IP_RECVTOS socket option on UDP endpoints. If
enabled, TOS from the incoming Network Header passed as ancillary data in the
ControlMessages.

Test:
* Added unit test to udp_test.go that tests getting/setting as well as
verifying that we receive expected TOS from incoming packet.
* Added a syscall test
PiperOrigin-RevId: 287029703
2019-12-24 08:49:39 -08:00
Ghanan Gowripalan 5bc4ae9d57 Clear any host-specific NDP state when becoming a router
This change supports clearing all host-only NDP state when NICs become routers.
All discovered routers, discovered on-link prefixes and auto-generated addresses
will be invalidated when becoming a router. This is because normally, routers do
not process Router Advertisements to discover routers or on-link prefixes, and
do not do SLAAC.

Tests: Unittest to make sure that all discovered routers, discovered prefixes
and auto-generated addresses get invalidated when transitioning from a host to
a router.
PiperOrigin-RevId: 286902309
2019-12-23 08:55:25 -08:00
Kevin Krakauer 08c39e2587 Change TODO to track correct bug.
PiperOrigin-RevId: 286639163
2019-12-20 14:19:21 -08:00
Andrei Vagin 57ce26c0b4 net/tcp: allow to call listen without bind
When listen(2) is called on an unbound socket, the socket is
automatically bound to a random free port with the local address
set to INADDR_ANY.

PiperOrigin-RevId: 286305906
2019-12-18 18:24:17 -08:00
Ghanan Gowripalan 8e6e87f8e8 Allow 'out-of-line' routing table updates for Router and Prefix discovery events
This change removes the requirement that a new routing table be provided when a
router or prefix discovery event happens so that an updated routing table may
be provided to the stack at a later time from the event.

This change is to address the use case where the netstack integrator may need to
obtain a lock before providing updated routes in response to the events above.

As an example, say we have an integrator that performs the below two operations
operations as described:
A. Normal route update:
  1. Obtain integrator lock
  2. Update routes in the integrator
  3. Call Stack.SetRouteTable with the updated routes
    3.1. Obtain Stack lock
    3.2. Update routes in Stack
    3.3. Release Stack lock
  4. Release integrator lock
B. NDP event triggered route update:
  1. Obtain Stack lock
  2. Call event handler
    2.1. Obtain integrator lock
    2.2. Update routes in the integrator
    2.3. Release integrator lock
    2.4. Return updated routes to update Stack
  3. Update routes in Stack
  4. Release Stack lock

A deadlock may occur if a Normal route update was attemped at the same time an
NDP event triggered route update was attempted. With threads T1 and T2:
1) T1 -> A.1, A.2
2) T2 -> B.1
3) T1 -> A.3 (hangs at A.3.1 since Stack lock is taken in step 2)
4) T2 -> B.2 (hangs at B.2.1 since integrator lock is taken in step 1)

Test: Existing tests were modified to not provide or expect routing table
changes in response to Router and Prefix discovery events.
PiperOrigin-RevId: 286274712
2019-12-18 15:18:33 -08:00
Ghanan Gowripalan 628948b1e1 Cleanup NDP Tests
This change makes sure that test variables are captured before running tests
in parallel, and removes unneeded buffered channel allocations. This change also
removes unnecessary timeouts.

PiperOrigin-RevId: 286255066
2019-12-18 14:24:39 -08:00
gVisor bot 3f4d8fefb4 Internal change.
PiperOrigin-RevId: 286003946
2019-12-17 10:10:06 -08:00
Ghanan Gowripalan ad80dcf470 Properly generate the EUI64 interface identifier from an Ethernet address
Fixed a bug where the interface identifier was not properly generated from an
Ethernet address.

Tests: Unittests to make sure the functions generating the EUI64 interface
identifier are correct.
PiperOrigin-RevId: 285494562
2019-12-13 16:41:41 -08:00
Bhasker Hariharan 6fc9f0aefd Add support for TCP_USER_TIMEOUT option.
The implementation follows the linux behavior where specifying
a TCP_USER_TIMEOUT will cause the resend timer to honor the
user specified timeout rather than the default rto based timeout.

Further it alters when connections are timedout due to keepalive
failures. It does not alter the behavior of when keepalives are
sent. This is as per the linux behavior.

PiperOrigin-RevId: 285099795
2019-12-11 17:52:53 -08:00
Michael Pratt 0d027262e0 Add additional packages to go branch
We're missing several packages that runsc doesn't depend on. Most notable are
several tcpip link packages.

To find packages, I looked at a diff of directories on master vs go:

$ bazel build //:gopath
$ find bazel-bin/gopath/src/gvisor.dev/gvisor/ -type d > /tmp/gopath.txt
$ find . -type d > /tmp/master.txt
$ sed 's|bazel-bin/gopath/src/gvisor.dev/gvisor/||' < /tmp/gopath.txt > /tmp/gopath.trunc.txt
$ sed 's|./||' < /tmp/master.txt > /tmp/master.trunc.txt
$ vimdiff /tmp/gopath.trunc.txt /tmp/master.trunc.txt

Testing packages are still left out because :gopath can't depend on testonly
targets...

PiperOrigin-RevId: 285049029
2019-12-11 14:22:36 -08:00
Ghanan Gowripalan 4ff71b5be4 Inform the integrator on receipt of an NDP Recursive DNS Server option
This change adds support to let an integrator know when it receives an NDP
Router Advertisement message with the NDP Recursive DNS Server option with at
least one DNS server's address. The stack will not maintain any state related to
the DNS servers - the integrator is expected to maintain any required state and
invalidate the servers after its valid lifetime expires, or refresh the lifetime
when a new one is received for a known DNS server.

Test: Unittest to make sure that an event is sent to the integrator when an NDP
Recursive DNS Server option is received with at least one address.
PiperOrigin-RevId: 284890502
2019-12-10 18:05:23 -08:00
Ian Gudger 18af75db9d Add UDP SO_REUSEADDR support to the port manager.
Next steps include adding support to the transport demuxer and the UDP endpoint.

PiperOrigin-RevId: 284652151
2019-12-09 15:53:00 -08:00
Mithun Iyer b1d44be7ad Add TCP stats for connection close and keep-alive timeouts.
Fix bugs in updates to TCP CurrentEstablished stat.

Fixes #1277

PiperOrigin-RevId: 284292459
2019-12-06 17:17:33 -08:00
Bhasker Hariharan 3e84777d2e Fix flakiness in tcp_test.
This change marks the socket as ESTABLISHED and creates the receiver and sender
the moment we send the final ACK in case of an active TCP handshake or when we
receive the final ACK for a passive TCP handshake. Before this change there was
a short window in which an ACK can be received and processed but the state on
the socket is not yet ESTABLISHED.

This can be seen in TestConnectBindToDevice which is flaky because sometimes
the socket is in SYN-SENT and not ESTABLISHED even though the other side has
already received the final ACK of the handshake.

PiperOrigin-RevId: 284277713
2019-12-06 15:46:26 -08:00
Ghanan Gowripalan ab3f7bc393 Do IPv6 Stateless Address Auto-Configuration (SLAAC)
This change allows the netstack to do SLAAC as outlined by RFC 4862 section 5.5.

Note, this change will not break existing uses of netstack as the default
configuration for the stack options is set in such a way that SLAAC
will not be performed. See `stack.Options` and `stack.NDPConfigurations` for
more details.

This change reuses 1 option and introduces a new one that is required to take
advantage of SLAAC, all available under NDPConfigurations:
- HandleRAs: Whether or not NDP RAs are processes
- AutoGenGlobalAddresses: Whether or not SLAAC is performed.

Also note, this change does not deprecate SLAAC generated addresses after the
preferred lifetime. That will come in a later change (b/143713887). Currently,
only the valid lifetime is honoured.

Tests: Unittest to make sure that SLAAC generates and adds addresses only when
configured to do so. Tests also makes sure that conflicts with static addresses
do not modify the static address.
PiperOrigin-RevId: 284265317
2019-12-06 14:41:30 -08:00