802 Commits

Author SHA1 Message Date
Ghanan Gowripalan e0f4e46e34 Resolve static link addresses in GetLinkAddress
If a network address has a static mapping to a link address, calculate
it in GetLinkAddress.

Test: stack_test.TestStaticGetLinkAddress
PiperOrigin-RevId: 353179616
2021-01-21 23:26:40 -08:00
Toshi Kikuchi cfbf209173 iptables: support matching the input interface name
We have support for the output interface name, but not for the input
interface name.
This change adds the support for the input interface name, and adds the
test cases for it.

Fixes #5300

PiperOrigin-RevId: 353179389
2021-01-21 23:19:19 -08:00
gVisor bot 159b86125b Merge release-20210112.0-64-g9f46328e1 (automated) 2021-01-22 04:09:55 +00:00
Ghanan Gowripalan 9f46328e11 Only use callback for GetLinkAddress
GetLinkAddress's callback will be called immediately with a
stack.LinkResolutionResult which will hold the link address
so no need to also return the link address from the function.

Fixes #5151.

PiperOrigin-RevId: 353157857
2021-01-21 19:55:37 -08:00
gVisor bot 3cb95a5af1 Merge release-20210112.0-63-g8ecff1890 (automated) 2021-01-22 00:58:14 +00:00
Ghanan Gowripalan 8ecff18902 Do not cache remote link address in Route
...unless explicitly requested via ResolveWith.

Remove cancelled channels from pending packets as we can use the link
resolution channel in a FIFO to limit the number of maximum pending
resolutions we should queue packets for.

This change also defers starting the goroutine that handles link
resolution completion to when link resolution succeeds, fails or
gets cancelled due to the max number of pending resolutions being
reached.

Fixes #751.

PiperOrigin-RevId: 353130577
2021-01-21 16:40:06 -08:00
gVisor bot 0827b8e6da Merge release-20210112.0-60-g89df5a681 (automated) 2021-01-21 23:09:42 +00:00
Ghanan Gowripalan 89df5a681c Queue packets in WritePackets when resolving link address
Test: integration_test.TestWritePacketsLinkResolution

Fixes #4458.

PiperOrigin-RevId: 353108826
2021-01-21 14:54:14 -08:00
gVisor bot 77c19d832b Merge release-20210112.0-59-g0ca4cf769 (automated) 2021-01-21 22:25:09 +00:00
Ghanan Gowripalan 0ca4cf7698 Populate EgressRoute, GSO, Netproto in NIC
fdbased and qdisc layers expect these fields to already be
populated before being reached.

PiperOrigin-RevId: 353099492
2021-01-21 14:10:37 -08:00
Michaël Lévesque-Dion 9ea1a875eb rewrite diff check to match example in cmp.Diff docs 2021-01-20 14:03:35 -05:00
gVisor bot 49541ed7a0 Merge release-20210112.0-47-g7ff5ceaea (automated) 2021-01-20 01:12:32 +00:00
Ghanan Gowripalan 7ff5ceaeae Do not have a stack-wide linkAddressCache
Link addresses are cached on a per NIC basis so instead of having a
single cache that includes the NIC ID for neighbor entry lookups,
use a single cache per NIC.

PiperOrigin-RevId: 352684111
2021-01-19 16:56:49 -08:00
gVisor bot 3e655adea6 Merge release-20210112.0-45-gbe17b9444 (automated) 2021-01-19 23:22:31 +00:00
Arthur Sfez be17b94446 Per NIC NetworkEndpoint statistics
To facilitate the debugging of multi-homed setups, track network
protocol statistics for each endpoint. Note that the original
stack-wide stats still exist.

A new type of statistic counter is introduced, which tracks two
versions of a stat at the same time. This lets a network endpoint
increment both the local stat and the stack-wide stat at the same
time.

Fixes #4605

PiperOrigin-RevId: 352663276
2021-01-19 15:07:39 -08:00
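
The two-level counter described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the netstack implementation: the type names `counter` and `multiCounter` are hypothetical, and the only idea taken from the commit message is that a single increment updates both the endpoint-local stat and the stack-wide stat.

```go
package main

import "sync/atomic"

// counter is a minimal statistic counter. The name and shape are
// hypothetical, not taken from netstack.
type counter struct {
	n uint64
}

// Increment adds one to the counter atomically.
func (c *counter) Increment() { atomic.AddUint64(&c.n, 1) }

// Value returns the current count.
func (c *counter) Value() uint64 { return atomic.LoadUint64(&c.n) }

// multiCounter (hypothetical) ties an endpoint-local counter to the
// stack-wide counter for the same statistic.
type multiCounter struct {
	local  *counter
	global *counter
}

// Increment bumps both counters, so a caller cannot update one and
// forget the other.
func (m multiCounter) Increment() {
	m.local.Increment()
	m.global.Increment()
}
```

With two endpoints sharing one stack-wide counter, each endpoint's `multiCounter` would report its own traffic locally while the shared counter keeps the total.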
gVisor bot a0340632a1 Merge release-20210112.0-43-ga2ec1932c (automated) 2021-01-19 20:27:01 +00:00
Ghanan Gowripalan a2ec1932c9 Drop CheckLocalAddress from LinkAddressCache
PiperOrigin-RevId: 352623277
2021-01-19 12:08:36 -08:00
gVisor bot cfe357cafa Merge release-20210112.0-41-gf5736fa2b (automated) 2021-01-18 02:27:25 +00:00
Ghanan Gowripalan f5736fa2bf Do not use a stack-wide queue of pending packets
Packets may be pending on link resolution to complete before being sent.
Link resolution is performed for neighbors which are unique to a NIC so
hold link resolution related state under the NIC, not the stack.

Note, this change may result in more queued packets but that is okay as
RFC 4861 section 7.2.2 recommends that the stack maintain a queue of
packets for each neighbor that is waiting for link resolution to
complete, not a fixed limit per stack.

PiperOrigin-RevId: 352322155
2021-01-17 18:14:28 -08:00
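
A per-neighbor pending queue, as opposed to a single fixed limit per stack, might look like the sketch below. Everything here is assumed for illustration: the `pendingPackets` type, the `maxPendingPerNeighbor` limit of 2, the use of strings for packets, and the drop-oldest overflow policy are not taken from the commit.

```go
package main

// maxPendingPerNeighbor is a hypothetical per-neighbor limit.
const maxPendingPerNeighbor = 2

// pendingPackets holds one FIFO of queued packets per neighbor address.
type pendingPackets struct {
	queues map[string][]string
}

// enqueue queues pkt for neighbor. If that neighbor's queue is already
// full, the oldest packet is dropped and returned with dropped == true.
func (p *pendingPackets) enqueue(neighbor, pkt string) (oldest string, dropped bool) {
	if p.queues == nil {
		p.queues = make(map[string][]string)
	}
	q := p.queues[neighbor]
	if len(q) == maxPendingPerNeighbor {
		oldest, dropped = q[0], true
		q = q[1:]
	}
	p.queues[neighbor] = append(q, pkt)
	return oldest, dropped
}

// drain removes and returns everything queued for neighbor in FIFO
// order, for example once link resolution for that neighbor completes.
func (p *pendingPackets) drain(neighbor string) []string {
	q := p.queues[neighbor]
	delete(p.queues, neighbor)
	return q
}
```

Because each neighbor has its own queue, a slow resolution for one neighbor cannot evict packets waiting on another, which is the trade-off the commit message accepts in exchange for more queued packets overall.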
gVisor bot 4d0fcb53e5 Merge release-20210112.0-40-gcd75bb163 (automated) 2021-01-16 03:05:57 +00:00
Ghanan Gowripalan cd75bb163f Resolve known link address on route creation
If a Route is being created through a link that requires link address
resolution and a remote address that has a known mapping to a link
address, populate the link address when the route is created.

This removes the need for neighbor/link address caches to perform this
check.

Fixes #5149

PiperOrigin-RevId: 352122401
2021-01-15 18:49:22 -08:00
gVisor bot 1a5ad08a03 Merge release-20210112.0-39-g2814a032b (automated) 2021-01-16 02:31:32 +00:00
Ghanan Gowripalan 2814a032be Support GetLinkAddress with neighborCache
Test: integration_test.TestGetLinkAddress
PiperOrigin-RevId: 352119404
2021-01-15 18:15:26 -08:00
gVisor bot a22726e7b3 Merge release-20210112.0-38-gfd5b52c87 (automated) 2021-01-16 01:04:25 +00:00
Ghanan Gowripalan fd5b52c87f Only pass stack.Route's fields to LinkEndpoints
stack.Route is used to send network packets and resolve link addresses.
A LinkEndpoint does not need to do either of these and only needs the
route's fields at the time of the packet write request.

Since LinkEndpoints only need the route's fields when writing packets,
pass a stack.RouteInfo instead.

PiperOrigin-RevId: 352108405
2021-01-15 16:49:15 -08:00
Tamir Duberstein 12d9790833 Remove count argument from tcpip.Endpoint.Read
The same intent can be specified via the io.Writer.

PiperOrigin-RevId: 352098747
2021-01-15 15:49:15 -08:00
gVisor bot 6cc587a931 Merge release-20201216.0-106-gc49ce8ca8 (automated) 2021-01-14 01:30:11 +00:00
Ghanan Gowripalan c49ce8ca8a Clear neighbor table on NIC down
Note, this includes static entries to match Linux's behaviour.

```
  $ ip neigh show dev eth0
  192.168.42.1 lladdr fc:ec:da:70:6e:f9 STALE
  $ sudo ip neigh add 192.168.42.172 lladdr 22:33:44:55:66:77 dev eth0
  $ ip neigh show dev eth0
  192.168.42.1 lladdr fc:ec:da:70:6e:f9 STALE
  192.168.42.172 lladdr 22:33:44:55:66:77 PERMANENT
  $ sudo ifconfig eth0 down
  $ ip neigh show dev eth0
  $ sudo ifconfig eth0 up
  $ ip neigh show dev eth0
```

Test: stack_test.TestClearNeighborCacheOnNICDisable
PiperOrigin-RevId: 351696306
2021-01-13 17:12:29 -08:00
gVisor bot fc9aec0925 Merge release-20201216.0-105-g25b5ec713 (automated) 2021-01-14 00:18:26 +00:00
Ghanan Gowripalan 25b5ec7135 Do not resolve remote link address at transport layer
Link address resolution is performed at the link layer (if required) so
we can defer it from the transport layer. When link resolution is
required, packets will be queued and sent once link resolution
completes. If link resolution fails, the transport layer will receive a
control message indicating that the stack failed to route the packet.

tcpip.Endpoint.Write no longer returns a channel now that writes do not
wait for link resolution at the transport layer.

tcpip.ErrNoLinkAddress is no longer used so it is removed.

Removed calls to stack.Route.ResolveWith from the transport layer so
that link resolution is performed when a route is created in response
to an incoming packet (e.g. to complete TCP handshakes or send a RST).

Tests:
- integration_test.TestForwarding
- integration_test.TestTCPLinkResolutionFailure

Fixes #4458

RELNOTES: n/a
PiperOrigin-RevId: 351684158
2021-01-13 16:04:33 -08:00
gVisor bot 43ca8a82cb Merge release-20201216.0-94-ge74aa25e2 (automated) 2021-01-13 04:47:27 +00:00
Ghanan Gowripalan 62b4c2f517 Drop TransportEndpointID from HandleControlPacket
When a control packet is delivered, it is delivered to a transport
endpoint with a matching stack.TransportEndpointID so there is no
need to pass the ID to the endpoint as it already knows its ID.

PiperOrigin-RevId: 351497588
2021-01-12 19:37:05 -08:00
gVisor bot fbc3a3d984 Merge release-20201216.0-87-g4e03e8754 (automated) 2021-01-12 20:47:44 +00:00
Adin Scannell 4e03e87547 Fix simple mistakes identified by goreportcard.
These are primarily simplification and lint mistakes. However, minor
fixes are also included and tests added where appropriate.

PiperOrigin-RevId: 351425971
2021-01-12 12:38:22 -08:00
gVisor bot e524c21569 Merge release-20201216.0-82-g4c4de6644 (automated) 2021-01-11 21:27:53 +00:00
Ting-Yu Wang b1de1da318 netstack: Refactor tcpip.Endpoint.Read
Read now takes a destination io.Writer, count, options. Keeping the method name
Read, in contrast to the Write method.

This enables:
* direct transfer of views under VV
* zero copy

It also eliminates the need for sentry to keep a slice of view because
userspace had requested a read that is smaller than the view returned, removing
the complexity there.

Read/Peek/ReadPacket are now consolidated together and some duplicate code is
removed.

PiperOrigin-RevId: 350636322
2021-01-07 14:17:18 -08:00
gVisor bot 5c21c7c3bd Merge release-20201208.0-89-g3ff7324df (automated) 2020-12-28 22:05:49 +00:00
Nayana Bidari 7c8ba72b02 Move SO_BINDTODEVICE to socketops.
PiperOrigin-RevId: 348696094
2020-12-22 14:44:02 -08:00
Peter Johnston fee2cd640f Invoke address resolution upon subsequent traffic to Failed neighbor
Removes the period of time in which subsequent traffic to a Failed neighbor
immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which
address resolution fails; or in other words, the neighbor's IP address cannot
be translated to a MAC address.

This means removing the Failed state for linkAddrCache and allowing transitions
out of Failed into Incomplete for neighborCache. Previously, both caches would
transition entries to Failed after address resolution fails. In this state, any
subsequent traffic requested within an unreachable time would immediately fail
with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3:

  If address resolution fails, the entry SHOULD be deleted, so that subsequent
  traffic to that neighbor invokes the next-hop determination procedure again.
  Invoking next-hop determination at this point ensures that alternate default
  routers are tried.

The API for getting a link address for a given address, whether through the link
address cache or the neighbor table, is updated to optionally take a callback
which will be called when address resolution completes. This allows `Route` to
handle completing link resolution internally, so callers of (*Route).Resolve
(e.g. endpoints) don’t have to keep track of when it completes and update the
Route accordingly.

This change also removes the wakers from LinkAddressCache, NeighborCache, and
Route in favor of the callbacks, and callers that previously used a waker can
now just pass a callback to (*Route).Resolve that will notify the waker on
resolution completion.

Fixes #4796

Startblock:
  has LGTM from sbalana
  and then
  add reviewer ghanan
PiperOrigin-RevId: 348597478
2020-12-22 01:37:05 -08:00
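
The callback-based lookup described in the commit above can be illustrated with a small sketch. It is not the netstack API: `neighborTable`, `resolutionResult`, `get`, and `complete` are hypothetical names, and addresses are plain strings. The sketch shows two behaviors from the message: callers pass a callback instead of waiting on a waker, and a failed resolution leaves no entry behind, so later traffic starts resolution again.

```go
package main

import "sync"

// resolutionResult is a hypothetical stand-in for the value handed to
// resolution callbacks.
type resolutionResult struct {
	LinkAddr string
	Success  bool
}

// neighborTable (hypothetical) maps network addresses to link addresses.
type neighborTable struct {
	mu      sync.Mutex
	known   map[string]string
	pending map[string][]func(resolutionResult)
}

func newNeighborTable() *neighborTable {
	return &neighborTable{
		known:   make(map[string]string),
		pending: make(map[string][]func(resolutionResult)),
	}
}

// get calls onResolve right away if addr is already resolved. Otherwise
// the callback is queued until complete is called for addr.
func (t *neighborTable) get(addr string, onResolve func(resolutionResult)) {
	t.mu.Lock()
	if link, ok := t.known[addr]; ok {
		t.mu.Unlock()
		onResolve(resolutionResult{LinkAddr: link, Success: true})
		return
	}
	t.pending[addr] = append(t.pending[addr], onResolve)
	t.mu.Unlock()
}

// complete finishes resolution for addr. An empty link means failure; in
// that case nothing is stored, so the next get queues a fresh attempt.
func (t *neighborTable) complete(addr, link string) {
	t.mu.Lock()
	callbacks := t.pending[addr]
	delete(t.pending, addr)
	if link != "" {
		t.known[addr] = link
	}
	t.mu.Unlock()

	// Run callbacks outside the lock so they may call back into the table.
	res := resolutionResult{LinkAddr: link, Success: link != ""}
	for _, cb := range callbacks {
		cb(res)
	}
}
```

A caller that previously blocked on a waker could instead pass a callback that notifies the waker, which is the migration path the commit message describes.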
Ghanan Gowripalan 620de250a4 Prefer matching labels and longest matching prefix
...when performing source address selection for IPv6.

These are defined in RFC 6724 section 5 rule 6 (prefer matching label)
and rule 8 (use longest matching prefix).

This change also considers ULA of global scope instead of its own scope,
as per RFC 6724 section 3.1:

   Also, note that ULAs are considered as global, not
   site-local, scope but are handled via the prefix policy table as
   discussed in Section 10.6.

Test: stack_test.TestIPv6SourceAddressSelectionScope

Startblock:
  has LGTM from peterjohnston
  and then
  add reviewer brunodalbo
PiperOrigin-RevId: 348580996
2020-12-21 22:26:10 -08:00
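
As a rough illustration of the longest-matching-prefix rule mentioned in the commit above, the sketch below picks the candidate source address sharing the most leading bits with the destination. It covers only that one rule and simplifies it: RFC 6724 also bounds the comparison by the source address's prefix length, and the earlier rules (scope, labels, and so on) are ignored. The function names are hypothetical.

```go
package main

import (
	"math/bits"
	"net"
)

// commonPrefixLen returns how many leading bits a and b share, treating
// both as 128-bit addresses.
func commonPrefixLen(a, b net.IP) int {
	a, b = a.To16(), b.To16()
	if a == nil || b == nil {
		return 0
	}
	for i := range a {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return 8 * net.IPv6len
}

// pickSource returns the candidate that shares the longest prefix with
// dst, or "" if dst or every candidate fails to parse. Ties keep the
// earliest candidate.
func pickSource(dst string, candidates ...string) string {
	d := net.ParseIP(dst)
	if d == nil {
		return ""
	}
	best, bestLen := "", -1
	for _, c := range candidates {
		ip := net.ParseIP(c)
		if ip == nil {
			continue
		}
		if n := commonPrefixLen(ip, d); n > bestLen {
			best, bestLen = c, n
		}
	}
	return best
}
```

For a destination of `2001:db8:1::1`, a candidate in `2001:db8:1::/48` would win over one in `2001:db8:2::/48` because it shares more leading bits.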
Tamir Duberstein 4640fc4f35 Remove duplicate `return`
PiperOrigin-RevId: 347974624
2020-12-17 00:40:33 -08:00
Nayana Bidari 0c92b3782a Add support to count the number of packets SACKed.
sacked_out is required in RACK to check the number of duplicate
acknowledgements while updating the reorder window. If there is no reordering
and the value for sacked_out is greater than the classic threshold value 3,
then the reorder window is set to zero.
It is calculated by counting the number of segments sacked in the ACK and is
reduced when a cumulative ACK is received which covers the SACK blocks. This
value is set to zero when the connection enters recovery.

PiperOrigin-RevId: 347872246
2020-12-16 12:19:21 -08:00
gVisor bot b0f23fb7e0 Merge release-20201208.0-46-g25ebddbdd (automated) 2020-12-15 17:59:18 +00:00
Ting-Yu Wang 25ebddbddf Fix a data race in packetEPs
packetEPs may get into a state where `len < cap`, causing append() to modify the
original slice storage.

Reported-by: syzbot+978dd0e9c2600ab7a76b@syzkaller.appspotmail.com
PiperOrigin-RevId: 347634351
2020-12-15 09:55:40 -08:00
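
The hazard named in the commit above is a general property of Go slices and can be shown in isolation. The sketch below is not the packetEPs code or its actual fix; `appendShared` and `appendClipped` are hypothetical helpers contrasting an append that may write into shared storage with one that is forced to allocate.

```go
package main

// appendShared appends directly to s. When len(s) < cap(s), the new
// element is written into the existing backing array, which other
// holders of the same slice may also be using.
func appendShared(s []int, v int) []int {
	return append(s, v)
}

// appendClipped limits the capacity to the length with a full slice
// expression, so append has to allocate a new backing array and the
// original storage is left untouched.
func appendClipped(s []int, v int) []int {
	return append(s[:len(s):len(s)], v)
}
```

If two goroutines call something like `appendShared` on the same slice value without synchronization, both write to the same array slot, which is a data race. Clipping the capacity (or copying) before appending is one common way to avoid that.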
Bruno Dal Bo 4aef908c92 Introduce IPv6 extension header serialization facilities
Adds IPv6 extension header serializer and Hop by Hop options serializer.
Add RouterAlert option serializer and use it in MLD.

Fixed #4996

Startblock:
  has LGTM from marinaciocea
  and then
  add reviewer ghanan
PiperOrigin-RevId: 347174537
2020-12-12 09:07:44 -08:00
Ayush Ranjan af4afdc0e0 [netstack] Decouple tcpip.ControlMessages from the IP control messages.
tcpip.ControlMessages can not contain Linux specific structures which makes it
painful to convert back and forth from Linux to tcpip back to Linux when passing
around control messages in hostinet and raw sockets.

Now we convert to the Linux version of the control message as soon as we are
out of tcpip.

PiperOrigin-RevId: 347027065
2020-12-11 10:33:58 -08:00
gVisor bot ecaee47bf9 Merge release-20201130.0-80-g53a95ad0d (automated) 2020-12-10 22:56:34 +00:00
Ghanan Gowripalan 53a95ad0df Use specified source address for IGMP/MLD packets
This change also considers interfaces and network endpoints enabled up
to the point all work to disable them is complete. This was needed
so that protocols can perform shutdown work while being disabled (e.g.
sending a packet which requires the endpoint to be enabled to obtain a
source address).

Bug #4682, #4861
Fixes #4888

Startblock:
  has LGTM from peterjohnston
  and then
  add reviewer brunodalbo
PiperOrigin-RevId: 346869702
2020-12-10 14:50:20 -08:00
gVisor bot 8299d30640 Merge release-20201130.0-46-gdf2dbe3e3 (automated) 2020-12-05 06:07:30 +00:00
Ghanan Gowripalan df2dbe3e38 Remove stack.ReadOnlyAddressableEndpointState
Startblock:
  has LGTM from asfez
  and then
  add reviewer tamird
PiperOrigin-RevId: 345815146
2020-12-04 22:04:23 -08:00
gVisor bot 588cab496f Merge release-20201130.0-39-gfd28ccfaa (automated) 2020-12-04 18:14:07 +00:00
Bruno Dal Bo fd28ccfaa4 Introduce IPv4 options serializer and add RouterAlert to IGMP
PiperOrigin-RevId: 345701623
2020-12-04 10:10:56 -08:00
gVisor bot fa51e6c93b Merge release-20201130.0-31-g3ff1aef54 (automated) 2020-12-03 16:57:47 +00:00
Peter Johnston 3ff1aef544 Make `stack.Route` thread safe
Currently we rely on the user to take the lock on the endpoint that owns the
route, in order to modify it safely. We can instead move
`Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and
thread-safe access to other fields of `Route`.

PiperOrigin-RevId: 345461586
2020-12-03 08:54:24 -08:00
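
The locking arrangement described in the commit above might be sketched as follows. The `route` type here is a hypothetical, cut-down stand-in for `stack.Route`: fields that never change after creation are read without locking, and only the mutable remote link address sits behind the route's own mutex.

```go
package main

import "sync"

// route is a hypothetical, simplified stand-in for stack.Route.
type route struct {
	// localAddr and remoteAddr are set once at creation and never
	// modified, so they can be read without holding a lock.
	localAddr  string
	remoteAddr string

	mu sync.RWMutex
	// remoteLinkAddr may change as link resolution completes; it is
	// guarded by mu.
	remoteLinkAddr string
}

// RemoteLinkAddress returns the current remote link address.
func (r *route) RemoteLinkAddress() string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.remoteLinkAddr
}

// SetRemoteLinkAddress records a newly resolved link address.
func (r *route) SetRemoteLinkAddress(addr string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.remoteLinkAddr = addr
}
```

With this shape, callers no longer need to hold the owning endpoint's lock just to read or update the link address.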
Adin Scannell 80552b936d Support partitions for other tests.
PiperOrigin-RevId: 345399936
2020-12-03 01:00:21 -08:00
Arthur Sfez bdaae08ee2 Extract ICMPv4/v6 specific stats to their own types
This change lets us split the v4 stats from the v6 stats, which will be
useful when adding stats for each network endpoint.

PiperOrigin-RevId: 345322615
2020-12-02 15:17:20 -08:00
Ghanan Gowripalan 41675ebc63 Deflake stack_test.TestRouterSolicitation
...by using the fake clock.

TestRouterSolicitation no longer runs its sub-tests in parallel now that
the sub-tests are not long-running - the fake clock simulates time
moving forward.

PiperOrigin-RevId: 345165794
2020-12-01 21:58:28 -08:00
gVisor bot c00b31f29b Merge release-20201117.0-88-g0c4973942 (automated) 2020-12-02 05:38:56 +00:00
Ghanan Gowripalan 0c49739422 Correctly lock when listing neighbor entries
PiperOrigin-RevId: 345162450
2020-12-01 21:34:52 -08:00
gVisor bot 0f56168db9 Merge release-20201117.0-84-g25570ac4f (automated) 2020-12-01 16:04:16 +00:00
Ghanan Gowripalan 25570ac4f3 Track join count in multicast group protocol state
Before this change, the join count and the state for IGMP/MLD was held
across different types which required multiple locks to be held when
accessing a multicast group's state.

Bug #4682, #4861
Fixes #4916

PiperOrigin-RevId: 345019091
2020-12-01 07:52:40 -08:00
gVisor bot e236d392b1 Merge release-20201117.0-78-ge81300866 (automated) 2020-11-30 22:27:48 +00:00
Ghanan Gowripalan e813008664 Perform IGMP/MLD when the NIC is enabled/disabled
Test: ip_test.TestMGPWithNICLifecycle

Bug #4682, #4861

PiperOrigin-RevId: 344888091
2020-11-30 14:24:47 -08:00
Ayush Ranjan ad83112423 [netstack] Add SOL_TCP options to SocketOptions.
Ports the following options:
- TCP_NODELAY
- TCP_CORK
- TCP_QUICKACK

Also deletes the {Get/Set}SockOptBool interface methods from all implementations

PiperOrigin-RevId: 344378824
2020-11-26 00:43:13 -08:00
Ayush Ranjan bebadb5182 [netstack] Add SOL_IP and SOL_IPV6 options to SocketOptions.
We will use SocketOptions for all kinds of options, not just SOL_SOCKET options
because (1) it is consistent with Linux which defines all option variables on
the top level socket struct, (2) avoid code complexity. Appropriate checks
have been added for matching option level to the endpoint type.

Ported the following options to this new utility:
- IP_MULTICAST_LOOP
- IP_RECVTOS
- IPV6_RECVTCLASS
- IP_PKTINFO
- IP_HDRINCL
- IPV6_V6ONLY

Changes in behavior (these are consistent with what Linux does AFAICT):
- Now IP_MULTICAST_LOOP can be set for TCP (earlier it was a noop) but does not
  affect the endpoint itself.
- We can now getsockopt IP_HDRINCL (earlier we would get an error).
- Now we return ErrUnknownProtocolOption if SOL_IP or SOL_IPV6 options are used
  on unix sockets.
- Now we return ErrUnknownProtocolOption if SOL_IPV6 options are used on non
  AF_INET6 endpoints.

This change additionally makes the following modifications:
- Add State() uint32 to commonEndpoint because both tcpip.Endpoint and
  transport.Endpoint interfaces have it. It proves to be quite useful.
- Gets rid of SocketOptionsHandler.IsListening(). It was an anomaly as it was
  not a handler. It is now implemented on netstack itself.
- Gets rid of tcp.endpoint.EndpointInfo and directly embeds
  stack.TransportEndpointInfo. There was an unnecessary level of embedding
  which served no purpose.
- Removes some checks in dual_stack_test.go that used the errors from
  GetSockOptBool(tcpip.V6OnlyOption) to confirm some state. This is not
  consistent with the new design and also seemed to be testing the
  implementation instead of behavior.

PiperOrigin-RevId: 344354051
2020-11-25 20:01:10 -08:00
Ghanan Gowripalan bc81fcceda Support listener-side MLDv1
...as defined by RFC 2710. Querier (router)-side MLDv1 is not yet
supported.

The core state machine is shared with IGMPv2.

This is guarded behind a flag (ipv6.Options.MLDEnabled).

Tests: ip_test.TestMGP*

Bug #4861

PiperOrigin-RevId: 344344095
2020-11-25 18:00:41 -08:00
gVisor bot 2442e44c4b Merge release-20201109.0-117-g2485a4e2c (automated) 2020-11-25 22:56:14 +00:00
Ghanan Gowripalan 2485a4e2cb Make stack.Route safe to access concurrently
Multiple goroutines may use the same stack.Route concurrently so
the stack.Route should make sure that any functions called on it
are thread-safe.

Fixes #4073

PiperOrigin-RevId: 344320491
2020-11-25 14:52:59 -08:00
gVisor bot 7e4b4bcc8c Merge release-20201109.0-114-g99f2d0ea2 (automated) 2020-11-24 23:40:55 +00:00
Sam Balana 99f2d0ea2f Correctly lock when removing neighbor entries
Fix a panic when two entries in Failed state are removed at the same time.

PiperOrigin-RevId: 344143777
2020-11-24 15:37:47 -08:00
gVisor bot 4dc3fa9c2e Merge release-20201109.0-112-gf90ab60a8 (automated) 2020-11-24 22:25:59 +00:00
Sam Balana f90ab60a8a Track number of packets queued to Failed neighbors
Add a NIC-specific neighbor table statistic so we can determine how many
packets have been queued to Failed neighbors, indicating an unhealthy local
network. This change assists us to debug in-field issues where subsequent
traffic to a neighbor fails.

Fixes #4819

PiperOrigin-RevId: 344131119
2020-11-24 14:22:03 -08:00
Ghanan Gowripalan 1de08889df Deduplicate code in ipv6.protocol
PiperOrigin-RevId: 344009602
2020-11-24 01:19:42 -08:00
gVisor bot f6c627bdbc Merge release-20201109.0-95-gfbc4a8dbd (automated) 2020-11-20 02:18:35 +00:00
Ryan Heacock fbc4a8dbd1 Perform IGMPv2 when joining IPv4 multicast groups
Added headers, stats, checksum parsing capabilities from RFC 2236 describing
IGMPv2.

IGMPv2 state machine is implemented for each condition, sending and receiving
IGMP Membership Reports and Leave Group messages with backwards compatibility
with IGMPv1 routers.

Test:
* Implemented igmp header parser and checksum calculator in header/igmp_test.go
* ipv4/igmp_test.go tests incoming and outgoing IGMP messages and pathways.
* Added unit test coverage for IGMPv2 RFC behavior + IGMPv1 backwards
   compatibility in ipv4/igmp_test.go.

Fixes #4682

PiperOrigin-RevId: 343408809
2020-11-19 18:15:25 -08:00
gVisor bot f733d88998 Merge release-20201109.0-90-g209a95a35 (automated) 2020-11-19 23:14:35 +00:00
Fabricio Voznika 209a95a35a Propagate IP address prefix from host to netstack
Closes #4022

PiperOrigin-RevId: 343378647
2020-11-19 15:11:17 -08:00
gVisor bot 7dcd014bcf Merge release-20201109.0-88-g27ee4fe76 (automated) 2020-11-19 19:51:34 +00:00
Ghanan Gowripalan 27ee4fe76a Don't hold AddressEndpoints for multicast addresses
Group addressable endpoints can simply check if it has joined the
multicast group without maintaining address endpoints. This also
helps remove the dependency on AddressableEndpoint from
GroupAddressableEndpoint.

Now that group addresses are not tracked with address endpoints, we can
avoid accidentally obtaining a route with a multicast local address.

PiperOrigin-RevId: 343336912
2020-11-19 11:48:15 -08:00
gVisor bot 3f09108ecd Merge release-20201109.0-83-g93750a600 (automated) 2020-11-19 04:25:26 +00:00
Ghanan Gowripalan 93750a600b Remove unused methods from stack.Route
PiperOrigin-RevId: 343211553
2020-11-18 20:22:20 -08:00
Ayush Ranjan df37babd57 [netstack] Move SO_REUSEPORT and SO_REUSEADDR option to SocketOptions.
This change also introduces:
- `SocketOptionsHandler` interface which can be implemented by endpoints to
  handle endpoint specific behavior on SetSockOpt. This is analogous to what
  Linux does.
- `DefaultSocketOptionsHandler` which is a default implementation of the above.
  This is embedded in all endpoints so that we don't have to uselessly
  implement empty functions. Endpoints with specific behavior can override the
  embedded method by manually defining their own implementation.

PiperOrigin-RevId: 343158301
2020-11-18 14:36:41 -08:00
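
The embed-a-default-and-override pattern described in the commit above can be sketched in a few lines. All identifiers below are hypothetical and lowercase to make clear they are not the real netstack types; the sketch only assumes that a handler interface is notified when an option is set.

```go
package main

// socketOptionsHandler is a hypothetical, cut-down handler interface.
type socketOptionsHandler interface {
	onReuseAddressSet(v bool)
	onReusePortSet(v bool)
}

// defaultSocketOptionsHandler provides no-op implementations so that
// endpoints only define the hooks they care about.
type defaultSocketOptionsHandler struct{}

func (defaultSocketOptionsHandler) onReuseAddressSet(bool) {}
func (defaultSocketOptionsHandler) onReusePortSet(bool)    {}

// udpEndpoint embeds the default handler and overrides nothing.
type udpEndpoint struct {
	defaultSocketOptionsHandler
}

// tcpEndpoint embeds the default handler but overrides one hook.
type tcpEndpoint struct {
	defaultSocketOptionsHandler
	reusePortCalls int
}

// onReusePortSet shadows the embedded no-op with endpoint-specific behavior.
func (e *tcpEndpoint) onReusePortSet(bool) { e.reusePortCalls++ }

// socketOptions stores option values and notifies the endpoint's handler.
type socketOptions struct {
	handler   socketOptionsHandler
	reusePort bool
}

// setReusePort records the value and then runs the endpoint hook.
func (so *socketOptions) setReusePort(v bool) {
	so.reusePort = v
	so.handler.onReusePortSet(v)
}
```

Go resolves the outer type's method before the embedded one, which is what lets `tcpEndpoint` replace a single hook while still satisfying the full interface through the embedded default.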
gVisor bot 98281c81d8 Merge release-20201109.0-75-g60b97bfda (automated) 2020-11-18 20:49:30 +00:00
Ghanan Gowripalan 60b97bfda6 Fix loopback subnet routing error
Packets should be properly routed when sending packets to addresses
in the loopback subnet which are not explicitly assigned to the loopback
interface.

Tests:
- integration_test.TestLoopbackAcceptAllInSubnetUDP
- integration_test.TestLoopbackAcceptAllInSubnetTCP
PiperOrigin-RevId: 343135643
2020-11-18 12:45:57 -08:00
gVisor bot 64b61fd1d2 Merge release-20201109.0-69-g9d148627f (automated) 2020-11-18 15:09:28 +00:00
Bruno Dal Bo 9d148627f8 Introduce stack.WritePacketToRemote, remove LinkEndpoint.WriteRawPacket
Redefine stack.WritePacket into stack.WritePacketToRemote which lets the NIC
decide whether to append link headers.

PiperOrigin-RevId: 343071742
2020-11-18 07:05:59 -08:00
gVisor bot 06e77263c3 Merge release-20201109.0-51-gcc5cfce4c (automated) 2020-11-16 22:39:16 +00:00
Ghanan Gowripalan cc5cfce4c6 Remove ARP address workaround
- Make AddressableEndpoint optional for NetworkEndpoint.
Not all NetworkEndpoints need to support addressing (e.g. ARP), so
AddressableEndpoint should only be implemented for protocols that
support addressing such as IPv4 and IPv6.

With this change, tcpip.ErrNotSupported will be returned by the stack
when attempting to modify addresses on a network endpoint that does
not support addressing.

Now that packets are fully handled at the network layer, and (with this
change) addresses are optional for network endpoints, we no longer need
the workaround for ARP where a fake ARP address was added to each NIC
that performs ARP so that packets would be delivered to the ARP layer.

PiperOrigin-RevId: 342722547
2020-11-16 14:36:10 -08:00
gVisor bot b21b9a28dc Merge release-20201030.0-92-g839dd9700 (automated) 2020-11-13 22:02:57 +00:00
Nayana Bidari 839dd97008 RACK: Detect DSACK
Detect if the ACK is a duplicate and update in RACK.

PiperOrigin-RevId: 342332569
2020-11-13 13:59:43 -08:00
Nayana Bidari 5bb64ce1b8 Refactor SOL_SOCKET options
Store all the socket level options in a struct and call {Get/Set}SockOpt on
this struct. This will avoid implementing socket level options on all
endpoints. This CL contains implementing one socket level option for tcp and
udp endpoints.

PiperOrigin-RevId: 342203981
2020-11-12 22:57:00 -08:00
gVisor bot 77f5e9c854 Merge release-20201030.0-80-g638d64c63 (automated) 2020-11-13 02:42:15 +00:00
Julian Elischer 638d64c633 Change AllocationSize to SizeWithPadding as requested
RELNOTES: n/a
PiperOrigin-RevId: 342176296
2020-11-12 18:38:43 -08:00
gVisor bot a247d2f0e8 Merge release-20201030.0-74-g1a972411b (automated) 2020-11-13 01:36:38 +00:00
Ghanan Gowripalan 1a972411b3 Move packet handling to NetworkEndpoint
The NIC should not hold network-layer state or logic - network packet
handling/forwarding should be performed at the network layer instead
of the NIC.

Fixes #4688

PiperOrigin-RevId: 342166985
2020-11-12 17:33:21 -08:00
gVisor bot 60cccae0c7 Merge release-20201030.0-68-g9c4102896 (automated) 2020-11-11 19:02:45 +00:00
Julian Elischer 9c4102896d Teach netstack how to add options to IPv4 packets
Most packets don't have options but they are an integral part of the
standard. Teaching the ipv4 code how to handle them will simplify future
testing and use.  Because Options are so rare it is worth making sure
that the extra work is kept out of the fast path as much as possible.

Prior to this change, all usages of the IHL field of the IPv4Fields/Encode
system set it to the same constant value except in a couple of tests
for bad values. From this change IHL will not be a constant as it will
depend on the size of any Options. Since ipv4.Encode() now handles the
options it becomes a possible source of errors to let the callers set
this value, so remove it entirely and calculate the value from the size
of the Options if present (or not) therefore guaranteeing a correct value.

Fixes #4709
RELNOTES: n/a
PiperOrigin-RevId: 341864765
2020-11-11 10:59:35 -08:00
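
Deriving the IHL from the options, rather than letting callers set it, could look like the sketch below. This is an illustration based on the IPv4 header layout (a 20-byte fixed header, options padded to a 32-bit boundary, and a 4-bit IHL counted in 32-bit words), not the netstack code; the constant and function names are hypothetical.

```go
package main

import "errors"

const (
	// ipv4MinimumSize is the size in bytes of an IPv4 header with no options.
	ipv4MinimumSize = 20
	// ipv4MaximumHeaderSize is the largest header the 4-bit IHL field can
	// describe: 15 words of 4 bytes each.
	ipv4MaximumHeaderSize = 60
)

var errOptionsTooLong = errors.New("IPv4 options do not fit in the header")

// paddedOptionsLen rounds n up to the next multiple of 4, since the
// header length is expressed in whole 32-bit words.
func paddedOptionsLen(n int) int {
	return (n + 3) &^ 3
}

// ipv4HeaderWords returns the IHL value for a header carrying
// optionsLen bytes of options.
func ipv4HeaderWords(optionsLen int) (uint8, error) {
	if optionsLen < 0 {
		return 0, errOptionsTooLong
	}
	total := ipv4MinimumSize + paddedOptionsLen(optionsLen)
	if total > ipv4MaximumHeaderSize {
		return 0, errOptionsTooLong
	}
	return uint8(total / 4), nil
}
```

Computing the value in one place means a header with no options always gets an IHL of 5, and a caller cannot encode a length that disagrees with the options actually written.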
gVisor bot 66c2a24303 Merge release-20201030.0-59-g2fcca60a7 (automated) 2020-11-09 22:01:25 +00:00
Andrei Vagin 2fcca60a7b net: connect to the ipv4 localhost returns ENETUNREACH if the address isn't set
cl/340002915 modified the code to return EADDRNOTAVAIL if connect
is called for a localhost address which isn't set.

But actually, Linux returns EADDRNOTAVAIL for ipv6 addresses and ENETUNREACH
for ipv4 addresses.

Updates #4735

PiperOrigin-RevId: 341479129
2020-11-09 13:57:51 -08:00
gVisor bot bcebd1a3ae Merge release-20201030.0-38-g06e33cd73 (automated) 2020-11-06 03:18:50 +00:00
Bhasker Hariharan 06e33cd737 Cache addressEndpoint.addr.Subnet() to avoid allocations.
This change adds a Subnet() method to AddressableEndpoint so that we
can avoid repeated calls to AddressableEndpoint.AddressWithPrefix().Subnet().

Updates #231

PiperOrigin-RevId: 340969877
2020-11-05 19:12:09 -08:00
gVisor bot 1e0cc55833 Merge release-20201030.0-34-g8c0701462 (automated) 2020-11-06 00:01:49 +00:00
Ghanan Gowripalan 8c0701462a Use stack.Route exclusively for writing packets
* Remove stack.Route from incoming packet path.
There is no need to pass around a stack.Route during the incoming path
of a packet. Instead, pass around the packet's link/network layer
information in the packet buffer since all layers may need this
information.

* Support address bound and outgoing packet NIC in routes.
When forwarding is enabled, the source address of a packet may be bound
to a different interface than the outgoing interface. This change
updates stack.Route to hold both NICs so that one can be used to write
packets while the other is used to check if the route's bound address
is valid. Note, we need to hold the address's interface so we can check
if the address is a spoofed address.

* Introduce the concept of a local route.
Local routes are routes where the packet never needs to leave the stack;
the destination is stack-local. We can now route between interfaces
within a stack if the packet never needs to leave the stack, even when
forwarding is disabled.

* Always obtain a route from the stack before sending a packet.
If a packet needs to be sent in response to an incoming packet, a route
must be obtained from the stack to ensure the stack is configured to
send packets to the packet's source from the packet's destination.

* Enable spoofing if a stack may send packets from unowned addresses.
This change required changes to some netgophers since previously,
promiscuous mode was enough to let the netstack respond to all
incoming packets regardless of the packet's destination address. Now
that a stack.Route is not held for each incoming packet, finding a route
may fail with local addresses we don't own but accepted packets for
while in promiscuous mode. Since we also want to be able to send from
any address (in response to the received promiscuous mode packets), we need
to enable spoofing.

* Skip transport layer checksum checks for locally generated packets.
If a packet is locally generated, the stack can safely assume that no
errors were introduced while being locally routed since the packet is
never sent out the wire.

Some bugs fixed:
- transport layer checksum was never calculated after NAT.
- handleLocal didn't handle routing across interfaces.
- stack didn't support forwarding across interfaces.
- always consult the routing table before creating an endpoint.

Updates #4688
Fixes #3906

PiperOrigin-RevId: 340943442
2020-11-05 15:52:16 -08:00
gVisor bot ae4df8d6c4 Merge release-20201027.0-54-gc22067d3d (automated) 2020-11-03 03:24:20 +00:00
Sam Balana c22067d3d4 Send NUD probes in a separate goroutine
Send NUD probes in another goroutine to free the thread of execution for
finishing the state transition. This is necessary to avoid deadlock where
sending and processing probes are done in the same call stack, such as loopback
and integration tests.

Fixes #4701

PiperOrigin-RevId: 340362481
2020-11-02 19:20:47 -08:00
gVisor bot 2929b6f1f8 Merge release-20201019.0-115-gdf88f223b (automated) 2020-10-31 08:22:56 +00:00
Andrei Vagin df88f223bb net/tcpip: connect to unset loopback address has to return EADDRNOTAVAIL
In the docker container, the ipv6 loopback address is not set,
and connect("::1") has to return EADDRNOTAVAIL in this case.

Without this fix, it returns EHOSTUNREACH.

PiperOrigin-RevId: 340002915
2020-10-31 01:19:40 -07:00
gVisor bot e0bd5098f6 Merge release-20201019.0-112-gba05c6845 (automated) 2020-10-30 22:03:24 +00:00
Dean Deng ba05c6845d Automated rollback of changelist 339750876
PiperOrigin-RevId: 339945377
2020-10-30 14:58:43 -07:00
gVisor bot a862d297fd Merge release-20201019.0-104-ga86f988a8 (automated) 2020-10-29 21:52:51 +00:00
Dean Deng a86f988a87 Automated rollback of changelist 339675182
PiperOrigin-RevId: 339750876
2020-10-29 14:48:08 -07:00
gVisor bot 8f9a789489 Merge release-20201019.0-103-g181fea0b5 (automated) 2020-10-29 21:38:24 +00:00
Kevin Krakauer 181fea0b58 Make RedirectTarget thread safe
Fixes #4613.

PiperOrigin-RevId: 339746784
2020-10-29 14:28:56 -07:00
gVisor bot 322b3f8d1e Merge release-20201019.0-101-g02fe467b4 (automated) 2020-10-29 19:34:29 +00:00
Kevin Krakauer 02fe467b47 Keep magic constants out of netstack
PiperOrigin-RevId: 339721152
2020-10-29 12:22:21 -07:00
gVisor bot b67450538c Merge release-20201019.0-99-g1f0f687cb (automated) 2020-10-29 15:50:56 +00:00
Dean Deng 1f0f687cbe Delay goroutine creation during TCP handshake for accept/connect.
Refactor TCP handshake code so that when connect is initiated, the initial SYN
is sent before creating a goroutine to handle the rest of the handshake (which
blocks). Similarly, the initial SYN-ACK is sent inline when SYN is received
during accept.

Some additional cleanup is done as well.

Eventually we would like to complete connections in the dispatcher without
requiring a wakeup to complete the handshake. This refactor makes that easier.

Updates #231

PiperOrigin-RevId: 339675182
2020-10-29 08:46:04 -07:00
gVisor bot a70cf7136a Merge release-20201019.0-89-gb26797a8d (automated) 2020-10-28 20:04:54 +00:00
Tamir Duberstein b26797a8d5 Avoid time.Now in NUD
Use the stack clock instead. Change NeighborEntry.UpdatedAt to
UpdatedAtNanos.

PiperOrigin-RevId: 339520566
2020-10-28 13:01:56 -07:00
gVisor bot 6b463fe2b7 Merge release-20201019.0-76-g035b1c827 (automated) 2020-10-28 02:35:11 +00:00
Julian Elischer 035b1c8272 Add support for Timestamp and RecordRoute IP options
IPv4 options extend the size of the IP header and have a basic known
format. The framework can process that format without needing to know
about every possible option. We can add more code to handle additional
option types as we need them. Bad options or mangled option entries
can result in ICMP Parameter Problem packets. The first types we
support are the Timestamp option and the Record Route option, included
in this change.
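
As a rough illustration of the "basic known format" mentioned above, the sketch below walks an IPv4 options area without knowing every option type. It is not the code from this change: parseOptions, option, and errMalformed are hypothetical names, and returning an error merely stands in for whatever the stack does on a bad option (such as generating an ICMP Parameter Problem). The option type values (0, 1, 7, 68) are the standard IPv4 assignments.

```go
package main

import (
	"errors"
	"fmt"
)

// Standard IPv4 option type values.
const (
	optEnd         = 0  // End of Option List, single byte
	optNOP         = 1  // No Operation, single byte
	optRecordRoute = 7  // Record Route
	optTimestamp   = 68 // Internet Timestamp
)

// option is a hypothetical container for one parsed option.
type option struct {
	Type byte
	Data []byte // bytes following the type and length octets
}

var errMalformed = errors.New("malformed IPv4 option")

// parseOptions walks the options area. Every option other than End and NOP
// carries a length octet covering the type, the length, and the data, so
// unknown types can be skipped without understanding their contents.
func parseOptions(b []byte) ([]option, error) {
	var opts []option
	for i := 0; i < len(b); {
		t := b[i]
		switch t {
		case optEnd:
			return opts, nil
		case optNOP:
			opts = append(opts, option{Type: t})
			i++
			continue
		}
		if i+1 >= len(b) {
			return nil, errMalformed // length octet missing
		}
		l := int(b[i+1])
		if l < 2 || i+l > len(b) {
			return nil, errMalformed // length too small or past the end
		}
		opts = append(opts, option{Type: t, Data: b[i+2 : i+l]})
		i += l
	}
	return opts, nil
}

func main() {
	// NOP, then a Record Route option with one 4-byte slot, then End.
	raw := []byte{optNOP, optRecordRoute, 7, 4, 192, 168, 1, 1, optEnd}
	opts, err := parseOptions(raw)
	if err != nil {
		panic(err)
	}
	for _, o := range opts {
		fmt.Printf("type=%d data=%v\n", o.Type, o.Data)
	}
}
```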

The options are processed at several points in the packet flow within
the Network stack, with slightly different requirements. The framework
includes a mechanism to control this at each point. Support has been
added for such points which are only present in upcoming CLs such as
during packet forwarding and fragmentation.

With this change, 'ping -R' and 'ping -T' work against gVisor and Fuchsia.

$ ping -R 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 56(124) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=0.990 ms
NOP
RR:     192.168.1.1
        192.168.1.2
        192.168.1.1

$ ping -T tsprespec 192.168.1.2 192.168.1.1 192.168.1.2
PING 192.168.1.2 (192.168.1.2) 56(124) bytes of data.
64 bytes from 192.168.1.2: icmp_seq=1 ttl=64 time=1.20 ms
TS:     192.168.1.2    71486821 absolute
        192.168.1.1    746

Unit tests included for generic options, Timestamp options
and Record Route options.

PiperOrigin-RevId: 339379076
2020-10-27 19:32:09 -07:00
gVisor bot b6aec86b75 Merge release-20201019.0-70-g4d9066d1d (automated) 2020-10-27 22:47:58 +00:00
Tamir Duberstein 4d9066d1d7 Pass NeighborEntry in NUD callbacks
...instead of passing its fields piecemeal.

PiperOrigin-RevId: 339345899
2020-10-27 15:45:06 -07:00
gVisor bot 800a6842c3 Merge release-20201019.0-68-g59e2c9f16 (automated) 2020-10-27 07:21:15 +00:00
Ian Lewis 59e2c9f16a Add basic address deletion to netlink
Updates #3921

PiperOrigin-RevId: 339195417
2020-10-27 00:18:10 -07:00
gVisor bot 6431f29561 Merge release-20201019.0-43-g8db147b55 (automated) 2020-10-23 19:44:13 +00:00
Sam Balana 8db147b554 Wait before transitioning NUD entries from Probe to Failed
Wait an additional RetransmitTimer duration after the last probe before
transitioning to Failed. The previous implementation transitions immediately to
Failed after sending the last probe, which is erroneous behavior.

PiperOrigin-RevId: 338723794
2020-10-23 12:33:12 -07:00
gVisor bot addf7ba238 Merge release-20201019.0-36-gdad08229b (automated) 2020-10-23 17:35:54 +00:00
Ghanan Gowripalan dad08229b8 Do not hold NIC local address in neighbor entries
Previously, the NIC local address used when completing link resolution
was held in the neighbor entry. A neighbor is not identified by any
NIC local address so remove it.

PiperOrigin-RevId: 338699695
2020-10-23 10:31:44 -07:00
gVisor bot 12129e3c1a Merge release-20201019.0-31-gc1a6ba06a (automated) 2020-10-23 00:08:36 +00:00
Ghanan Gowripalan c1a6ba06ab Pass NetworkInterface to LinkAddressRequest
Previously a link endpoint was passed to
stack.LinkAddressResolver.LinkAddressRequest. With this change,
implementations that want a route for the link address request may
find one through the stack. Other implementations that want to send
a packet without a route may continue to do so using the network
interface directly.

Test: - arp_test.TestLinkAddressRequest
      - ipv6.TestLinkAddressRequest
PiperOrigin-RevId: 338577474
2020-10-22 17:02:29 -07:00
gVisor bot 0cf756f5cc Merge release-20201005.0-113-g2bfdbfd1f (automated) 2020-10-20 23:12:15 +00:00
Ghanan Gowripalan 2bfdbfd1fd Fix locking in AddressableEndpointState
PiperOrigin-RevId: 338156438
2020-10-20 16:06:30 -07:00
Ting-Yu Wang 4da10f873e Fix nogo tests.
//pkg/tcpip/stack:stack_x_test_nogo
//pkg/tcpip/transport/raw:raw_nogo

PiperOrigin-RevId: 338153265
2020-10-20 15:47:48 -07:00
gVisor bot 55f093bcb3 Merge release-20201005.0-95-gdffa4c669 (automated) 2020-10-16 20:57:20 +00:00
Ghanan Gowripalan dffa4c6690 Don't include link header when forwarding packets
Before this change, if a link header was included in an incoming packet
that is forwarded, the packet that gets sent out will take the original
packet and add a link header to it while keeping the old link header.
This would make the sent packet look like:

   OUTGOING LINK HDR | INCOMING LINK HDR | NETWORK HDR | ...

Obviously this is incorrect as we should drop the incoming link header
and only include the outgoing link header. This change fixes this bug.
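
The intended layout can be sketched with plain byte slices. This is only an illustration, not the netstack packet buffer API: forwardFrame and its parameters are hypothetical, and the link header length is assumed to be known by the caller.

```go
package main

import "fmt"

// forwardFrame is a hypothetical helper. It drops the incoming link header
// from frame and prepends outLinkHdr, so the result is
// OUTGOING LINK HDR | NETWORK HDR | ...
func forwardFrame(frame []byte, inLinkHdrLen int, outLinkHdr []byte) ([]byte, error) {
	if inLinkHdrLen < 0 || inLinkHdrLen > len(frame) {
		return nil, fmt.Errorf("link header length %d out of range for %d-byte frame", inLinkHdrLen, len(frame))
	}
	// Keep everything from the network header onward.
	payload := frame[inLinkHdrLen:]
	out := make([]byte, 0, len(outLinkHdr)+len(payload))
	out = append(out, outLinkHdr...)
	out = append(out, payload...)
	return out, nil
}

func main() {
	in := []byte("IN-HDR|NET|DATA")
	out, err := forwardFrame(in, len("IN-HDR|"), []byte("OUT-HDR|"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", out)
}
```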

Test: integration_test.TestForwarding
PiperOrigin-RevId: 337571447
2020-10-16 13:54:00 -07:00
gVisor bot 2a411eafcd Merge release-20201005.0-83-g3269cefd6 (automated) 2020-10-15 22:40:25 +00:00
Sam Balana 3269cefd6f Process NAs without target link-layer addresses
RFC 4861 section 4.4 notes that the Target link-layer address option is
sometimes optional in a Neighbor Advertisement packet:

  "When responding to a unicast Neighbor Solicitation this option SHOULD be
  included."

Tests:
 pkg/tcpip/stack:stack_test
 - TestEntryStaleToReachableWhenSolicitedConfirmationWithoutAddress
 - TestEntryDelayToReachableWhenSolicitedConfirmationWithoutAddress
 - TestEntryProbeToReachableWhenSolicitedConfirmationWithoutAddress
 pkg/tcpip/network/ipv6:ipv6_test
 - TestCallsToNeighborCache
PiperOrigin-RevId: 337396493
2020-10-15 15:37:01 -07:00
gVisor bot 6057fda878 Merge release-20200928.0-117-g6e6a9d3f3 (automated) 2020-10-14 22:33:03 +00:00
Ghanan Gowripalan 6e6a9d3f3d Find route before sending NA response
This change also brings back the stack.Route.ResolveWith method so that
we can immediately resolve a route when sending an NA in response to an
NS with a source link layer address option.

Test: ipv6_test.TestNeighorSolicitationResponse
PiperOrigin-RevId: 337185461
2020-10-14 15:29:47 -07:00
gVisor bot 69aa120d40 Merge release-20200928.0-78-g743327817 (automated) 2020-10-09 19:26:05 +00:00
gVisor bot 578aece760 Merge release-20200928.0-77-g257703c05 (automated) 2020-10-09 19:17:35 +00:00
Ghanan Gowripalan 257703c050 Automated rollback of changelist 336304024
PiperOrigin-RevId: 336339194
2020-10-09 12:09:12 -07:00
gVisor bot 48606fbf85 Merge release-20200928.0-74-g8566decab (automated) 2020-10-09 16:14:24 +00:00
Bhasker Hariharan 8566decab0 Automated rollback of changelist 336185457
PiperOrigin-RevId: 336304024
2020-10-09 09:11:18 -07:00
gVisor bot 13c73e720f Merge release-20200928.0-73-g07b1d7413 (automated) 2020-10-09 00:39:26 +00:00
Ghanan Gowripalan 07b1d7413e Only block resolution when NUD is incomplete
When a completed entry exists for a neighbor, there is no need to block
while reachability is (re)confirmed. The stack should continue to use
the neighbor's link address while NUD is performed.

Test: stack_test.TestNeighborCacheReplace
PiperOrigin-RevId: 336199043
2020-10-08 17:34:28 -07:00
gVisor bot 0482839af3 Merge release-20200928.0-71-g6768e6c59 (automated) 2020-10-08 23:23:14 +00:00
Ghanan Gowripalan 6768e6c59e Do not resolve routes immediately
When a response needs to be sent to an incoming packet, the stack should
consult its neighbor table to determine the remote address's link
address.

When an entry does not exist in the stack's neighbor table, the stack
should queue the packet while link resolution completes. See comments.

PiperOrigin-RevId: 336185457
2020-10-08 16:15:59 -07:00
gVisor bot 8f70c8003e Merge release-20200928.0-66-ga55bd73d4 (automated) 2020-10-08 01:32:17 +00:00