Commit Graph

93 Commits

Author SHA1 Message Date
Ghanan Gowripalan c19e049f2c Check local address directly through NIC
Network endpoints that wish to check addresses on another NIC-local
network endpoint may now do so through the NetworkInterface.

This fixes a lock ordering issue between NIC removal and link
resolution. Before this change:

  NIC Removal takes the stack lock, neighbor cache lock then neighbor
  entries' locks.

  When performing IPv4 link resolution, we take the entry lock then ARP
  would try check IPv4 local addresses through the stack which tries to
  obtain the stack's lock.

Now that ARP can check IPv4 addreses through the NIC, we avoid the lock
ordering issue, while also removing the need for stack to lookup the
NIC.

PiperOrigin-RevId: 356034245
2021-02-06 09:09:19 -08:00
Ghanan Gowripalan 24416032ab Refactor locally delivered packets
Make it clear that failing to parse a looped back is not a packet
sending error but a malformed received packet error.

FindNetworkEndpoint returns nil when no network endpoint is found
instead of an error.

PiperOrigin-RevId: 355954946
2021-02-05 16:47:11 -08:00
Ghanan Gowripalan 4ee8cf8734 Use different neighbor tables per network endpoint
This stores each protocol's neighbor state separately.

This change also removes the need for each neighbor entry to keep
track of their own link address resolver now that all the entries
in a cache will use the same resolver.

PiperOrigin-RevId: 354818155
2021-01-31 11:33:46 -08:00
Ghanan Gowripalan daeb06d2cb Hide neighbor table kind from NetworkEndpoint
The network endpoint should not need to have logic to handle different
kinds of neighbor tables. Network endpoints can let the NIC know about
differnt neighbor discovery messages and let the NIC decide which table
to update.

This allows us to remove the LinkAddressCache interface.

PiperOrigin-RevId: 354812584
2021-01-31 10:03:46 -08:00
Ghanan Gowripalan 2d90bc5480 Implement LinkAddressResolver on NetworkEndpoints
This removes the need to provide the link address request with the NIC
the request is being performed on since the NetworkEndpoints already
have a reference to the NIC.

PiperOrigin-RevId: 354721940
2021-01-30 11:37:29 -08:00
Tamir Duberstein 8d1afb4185 Change tcpip.Error to an interface
This makes it possible to add data to types that implement tcpip.Error.
ErrBadLinkEndpoint is removed as it is unused.

PiperOrigin-RevId: 354437314
2021-01-28 17:59:58 -08:00
gVisor bot 102a9b893f Merge pull request #4705 from mlevesquedion:fix-cmp-diff-reporting-in-nud-tests
PiperOrigin-RevId: 354187603
2021-01-27 15:37:20 -08:00
Arthur Sfez 39db3b9355 Add per endpoint ARP statistics
The ARP stat NetworkUnreachable was removed, and was replaced by
InterfaceHasNoLocalAddress. No stats are recorded when dealing with an
missing endpoint (ErrNotConnected) (because if there is no endpoint,
there is no valid per-endpoint stats).

PiperOrigin-RevId: 353759462
2021-01-25 16:52:05 -08:00
Arthur Sfez 18ebec0ec9 Refactor GetMainNICAddress
It previously returned an error but it could only be UnknownNICID. It now
returns a boolean to indicate whether the nic exists or not.

PiperOrigin-RevId: 353337489
2021-01-22 16:12:12 -08:00
Michaël Lévesque-Dion 9ea1a875eb rewrite diff check to match example in cmp.Diff docs 2021-01-20 14:03:35 -05:00
Ghanan Gowripalan 7ff5ceaeae Do not have a stack-wide linkAddressCache
Link addresses are cached on a per NIC basis so instead of having a
single cache that includes the NIC ID for neighbor entry lookups,
use a single cache per NIC.

PiperOrigin-RevId: 352684111
2021-01-19 16:56:49 -08:00
Arthur Sfez be17b94446 Per NIC NetworkEndpoint statistics
To facilitate the debugging of multi-homed setup, track Network
protocols statistics for each endpoint. Note that the original
stack-wide stats still exist.

A new type of statistic counter is introduced, which track two
versions of a stat at the same time. This lets a network endpoint
increment both the local stat and the stack-wide stat at the same
time.

Fixes #4605

PiperOrigin-RevId: 352663276
2021-01-19 15:07:39 -08:00
Ghanan Gowripalan a2ec1932c9 Drop CheckLocalAddress from LinkAddressCache
PiperOrigin-RevId: 352623277
2021-01-19 12:08:36 -08:00
Ghanan Gowripalan fd5b52c87f Only pass stack.Route's fields to LinkEndpoints
stack.Route is used to send network packets and resolve link addresses.
A LinkEndpoint does not need to do either of these and only needs the
route's fields at the time of the packet write request.

Since LinkEndpoints only need the route's fields when writing packets,
pass a stack.RouteInfo instead.

PiperOrigin-RevId: 352108405
2021-01-15 16:49:15 -08:00
Arthur Sfez 833516c139 Add stats for ARP
Fixes #4963

Startblock:
  has LGTM from sbalana
  and then
  add reviewer ghanan
PiperOrigin-RevId: 351886320
2021-01-14 15:16:24 -08:00
Peter Johnston fee2cd640f Invoke address resolution upon subsequent traffic to Failed neighbor
Removes the period of time in which subseqeuent traffic to a Failed neighbor
immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which
address resolution fails; or in other words, the neighbor's IP address cannot
be translated to a MAC address.

This means removing the Failed state for linkAddrCache and allowing transitiong
out of Failed into Incomplete for neighborCache. Previously, both caches would
transition entries to Failed after address resolution fails. In this state, any
subsequent traffic requested within an unreachable time would immediately fail
with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3:

  If address resolution fails, the entry SHOULD be deleted, so that subsequent
  traffic to that neighbor invokes the next-hop determination procedure again.
  Invoking next-hop determination at this point ensures that alternate default
  routers are tried.

The API for getting a link address for a given address, whether through the link
address cache or the neighbor table, is updated to optionally take a callback
which will be called when address resolution completes. This allows `Route` to
handle completing link resolution internally, so callers of (*Route).Resolve
(e.g. endpoints) don’t have to keep track of when it completes and update the
Route accordingly.

This change also removes the wakers from LinkAddressCache, NeighborCache, and
Route in favor of the callbacks, and callers that previously used a waker can
now just pass a callback to (*Route).Resolve that will notify the waker on
resolution completion.

Fixes #4796

Startblock:
  has LGTM from sbalana
  and then
  add reviewer ghanan
PiperOrigin-RevId: 348597478
2020-12-22 01:37:05 -08:00
Peter Johnston 3ff1aef544 Make `stack.Route` thread safe
Currently we rely on the user to take the lock on the endpoint that owns the
route, in order to modify it safely. We can instead move
`Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and
thread-safe access to other fields of `Route`.

PiperOrigin-RevId: 345461586
2020-12-03 08:54:24 -08:00
Ting-Yu Wang e8df1ccef9 Fix some code not using NewPacketBuffer for creating a PacketBuffer.
PiperOrigin-RevId: 343299993
2020-11-19 08:56:58 -08:00
Ghanan Gowripalan cc5cfce4c6 Remove ARP address workaround
- Make AddressableEndpoint optional for NetworkEndpoint.
Not all NetworkEndpoints need to support addressing (e.g. ARP), so
AddressableEndpoint should only be implemented for protocols that
support addressing such as IPv4 and IPv6.

With this change, tcpip.ErrNotSupported will be returned by the stack
when attempting to modify addresses on a network endpoint that does
not support addressing.

Now that packets are fully handled at the network layer, and (with this
change) addresses are optional for network endpoints, we no longer need
the workaround for ARP where a fake ARP address was added to each NIC
that performs ARP so that packets would be delivered to the ARP layer.

PiperOrigin-RevId: 342722547
2020-11-16 14:36:10 -08:00
Ghanan Gowripalan 1a972411b3 Move packet handling to NetworkEndpoint
The NIC should not hold network-layer state or logic - network packet
handling/forwarding should be performed at the network layer instead
of the NIC.

Fixes #4688

PiperOrigin-RevId: 342166985
2020-11-12 17:33:21 -08:00
Ghanan Gowripalan 8c0701462a Use stack.Route exclusively for writing packets
* Remove stack.Route from incoming packet path.
There is no need to pass around a stack.Route during the incoming path
of a packet. Instead, pass around the packet's link/network layer
information in the packet buffer since all layers may need this
information.

* Support address bound and outgoing packet NIC in routes.
When forwarding is enabled, the source address of a packet may be bound
to a different interface than the outgoing interface. This change
updates stack.Route to hold both NICs so that one can be used to write
packets while the other is used to check if the route's bound address
is valid. Note, we need to hold the address's interface so we can check
if the address is a spoofed address.

* Introduce the concept of a local route.
Local routes are routes where the packet never needs to leave the stack;
the destination is stack-local. We can now route between interfaces
within a stack if the packet never needs to leave the stack, even when
forwarding is disabled.

* Always obtain a route from the stack before sending a packet.
If a packet needs to be sent in response to an incoming packet, a route
must be obtained from the stack to ensure the stack is configured to
send packets to the packet's source from the packet's destination.

* Enable spoofing if a stack may send packets from unowned addresses.
This change required changes to some netgophers since previously,
promiscuous mode was enough to let the netstack respond to all
incoming packets regardless of the packet's destination address. Now
that a stack.Route is not held for each incoming packet, finding a route
may fail with local addresses we don't own but accepted packets for
while in promiscuous mode. Since we also want to be able to send from
any address (in response the received promiscuous mode packets), we need
to enable spoofing.

* Skip transport layer checksum checks for locally generated packets.
If a packet is locally generated, the stack can safely assume that no
errors were introduced while being locally routed since the packet is
never sent out the wire.

Some bugs fixed:
- transport layer checksum was never calculated after NAT.
- handleLocal didn't handle routing across interfaces.
- stack didn't support forwarding across interfaces.
- always consult the routing table before creating an endpoint.

Updates #4688
Fixes #3906

PiperOrigin-RevId: 340943442
2020-11-05 15:52:16 -08:00
Tamir Duberstein b26797a8d5 Avoid time.Now in NUD
Use the stack clock instead. Change NeighborEntry.UpdatedAt to
UpdatedAtNanos.

PiperOrigin-RevId: 339520566
2020-10-28 13:01:56 -07:00
Tamir Duberstein 4d9066d1d7 Pass NeighborEntry in NUD callbacks
...instead of passing its fields piecemeal.

PiperOrigin-RevId: 339345899
2020-10-27 15:45:06 -07:00
Ghanan Gowripalan dad08229b8 Do not hold NIC local address in neighbor entries
Previously, the NIC local address used when completing link resolution
was held in the neighbor entry. A neighbor is not identified by any
NIC local address so remove it.

PiperOrigin-RevId: 338699695
2020-10-23 10:31:44 -07:00
Ghanan Gowripalan c1a6ba06ab Pass NetworkInterface to LinkAddressRequest
Previously a link endpoint was passed to
stack.LinkAddressResolver.LinkAddressRequest. With this change,
implementations that want a route for the link address request may
find one through the stack. Other implementations that want to send
a packet without a route may continue to do so using the network
interface directly.

Test: - arp_test.TestLinkAddressRequest
      - ipv6.TestLinkAddressRequest
PiperOrigin-RevId: 338577474
2020-10-22 17:02:29 -07:00
Ghanan Gowripalan 257703c050 Automated rollback of changelist 336304024
PiperOrigin-RevId: 336339194
2020-10-09 12:09:12 -07:00
Bhasker Hariharan 8566decab0 Automated rollback of changelist 336185457
PiperOrigin-RevId: 336304024
2020-10-09 09:11:18 -07:00
Ghanan Gowripalan 6768e6c59e Do not resolve routes immediately
When a response needs to be sent to an incoming packet, the stack should
consult its neighbour table to determine the remote address's link
address.

When an entry does not exist in the stack's neighbor table, the stack
should queue the packet while link resolution completes. See comments.

PiperOrigin-RevId: 336185457
2020-10-08 16:15:59 -07:00
Ghanan Gowripalan 5075d0342f Trim Network/Transport Endpoint/Protocol
* Remove Capabilities and NICID methods from NetworkEndpoint.

* Remove linkEP and stack parameters from NetworkProtocol.NewEndpoint.
The LinkEndpoint can be fetched from the NetworkInterface. The stack
is passed to the NetworkProtocol when it is created so the
NetworkEndpoint can get it from its protocol.

* Remove stack parameter from TransportProtocol.NewEndpoint.
Like the NetworkProtocol/Endpoint, the stack is passed to the
TransportProtocol when it is created.

PiperOrigin-RevId: 334332721
2020-09-29 02:05:50 -07:00
Ghanan Gowripalan 48915bdedb Move IP state from NIC to NetworkEndpoint/Protocol
* Add network address to network endpoints.
Hold network-specific state in the NetworkEndpoint instead of the stack.
This results in the stack no longer needing to "know" about the network
endpoints and special case certain work for various endpoints
(e.g. IPv6 DAD).

* Provide NetworkEndpoints with an NetworkInterface interface.
Instead of just passing the NIC ID of a NIC, pass an interface so the
network endpoint may query other information about the NIC such as
whether or not it is a loopback device.

* Move NDP code and state to the IPv6 package.
NDP is IPv6 specific so there is no need for it to live in the stack.

* Control forwarding through NetworkProtocols instead of Stack
Forwarding should be controlled on a per-network protocol basis so
forwarding configurations are now controlled through network protocols.

* Remove stack.referencedNetworkEndpoint.
Now that addresses are exposed via AddressEndpoint and only one
NetworkEndpoint is created per interface, there is no need for a
referenced NetworkEndpoint.

* Assume network teardown methods are infallible.

Fixes #3871, #3916

PiperOrigin-RevId: 334319433
2020-09-29 00:20:41 -07:00
Ghanan Gowripalan a5acc0616c Support creating protocol instances with Stack ref
Network or transport protocols may want to reach the stack. Support this
by letting the stack create the protocol instances so it can pass a
reference to itself at protocol creation time.

Note, protocols do not yet use the stack in this CL but later CLs will
make use of the stack from protocols.

PiperOrigin-RevId: 334260210
2020-09-28 16:24:04 -07:00
Ghanan Gowripalan a376a0baf3 Remove generic ICMP errors
Generic ICMP errors were required because the transport dispatcher was
given the responsibility of sending ICMP errors in response to transport
packet delivery failures. Instead, the transport dispatcher should let
network layer know it failed to deliver a packet (and why) and let the
network layer make the decision as to what error to send (if any).

Fixes #4068

PiperOrigin-RevId: 333962333
2020-09-26 19:24:41 -07:00
Julian Elischer 99decaadd6 Extract ICMP error sender from UDP
Store transport protocol number on packet buffers for use in ICMP error
generation.

Updates #2211.

PiperOrigin-RevId: 333252762
2020-09-23 02:28:43 -07:00
Ghanan Gowripalan 360006d894 Use common parsing utilities when sniffing
Extract parsing utilities so they can be used by the sniffer.

Fixes #3930

PiperOrigin-RevId: 332401880
2020-09-18 00:48:09 -07:00
Ghanan Gowripalan bdd5996a73 Improve type safety for network protocol options
The existing implementation for NetworkProtocol.{Set}Option take
arguments of an empty interface type which all types (implicitly)
implement; any type may be passed to the functions.

This change introduces marker interfaces for network protocol options
that may be set or queried which network protocol option types implement
to ensure that invalid types are caught at compile time. Different
interfaces are used to allow the compiler to enforce read-only or
set-only socket options.

PiperOrigin-RevId: 328980359
2020-08-28 11:50:17 -07:00
Sam Balana a174aa7597 Add option to replace linkAddrCache with neighborCache
This change adds an option to replace the current implementation of ARP through
linkAddrCache, with an implementation of NUD through neighborCache. Switching
to using NUD for both ARP and NDP is beneficial for the reasons described by
RFC 4861 Section 3.1:

  "[Using NUD] significantly improves the robustness of packet delivery in the
  presence of failing routers, partially failing or partitioned links, or nodes
  that change their link-layer addresses. For instance, mobile nodes can move
  off-link without losing any connectivity due to stale ARP caches."

  "Unlike ARP, Neighbor Unreachability Detection detects half-link failures and
  avoids sending traffic to neighbors with which two-way connectivity is
  absent."

Along with these changes exposes the API for querying and operating the
neighbor cache. Operations include:
  - Create a static entry
  - List all entries
  - Delete all entries
  - Remove an entry by address

This also exposes the API to change the NUD protocol constants on a per-NIC
basis to allow Neighbor Discovery to operate over links with widely varying
performance characteristics. See [RFC 4861 Section 10][1] for the list of
constants.

Finally, an API for subscribing to NUD state changes is exposed through
NUDDispatcher. See [RFC 4861 Appendix C][3] for the list of edges.

Tests:
 pkg/tcpip/network/arp:arp_test
 + TestDirectRequest

 pkg/tcpip/network/ipv6:ipv6_test
 + TestLinkResolution
 + TestNDPValidation
 + TestNeighorAdvertisementWithTargetLinkLayerOption
 + TestNeighorSolicitationResponse
 + TestNeighorSolicitationWithSourceLinkLayerOption
 + TestRouterAdvertValidation

 pkg/tcpip/stack:stack_test
 + TestCacheWaker
 + TestForwardingWithFakeResolver
 + TestForwardingWithFakeResolverManyPackets
 + TestForwardingWithFakeResolverManyResolutions
 + TestForwardingWithFakeResolverPartialTimeout
 + TestForwardingWithFakeResolverTwoPackets
 + TestIPv6SourceAddressSelectionScopeAndSameAddress

[1]: https://tools.ietf.org/html/rfc4861#section-10
[2]: https://tools.ietf.org/html/rfc4861#appendix-C

Fixes #1889
Fixes #1894
Fixes #1895
Fixes #1947
Fixes #1948
Fixes #1949
Fixes #1950

PiperOrigin-RevId: 328365034
2020-08-25 11:09:33 -07:00
Ghanan Gowripalan 1736b2208f Use a single NetworkEndpoint per NIC per protocol
The NetworkEndpoint does not need to be created for each address.
Most of the work the NetworkEndpoint does is address agnostic.

PiperOrigin-RevId: 326759605
2020-08-14 17:30:01 -07:00
Ting-Yu Wang 47515f4751 Migrate to PacketHeader API for PacketBuffer.
Formerly, when a packet is constructed or parsed, all headers are set by the
client code. This almost always involved prepending to pk.Header buffer or
trimming pk.Data portion. This is known to prone to bugs, due to the complexity
and number of the invariants assumed across netstack to maintain.

In the new PacketHeader API, client will call Push()/Consume() method to
construct/parse an outgoing/incoming packet. All invariants, such as slicing
and trimming, are maintained by the API itself.

NewPacketBuffer() is introduced to create new PacketBuffer. Zero value is no
longer valid.

PacketBuffer now assumes the packet is a concatenation of following portions:
* LinkHeader
* NetworkHeader
* TransportHeader
* Data
Any of them could be empty, or zero-length.

PiperOrigin-RevId: 326507688
2020-08-13 13:08:57 -07:00
Sam Balana 8dbf428a12 Add ability to send unicast ARP requests and Neighbor Solicitations
The previous implementation of LinkAddressRequest only supported sending
broadcast ARP requests and multicast Neighbor Solicitations. The ability to
send these packets as unicast is required for Neighbor Unreachability
Detection.

Tests:
 pkg/tcpip/network/arp:arp_test
 - TestLinkAddressRequest

 pkg/tcpip/network/ipv6:ipv6_test
 - TestLinkAddressRequest

Updates #1889
Updates #1894
Updates #1895
Updates #1947
Updates #1948
Updates #1949
Updates #1950

PiperOrigin-RevId: 323451569
2020-07-27 15:21:17 -07:00
Ghanan Gowripalan c66991ad7d Add ethernet broadcast address constant
PiperOrigin-RevId: 321620517
2020-07-16 12:26:41 -07:00
Kevin Krakauer 32b823fcdb netstack: parse incoming packet headers up-front
Netstack has traditionally parsed headers on-demand as a packet moves up the
stack. This is conceptually simple and convenient, but incompatible with
iptables, where headers can be inspected and mangled before even a routing
decision is made.

This changes header parsing to happen early in the incoming packet path, as soon
as the NIC gets the packet from a link endpoint. Even if an invalid packet is
found (e.g. a TCP header of insufficient length), the packet is passed up the
stack for proper stats bookkeeping.

PiperOrigin-RevId: 315179302
2020-06-07 13:38:43 -07:00
Ting-Yu Wang d3a8bffe04 Pass PacketBuffer as pointer.
Historically we've been passing PacketBuffer by shallow copying through out
the stack. Right now, this is only correct as the caller would not use
PacketBuffer after passing into the next layer in netstack.

With new buffer management effort in gVisor/netstack, PacketBuffer will
own a Buffer (to be added). Internally, both PacketBuffer and Buffer may
have pointers and shallow copying shouldn't be used.

Updates #2404.

PiperOrigin-RevId: 314610879
2020-06-03 15:00:42 -07:00
Kevin Krakauer 5e1e61fbcb Automated rollback of changelist 308674219
PiperOrigin-RevId: 309491861
2020-05-01 16:09:53 -07:00
Bhasker Hariharan ae15d90436 FIFO QDisc implementation
Updates #231

PiperOrigin-RevId: 309323808
2020-04-30 16:41:00 -07:00
gVisor bot 55f0c3316a Automated rollback of changelist 308163542
PiperOrigin-RevId: 308674219
2020-04-27 12:26:32 -07:00
Kevin Krakauer eccae0f77d Remove View.First() and View.RemoveFirst()
These methods let users eaily break the VectorisedView abstraction, and
allowed netstack to slip into pseudo-enforcement of the "all headers are
in the first View" invariant. Removing them and replacing with PullUp(n)
breaks this reliance and will make it easier to add iptables support and
rework network buffer management.

The new View.PullUp(n) method is low cost in the common case, when when
all the headers fit in the first View.

PiperOrigin-RevId: 308163542
2020-04-23 17:28:49 -07:00
gVisor bot 120d3b50f4 Automated rollback of changelist 307477185
PiperOrigin-RevId: 307598974
2020-04-21 07:16:30 -07:00
Kevin Krakauer a551add5d8 Remove View.First() and View.RemoveFirst()
These methods let users eaily break the VectorisedView abstraction, and
allowed netstack to slip into pseudo-enforcement of the "all headers are
in the first View" invariant. Removing them and replacing with PullUp(n)
breaks this reliance and will make it easier to add iptables support and
rework network buffer management.

The new View.PullUp(n) method is low cost in the common case, when when
all the headers fit in the first View.
2020-04-17 13:25:57 -07:00
Adin Scannell 867eeb18d8 Remove lostcancel warnings.
Updates #2243
2020-04-08 10:14:34 -07:00
Bhasker Hariharan fc99a7ebf0 Refactor software GSO code.
Software GSO implementation currently has a complicated code path with
implicit assumptions that all packets to WritePackets carry same Data
and it does this to avoid allocations on the path etc. But this makes it
hard to reuse the WritePackets API.

This change breaks all such assumptions by introducing a new Vectorised
View API ReadToVV which can be used to cleanly split a VV into multiple
independent VVs. Further this change also makes packet buffers linkable
to form an intrusive list. This allows us to get rid of the array of
packet buffers that are passed in the WritePackets API call and replace
it with a list of packet buffers.

While this code does introduce some more allocations in the benchmarks
it doesn't cause any degradation.

Updates #231

PiperOrigin-RevId: 304731742
2020-04-03 18:35:55 -07:00