Commit Graph

30 Commits

Author SHA1 Message Date
Tamir Duberstein 8d1afb4185 Change tcpip.Error to an interface
This makes it possible to add data to types that implement tcpip.Error.
ErrBadLinkEndpoint is removed as it is unused.

PiperOrigin-RevId: 354437314
2021-01-28 17:59:58 -08:00
Peter Johnston fee2cd640f Invoke address resolution upon subsequent traffic to Failed neighbor
Removes the period of time in which subseqeuent traffic to a Failed neighbor
immediately fails with ErrNoLinkAddress. A Failed neighbor is one in which
address resolution fails; or in other words, the neighbor's IP address cannot
be translated to a MAC address.

This means removing the Failed state for linkAddrCache and allowing transitiong
out of Failed into Incomplete for neighborCache. Previously, both caches would
transition entries to Failed after address resolution fails. In this state, any
subsequent traffic requested within an unreachable time would immediately fail
with ErrNoLinkAddress. This does not follow RFC 4861 section 7.3.3:

  If address resolution fails, the entry SHOULD be deleted, so that subsequent
  traffic to that neighbor invokes the next-hop determination procedure again.
  Invoking next-hop determination at this point ensures that alternate default
  routers are tried.

The API for getting a link address for a given address, whether through the link
address cache or the neighbor table, is updated to optionally take a callback
which will be called when address resolution completes. This allows `Route` to
handle completing link resolution internally, so callers of (*Route).Resolve
(e.g. endpoints) don’t have to keep track of when it completes and update the
Route accordingly.

This change also removes the wakers from LinkAddressCache, NeighborCache, and
Route in favor of the callbacks, and callers that previously used a waker can
now just pass a callback to (*Route).Resolve that will notify the waker on
resolution completion.

Fixes #4796

Startblock:
  has LGTM from sbalana
  and then
  add reviewer ghanan
PiperOrigin-RevId: 348597478
2020-12-22 01:37:05 -08:00
Peter Johnston 3ff1aef544 Make `stack.Route` thread safe
Currently we rely on the user to take the lock on the endpoint that owns the
route, in order to modify it safely. We can instead move
`Route.RemoteLinkAddress` under `Route`'s mutex, and allow non-locking and
thread-safe access to other fields of `Route`.

PiperOrigin-RevId: 345461586
2020-12-03 08:54:24 -08:00
Ghanan Gowripalan cc5cfce4c6 Remove ARP address workaround
- Make AddressableEndpoint optional for NetworkEndpoint.
Not all NetworkEndpoints need to support addressing (e.g. ARP), so
AddressableEndpoint should only be implemented for protocols that
support addressing such as IPv4 and IPv6.

With this change, tcpip.ErrNotSupported will be returned by the stack
when attempting to modify addresses on a network endpoint that does
not support addressing.

Now that packets are fully handled at the network layer, and (with this
change) addresses are optional for network endpoints, we no longer need
the workaround for ARP where a fake ARP address was added to each NIC
that performs ARP so that packets would be delivered to the ARP layer.

PiperOrigin-RevId: 342722547
2020-11-16 14:36:10 -08:00
Dean Deng 0fb5353e45 Initialize references with a value of 1.
This lets us avoid treating a value of 0 as one reference. All references
using the refsvfs2 template must call InitRefs() before the reference is
incremented/decremented, or else a panic will occur. Therefore, it should be
pretty easy to identify missing InitRef calls during testing.

Updates #1486.

PiperOrigin-RevId: 341411151
2020-11-09 08:33:17 -08:00
Dean Deng 9ca66ec598 Rewrite reference leak checker without finalizers.
Our current reference leak checker uses finalizers to verify whether an object
has reached zero references before it is garbage collected. There are multiple
problems with this mechanism, so a rewrite is in order.

With finalizers, there is no way to guarantee that a finalizer will run before
the program exits. When an unreachable object with a finalizer is garbage
collected, its finalizer will be added to a queue and run asynchronously. The
best we can do is run garbage collection upon sandbox exit to make sure that
all finalizers are enqueued.

Furthermore, if there is a chain of finalized objects, e.g. A points to B
points to C, garbage collection needs to run multiple times before all of the
finalizers are enqueued. The first GC run will register the finalizer for A but
not free it. It takes another GC run to free A, at which point B's finalizer
can be registered. As a result, we need to run GC as many times as the length
of the longest such chain to have a somewhat reliable leak checker.

Finally, a cyclical chain of structs pointing to one another will never be
garbage collected if a finalizer is set. This is a well-known issue with Go
finalizers (https://github.com/golang/go/issues/7358). Using leak checking on
filesystem objects that produce cycles will not work and even result in memory
leaks.

The new leak checker stores reference counted objects in a global map when
leak check is enabled and removes them once they are destroyed. At sandbox
exit, any remaining objects in the map are considered as leaked. This provides
a deterministic way of detecting leaks without relying on the complexities of
finalizers and garbage collection.

This approach has several benefits over the former, including:
- Always detects leaks of objects that should be destroyed very close to
  sandbox exit. The old checker very rarely detected these leaks, because it
  relied on garbage collection to be run in a short window of time.
- Panics if we forgot to enable leak check on a ref-counted object (we will try
  to remove it from the map when it is destroyed, but it will never have been
  added).
- Can store extra logging information in the map values without adding to the
  size of the ref count struct itself. With the size of just an int64, the ref
  count object remains compact, meaning frequent operations like IncRef/DecRef
  are more cache-efficient.
- Can aggregate leak results in a single report after the sandbox exits.
  Instead of having warnings littered in the log, which were
  non-deterministically triggered by garbage collection, we can print all
  warning messages at once. Note that this could also be a limitation--the
  sandbox must exit properly for leaks to be detected.

Some basic benchmarking indicates that this change does not significantly
affect performance when leak checking is enabled, which is understandable
since registering/unregistering is only done once for each filesystem object.

Updates #1486.

PiperOrigin-RevId: 338685972
2020-10-23 09:17:02 -07:00
Ghanan Gowripalan 257703c050 Automated rollback of changelist 336304024
PiperOrigin-RevId: 336339194
2020-10-09 12:09:12 -07:00
Bhasker Hariharan 8566decab0 Automated rollback of changelist 336185457
PiperOrigin-RevId: 336304024
2020-10-09 09:11:18 -07:00
Ghanan Gowripalan 6768e6c59e Do not resolve routes immediately
When a response needs to be sent to an incoming packet, the stack should
consult its neighbour table to determine the remote address's link
address.

When an entry does not exist in the stack's neighbor table, the stack
should queue the packet while link resolution completes. See comments.

PiperOrigin-RevId: 336185457
2020-10-08 16:15:59 -07:00
Dean Deng df3c105f49 Use new reference count utility throughout gvisor.
This uses the refs_vfs2 template in vfs2 as well as objects common to vfs1 and
vfs2. Note that vfs1-only refcounts are not replaced, since vfs1 will be deleted
soon anyway.

The following structs now use the new tool, with leak check enabled:
devpts:rootInode
fuse:inode
kernfs:Dentry
kernfs:dir
kernfs:readonlyDir
kernfs:StaticDirectory
proc:fdDirInode
proc:fdInfoDirInode
proc:subtasksInode
proc:taskInode
proc:tasksInode
vfs:FileDescription
vfs:MountNamespace
vfs:Filesystem
sys:dir
kernel:FSContext
kernel:ProcessGroup
kernel:Session
shm:Shm
mm:aioMappable
mm:SpecialMappable
transport:queue

And the following use the template, but because they currently are not leak
checked, a TODO is left instead of enabling leak check in this patch:
kernel:FDTable
tun:tunEndpoint

Updates #1486.

PiperOrigin-RevId: 328460377
2020-08-25 21:04:04 -07:00
Ting-Yu Wang 47515f4751 Migrate to PacketHeader API for PacketBuffer.
Formerly, when a packet is constructed or parsed, all headers are set by the
client code. This almost always involved prepending to pk.Header buffer or
trimming pk.Data portion. This is known to prone to bugs, due to the complexity
and number of the invariants assumed across netstack to maintain.

In the new PacketHeader API, client will call Push()/Consume() method to
construct/parse an outgoing/incoming packet. All invariants, such as slicing
and trimming, are maintained by the API itself.

NewPacketBuffer() is introduced to create new PacketBuffer. Zero value is no
longer valid.

PacketBuffer now assumes the packet is a concatenation of following portions:
* LinkHeader
* NetworkHeader
* TransportHeader
* Data
Any of them could be empty, or zero-length.

PiperOrigin-RevId: 326507688
2020-08-13 13:08:57 -07:00
Nayana Bidari b2ae7ea1bb Plumbing context.Context to DecRef() and Release().
context is passed to DecRef() and Release() which is
needed for SO_LINGER implementation.

PiperOrigin-RevId: 324672584
2020-08-03 13:36:05 -07:00
Bhasker Hariharan 71bf90c55b Support for receiving outbound packets in AF_PACKET.
Updates #173

PiperOrigin-RevId: 322665518
2020-07-22 15:33:33 -07:00
Bhasker Hariharan fef90c61c6 Fix minor bugs in a couple of interface IOCTLs.
gVisor incorrectly returns the wrong ARP type for SIOGIFHWADDR. This breaks
tcpdump as it tries to interpret the packets incorrectly.

Similarly, SIOCETHTOOL is used by tcpdump to query interface properties which
fails with an EINVAL since we don't implement it. For now change it to return
EOPNOTSUPP to indicate that we don't support the query rather than return
EINVAL.

NOTE: ARPHRD types for link endpoints are distinct from NIC capabilities
and NIC flags. In Linux all 3 exist eg. ARPHRD types are stored in dev->type
field while NIC capabilities are more like the device features which can be
queried using SIOCETHTOOL but not modified and NIC Flags are fields that can
be modified from user space. eg. NIC status (UP/DOWN/MULTICAST/BROADCAST) etc.

Updates #2746

PiperOrigin-RevId: 321436525
2020-07-15 14:15:44 -07:00
Ting-Yu Wang d3a8bffe04 Pass PacketBuffer as pointer.
Historically we've been passing PacketBuffer by shallow copying through out
the stack. Right now, this is only correct as the caller would not use
PacketBuffer after passing into the next layer in netstack.

With new buffer management effort in gVisor/netstack, PacketBuffer will
own a Buffer (to be added). Internally, both PacketBuffer and Buffer may
have pointers and shallow copying shouldn't be used.

Updates #2404.

PiperOrigin-RevId: 314610879
2020-06-03 15:00:42 -07:00
Bhasker Hariharan 7e4073af12 Move tcpip.PacketBuffer and IPTables to stack package.
This is a precursor to be being able to build an intrusive list
of PacketBuffers for use in queuing disciplines being implemented.

Updates #2214

PiperOrigin-RevId: 302677662
2020-03-24 09:06:26 -07:00
Ting-Yu Wang 69da42885a Enable ARP resolution in TAP devices.
PiperOrigin-RevId: 301208471
2020-03-16 12:03:27 -07:00
Ting-Yu Wang b8f56c79be Implement tap/tun device in vfs.
PiperOrigin-RevId: 296526279
2020-02-21 15:42:56 -08:00
Adin Scannell d29e59af9f Standardize on tools directory.
PiperOrigin-RevId: 291745021
2020-01-27 12:21:00 -08:00
Adin Scannell c3b93afeaf Cleanup visibility.
PiperOrigin-RevId: 282194656
2019-11-23 23:54:41 -08:00
Adin Scannell add40fd6ad Update canonical repository.
This can be merged after:
https://github.com/google/gvisor-website/pull/77
  or
https://github.com/google/gvisor-website/pull/78

PiperOrigin-RevId: 253132620
2019-06-13 16:50:15 -07:00
Michael Pratt 4d52a55201 Change copyright notice to "The gVisor Authors"
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.

1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.

Fixes #209

PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29 14:26:23 -07:00
Michael Pratt 2a0c69b19f Remove license comments
Nothing reads them and they can simply get stale.

Generated with:
$ sed -i "s/licenses(\(.*\)).*/licenses(\1)/" **/BUILD

PiperOrigin-RevId: 231818945
Change-Id: Ibc3f9838546b7e94f13f217060d31f4ada9d4bf0
2019-01-31 11:12:53 -08:00
Adin Scannell 75cd70ecc9 Track paths and provide a rename hook.
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.

PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
2018-10-23 00:20:15 -07:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Googler bbee911179 Allow building on !linux
PiperOrigin-RevId: 209819644
Change-Id: I329d054bf8f4999e7db0dcd95b13f7793c65d4e2
2018-08-22 13:31:11 -07:00
Zhaozhong Ni be7fcbc558 stateify: support explicit annotation mode; convert refs and stack packages.
We have been unnecessarily creating too many savable types implicitly.

PiperOrigin-RevId: 206334201
Change-Id: Idc5a3a14bfb7ee125c4f2bb2b1c53164e46f29a8
2018-07-27 10:17:21 -07:00
Nicolas Lacasse bf0fa09537 Switch netstack licenses to Apache 2.0.
Fixes #27

PiperOrigin-RevId: 203825288
Change-Id: Ie9f3a2b2c1e296b026b024f75c07da1a7e118633
2018-07-09 14:04:40 -07:00
Nicolas Lacasse 7fd466a09e Add mising license headers.
PiperOrigin-RevId: 194676372
Change-Id: I2e7fce03e70a45f2a872160c33e05f822ae2d866
2018-04-28 18:10:24 -07:00
Googler d02b74a5dc Check in gVisor.
PiperOrigin-RevId: 194583126
Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463
2018-04-28 01:44:26 -04:00