4090 Commits

Author SHA1 Message Date
Ian Lewis 26439f9a43 Add syntax highlighting to website
Adds a syntax highlighting theme css so that code snippets are highlighted
properly.

PiperOrigin-RevId: 330733737
2020-09-09 09:08:37 -07:00
Ian Lewis 00479af515 Add a Docker Compose tutorial
Adds a Docker Compose tutorial to the website that shows how to start a
WordPress site and includes information about how to get DNS working.

Fixes #115

PiperOrigin-RevId: 330652842
2020-09-08 21:59:24 -07:00
Jamie Liu 8d3551da6a Implement synthetic mountpoints for kernfs.
PiperOrigin-RevId: 330629897
2020-09-08 18:33:03 -07:00
Ayush Ranjan bca4d99a4b [vfs] overlayfs: Fix socket tests.
- The BindSocketThenOpen test was expecting the incorrect error when opening
  a socket. This is now fixed.
- VirtualFilesystem.BindEndpointAt should not require pop.Path.Begin.Ok()
  because the filesystem implementations do not need to walk to the parent
  dentry. This check also exists for MknodAt, MkdirAt, RmdirAt, SymlinkAt and
  UnlinkAt, but those filesystem implementations do need to walk to the parent
  dentry, so the check is valid there. Added some syscall tests to test this.

PiperOrigin-RevId: 330625220
2020-09-08 17:56:22 -07:00
gVisor bot a17d083f3b Add check for both child and childMerkle ENOENT
The check in verity walk returns error for non ENOENT cases, and all
ENOENT results should be checked. This case was missing.

PiperOrigin-RevId: 330604771
2020-09-08 16:01:10 -07:00
gVisor bot 360f1535c7 Implement ioctl with enable verity
ioctl with FS_IOC_ENABLE_VERITY is added to the verity file system to enable
verity on a file. For a file, a Merkle tree is built with its data.
For a directory, a Merkle tree is built with the root hashes of its
children.

PiperOrigin-RevId: 330604368
2020-09-08 15:54:21 -07:00
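
The Merkle tree construction described in the commit above can be sketched roughly as follows. This is an illustration only, not gVisor's actual tree layout: the names `merkleRoot`, `fileRoot`, and `dirRoot`, the use of SHA-256, and the rule of carrying an unpaired node up unchanged are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hash is a SHA-256 digest. The choice of SHA-256 is an assumption.
type hash = [sha256.Size]byte

// merkleRoot hashes every leaf, then repeatedly hashes adjacent pairs
// until a single root remains. An unpaired node is carried up unchanged.
func merkleRoot(leaves [][]byte) hash {
	if len(leaves) == 0 {
		return sha256.Sum256(nil)
	}
	level := make([]hash, len(leaves))
	for i, leaf := range leaves {
		level[i] = sha256.Sum256(leaf)
	}
	for len(level) > 1 {
		next := make([]hash, 0, (len(level)+1)/2)
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i])
				continue
			}
			pair := append(append([]byte{}, level[i][:]...), level[i+1][:]...)
			next = append(next, sha256.Sum256(pair))
		}
		level = next
	}
	return level[0]
}

// fileRoot builds the tree over fixed-size blocks of a file's data.
// blockSize must be positive.
func fileRoot(data []byte, blockSize int) hash {
	var blocks [][]byte
	for off := 0; off < len(data); off += blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		blocks = append(blocks, data[off:end])
	}
	return merkleRoot(blocks)
}

// dirRoot builds the tree over the root hashes of a directory's children.
func dirRoot(children []hash) hash {
	leaves := make([][]byte, len(children))
	for i := range children {
		leaves[i] = children[i][:]
	}
	return merkleRoot(leaves)
}

func main() {
	a := fileRoot([]byte("hello, verity"), 4)
	b := fileRoot([]byte("another file"), 4)
	fmt.Printf("dir root: %x\n", dirRoot([]hash{a, b}))
}
```

Because the directory root depends on every child root, changing one byte in any file changes the root of each ancestor directory.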
Ayush Ranjan 682c0edcdc [vfs] overlayfs: decref VD when not using it.
overlay/filesystem.go:lookupLocked() did not DecRef the VD on some error paths
when it would not end up saving or using the VD.

PiperOrigin-RevId: 330589742
2020-09-08 14:42:39 -07:00
Fabricio Voznika c8f1ce288d Honor readonly flag for root mount
Updates #1487

PiperOrigin-RevId: 330580699
2020-09-08 14:00:43 -07:00
Sam Balana 284e6811e4 Increase resolution timeout for TestCacheResolution
Fixes pkg/tcpip/stack:stack_test flake experienced while running
TestCacheResolution with gotsan. This occurs when the test-runner takes longer
than the resolution timeout to call linkAddrCache.get.

In this test we don't care about the resolution timeout, so set it to the
maximum and rely on test-runner timeouts to avoid deadlocks.

PiperOrigin-RevId: 330566250
2020-09-08 12:52:10 -07:00
gVisor bot a3b87a0cef Merge pull request #3856 from btw616:fix/issue-3855
PiperOrigin-RevId: 330565414
2020-09-08 12:46:25 -07:00
Bhasker Hariharan 38cdb0579b Fix data race in tcp.GetSockOpt.
e.ID can't be read without holding e.mu. GetSockOpt was reading e.ID
when looking up OriginalDst without holding e.mu.

PiperOrigin-RevId: 330562293
2020-09-08 12:31:19 -07:00
Ghanan Gowripalan d35f07b36a Improve type safety for transport protocol options
The existing implementations of TransportProtocol.{Set}Option take
arguments of an empty interface type, which all types (implicitly)
implement; any type may be passed to the functions.

This change introduces marker interfaces for transport protocol options
that may be set or queried which transport protocol option types
implement to ensure that invalid types are caught at compile time.
Different interfaces are used to allow the compiler to enforce read-only
or set-only socket options.

RELNOTES: n/a
PiperOrigin-RevId: 330559811
2020-09-08 12:17:39 -07:00
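
The marker-interface idea in the commit above can be illustrated with a small sketch. All names here (`SettableOption`, `GettableOption`, `BufferSizeOption`, `StatsOption`, `protocol`) are hypothetical and are not the identifiers used in gVisor; the point is only that an unexported marker method lets the compiler reject option types that were never meant to be set or queried.

```go
package main

import "fmt"

// SettableOption marks option types that may be set (hypothetical name).
type SettableOption interface{ isSettableOption() }

// GettableOption marks option types that may be queried (hypothetical name).
type GettableOption interface{ isGettableOption() }

// BufferSizeOption may be both set and queried.
type BufferSizeOption int

func (*BufferSizeOption) isSettableOption() {}
func (*BufferSizeOption) isGettableOption() {}

// StatsOption is read-only: it implements only the gettable marker.
type StatsOption struct{ Packets int }

func (*StatsOption) isGettableOption() {}

type protocol struct {
	bufSize BufferSizeOption
	packets int
}

// SetOption accepts only types that carry the settable marker.
func (p *protocol) SetOption(opt SettableOption) error {
	switch v := opt.(type) {
	case *BufferSizeOption:
		p.bufSize = *v
		return nil
	default:
		return fmt.Errorf("unknown settable option %T", opt)
	}
}

// Option accepts only types that carry the gettable marker.
func (p *protocol) Option(opt GettableOption) error {
	switch v := opt.(type) {
	case *BufferSizeOption:
		*v = p.bufSize
		return nil
	case *StatsOption:
		v.Packets = p.packets
		return nil
	default:
		return fmt.Errorf("unknown gettable option %T", opt)
	}
}

func main() {
	p := &protocol{packets: 7}
	size := BufferSizeOption(4096)
	if err := p.SetOption(&size); err != nil {
		fmt.Println("set failed:", err)
	}
	var stats StatsOption
	if err := p.Option(&stats); err != nil {
		fmt.Println("get failed:", err)
	}
	fmt.Println(p.bufSize, stats.Packets)
	// p.SetOption(&stats) would not compile: *StatsOption does not
	// implement SettableOption.
}
```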
Ayush Ranjan d84ec6c42b [vfs] Capitalize x in the {Get/Set/Remove/List}xattr functions.
PiperOrigin-RevId: 330554450
2020-09-08 11:51:39 -07:00
Tiwei Bie ceab2e21de Fix the use after nil check on args.MountNamespaceVFS2
args.MountNamespaceVFS2 is used again after the nil check; mntnsVFS2,
which holds the expected reference, should be used instead. This patch
fixes this issue.

Fixes: #3855

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-09-08 15:50:29 +08:00
Ayush Ranjan fada564c83 Fix make_apt script.
This change makes the following fixes:
- When creating a test repo.key, create a secret keyring as other workflows
  also use secret keyrings only.
- We should not be using both --keyring and --secret-keyring options. Just use
  --secret-keyring.
- Pass homedir to all gpg commands. dpkg-sig takes an arg -g which stands for
  gpgopts. So we need to pass the homedir there too.

PiperOrigin-RevId: 330443280
2020-09-07 21:18:22 -07:00
Fabricio Voznika 2202812e07 Simplify FD handling for container start/exec
VFS1 and VFS2 host FDs have different dupping behavior,
making it error prone to write code for both. Change the contract
so that FDs are released as they are used, so the caller
can simply defer a block that closes all remaining files.
This also addresses handling of partial failures.

With this fix, more VFS2 tests can be enabled.

Updates #1487

PiperOrigin-RevId: 330112266
2020-09-04 11:42:02 -07:00
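
The ownership contract described in the commit above (FDs are released as they are used, and the caller defers one block that closes whatever remains) might look roughly like this. The `file` and `fdSet` types and the `start` function are hypothetical stand-ins, not runsc code.

```go
package main

import (
	"errors"
	"fmt"
)

// file is a stand-in for a host FD (hypothetical).
type file struct {
	name   string
	closed bool
}

func (f *file) Close() { f.closed = true }

// fdSet owns its files until they are taken.
type fdSet struct{ files []*file }

// take transfers ownership of entry i to the caller and clears the slot,
// so closeRemaining will no longer touch it.
func (s *fdSet) take(i int) *file {
	f := s.files[i]
	s.files[i] = nil
	return f
}

// closeRemaining closes every file that was never taken.
func (s *fdSet) closeRemaining() {
	for i, f := range s.files {
		if f != nil {
			f.Close()
			s.files[i] = nil
		}
	}
}

// start consumes files one at a time. If it fails at index failAt, the
// deferred closeRemaining cleans up only the files not yet consumed.
func start(s *fdSet, failAt int) ([]*file, error) {
	defer s.closeRemaining()
	var used []*file
	for i := range s.files {
		if i == failAt {
			return used, errors.New("simulated partial failure")
		}
		used = append(used, s.take(i))
	}
	return used, nil
}

func main() {
	all := []*file{{name: "stdin"}, {name: "stdout"}, {name: "stderr"}}
	used, err := start(&fdSet{files: append([]*file{}, all...)}, 1)
	fmt.Println(len(used), err)
	for _, f := range all {
		fmt.Println(f.name, "closed:", f.closed)
	}
}
```

With this shape, the partial-failure path and the success path share the same cleanup code.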
Dean Deng c564293b65 Adjust input file offset when sendfile only completes a partial write.
Fixes #3779.

PiperOrigin-RevId: 330057268
2020-09-03 23:30:47 -07:00
Ayush Ranjan b6d6a120d0 Fix the release workflow.
PiperOrigin-RevId: 330049242
2020-09-03 21:45:10 -07:00
Bhasker Hariharan 805861ca37 Use fine-grained mutex for stack.cleanupEndpoints.
stack.cleanupEndpoints is protected by stack.mu, but that can cause
contention as the stack mutex is already acquired in a lot of hot paths during
new endpoint creation, cleanup, etc. Moving this to a fine-grained mutex should
reduce contention on stack.mu.

PiperOrigin-RevId: 330026151
2020-09-03 17:36:41 -07:00
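
As a rough illustration of the fine-grained mutex in the commit above, the cleanup map can be given its own lock so that cleanup bookkeeping never waits on the main stack mutex. The type and field names below (`Stack`, `Endpoint`, `cleanupMu`, `StartCleanup`, `CompleteCleanup`) are assumptions for the sketch, not the actual gVisor definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// Endpoint is a placeholder for a transport endpoint.
type Endpoint struct{ id int }

// Stack keeps the cleanup map under a dedicated mutex instead of mu.
type Stack struct {
	mu sync.RWMutex // protects the rest of the stack state (not shown)

	cleanupMu        sync.Mutex // protects cleanupEndpoints only
	cleanupEndpoints map[*Endpoint]struct{}
}

func newStack() *Stack {
	return &Stack{cleanupEndpoints: make(map[*Endpoint]struct{})}
}

// StartCleanup records ep as being cleaned up without touching mu.
func (s *Stack) StartCleanup(ep *Endpoint) {
	s.cleanupMu.Lock()
	s.cleanupEndpoints[ep] = struct{}{}
	s.cleanupMu.Unlock()
}

// CompleteCleanup forgets ep without touching mu.
func (s *Stack) CompleteCleanup(ep *Endpoint) {
	s.cleanupMu.Lock()
	delete(s.cleanupEndpoints, ep)
	s.cleanupMu.Unlock()
}

func (s *Stack) pending() int {
	s.cleanupMu.Lock()
	defer s.cleanupMu.Unlock()
	return len(s.cleanupEndpoints)
}

// demo holds the main stack lock while many goroutines run cleanup.
// They all finish because cleanup no longer needs s.mu.
func demo() int {
	s := newStack()
	s.mu.Lock()
	defer s.mu.Unlock()

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			ep := &Endpoint{id: id}
			s.StartCleanup(ep)
			s.CompleteCleanup(ep)
		}(i)
	}
	wg.Wait()
	return s.pending()
}

func main() {
	fmt.Println("pending after cleanup:", demo())
}
```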
Jamie Liu 76e51c8b9a Use atomic.Value for Stack.tcpProbeFunc.
b/166980357#comment56 shows:

- 837 goroutines blocked in:
gvisor/pkg/sync/sync.(*RWMutex).Lock
gvisor/pkg/tcpip/stack/stack.(*Stack).StartTransportEndpointCleanup
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).cleanupLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).completeWorkerLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop.func1
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop

- 695 goroutines blocked in:
gvisor/pkg/sync/sync.(*RWMutex).Lock
gvisor/pkg/tcpip/stack/stack.(*Stack).CompleteTransportEndpointCleanup
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).cleanupLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).completeWorkerLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop.func1
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop

- 3882 goroutines blocked in:
gvisor/pkg/sync/sync.(*RWMutex).Lock
gvisor/pkg/tcpip/stack/stack.(*Stack).GetTCPProbe
gvisor/pkg/tcpip/transport/tcp/tcp.newEndpoint
gvisor/pkg/tcpip/transport/tcp/tcp.(*protocol).NewEndpoint
gvisor/pkg/tcpip/stack/stack.(*Stack).NewEndpoint

All of these are contending on Stack.mu. Stack.StartTransportEndpointCleanup()
and Stack.CompleteTransportEndpointCleanup() insert/delete TransportEndpoints
in a map (Stack.cleanupEndpoints), and the former also does endpoint
unregistration while holding Stack.mu, so it's not immediately clear how
feasible it is to replace the map with a mutex-less implementation or how much
doing so would help. However, Stack.GetTCPProbe() just reads a function object
(Stack.tcpProbeFunc) that is almost always nil (as far as I can tell,
Stack.AddTCPProbe() is only called in tests), and it's called for every new TCP
endpoint. So converting it to an atomic.Value should significantly reduce
contention on Stack.mu, improving TCP endpoint creation latency and allowing
TCP endpoint cleanup to proceed.

PiperOrigin-RevId: 330004140
2020-09-03 15:24:48 -07:00
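
The conversion described in the commit above can be sketched as below: a function value that is read on every endpoint creation but almost never written is kept in an `atomic.Value`, so readers take no lock. `ProbeFunc`, `AddProbe`, `RemoveProbe`, and `GetProbe` are simplified, hypothetical names rather than the real Stack API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ProbeFunc is a simplified probe callback type (hypothetical signature).
type ProbeFunc func(state string)

// Stack stores the probe in an atomic.Value instead of guarding it
// with a mutex.
type Stack struct {
	probe atomic.Value // always holds a ProbeFunc once set
}

// AddProbe installs f. Every Store uses the same concrete type, as
// atomic.Value requires.
func (s *Stack) AddProbe(f ProbeFunc) { s.probe.Store(f) }

// RemoveProbe stores a typed nil; storing an untyped nil would panic.
func (s *Stack) RemoveProbe() { s.probe.Store(ProbeFunc(nil)) }

// GetProbe returns the installed probe, or nil if none was ever set.
// It performs a single atomic load and takes no lock.
func (s *Stack) GetProbe() ProbeFunc {
	f, _ := s.probe.Load().(ProbeFunc)
	return f
}

func main() {
	var s Stack
	fmt.Println("probe set:", s.GetProbe() != nil)

	s.AddProbe(func(state string) { fmt.Println("probe saw:", state) })
	if p := s.GetProbe(); p != nil {
		p("ESTABLISHED")
	}

	s.RemoveProbe()
	fmt.Println("probe set:", s.GetProbe() != nil)
}
```

The comma-ok type assertion handles the case where `Load` returns a nil interface because nothing has been stored yet.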
Nicolas Lacasse 30c20df76f Run gentdents_benchmark with fewer files.
This test regularly times out when "shared" filesystem is enabled.

PiperOrigin-RevId: 329950622
2020-09-03 10:54:01 -07:00
Tamir Duberstein 319ce67369 Avoid grpc_impl
PiperOrigin-RevId: 329902747
2020-09-03 05:52:17 -07:00
Ian Lewis a8c174c047 Update version in cni tutorial
Update the cniVersion used in the CNI tutorial so that it works with
containerd 1.2. Containerd 1.2 includes a version of the cri plugin
(release/1.2) that, in turn, includes a version of the
cni library (0.6.0) that only supports up to 0.3.1.
https://github.com/containernetworking/cni/blob/v0.6.0/pkg/version/version.go#L38

PiperOrigin-RevId: 329837188
2020-09-02 19:38:34 -07:00
Zeling Feng 86c1ae095a Add support to run packetimpact tests against Fuchsia
blaze test <test_name>_fuchsia_test will run the corresponding packetimpact
test against fuchsia.

PiperOrigin-RevId: 329835290
2020-09-02 19:19:40 -07:00
Bhasker Hariharan b69352245a Fix Accept to not return error for sockets in accept queue.
Accept on gVisor will return an error if a socket in the accept queue was closed
before Accept() was called. Linux will return the new fd even if the returned
socket is already closed by the peer, say, due to a RST being sent by the peer.

This seems to be intentional in Linux; more details are in the GitHub issue.

Fixes #3780

PiperOrigin-RevId: 329828404
2020-09-02 18:21:47 -07:00
Ayush Ranjan 1fec861939 [vfs] Implement xattr for overlayfs.
PiperOrigin-RevId: 329825497
2020-09-02 17:58:05 -07:00
Ayush Ranjan 0ca0d8e011 [vfs] Fix error handling in overlayfs OpenAt.
Updates #1199

PiperOrigin-RevId: 329802274
2020-09-02 15:43:13 -07:00
Jamie Liu 5c66011200 Update Go version constraint on sync/spin_unsafe.go.
PiperOrigin-RevId: 329801584
2020-09-02 15:37:26 -07:00
Jamie Liu 9bd0164237 Improve sync.SeqCount performance.
- Make sync.SeqCountEpoch not a struct. This allows sync.SeqCount.BeginRead()
  to be inlined.

- Mark sync.SeqAtomicLoad<T> nosplit to mitigate the Go compiler's refusal to
  inline it. (Best I could get was "cost 92 exceeds budget 80".)

- Use runtime-guided spinning in SeqCount.BeginRead().

Benchmarks:
name                               old time/op  new time/op   delta
pkg:pkg/sync/sync goos:linux goarch:amd64
SeqCountWriteUncontended-12        8.24ns ± 0%  11.40ns ± 0%  +38.35%  (p=0.000 n=10+10)
SeqCountReadUncontended-12         0.33ns ± 0%   0.14ns ± 3%  -57.77%  (p=0.000 n=7+8)
pkg:pkg/sync/seqatomictest/seqatomic goos:linux goarch:amd64
SeqAtomicLoadIntUncontended-12     0.64ns ± 1%   0.41ns ± 1%  -36.40%  (p=0.000 n=10+8)
SeqAtomicTryLoadIntUncontended-12  0.18ns ± 4%   0.18ns ± 1%     ~     (p=0.206 n=10+8)
AtomicValueLoadIntUncontended-12   0.27ns ± 3%   0.27ns ± 0%   -1.77%  (p=0.000 n=10+8)

(atomic.Value.Load is, of course, inlined. We would expect an uncontended
inline SeqAtomicLoad<int> to perform identically to SeqAtomicTryLoad<int>.) The
"regression" in BenchmarkSeqCountWriteUncontended, despite this CL changing
nothing in that path, is attributed to microarchitectural subtlety; the
benchmark loop is unchanged except for its address:

Before this CL:
  :0                    0x4e62d1                48ffc2                  INCQ DX
  :0                    0x4e62d4                48399110010000          CMPQ DX, 0x110(CX)
  :0                    0x4e62db                7e26                    JLE 0x4e6303
  :0                    0x4e62dd                90                      NOPL
  :0                    0x4e62de                bb01000000              MOVL $0x1, BX
  :0                    0x4e62e3                f00fc118                LOCK XADDL BX, 0(AX)
  :0                    0x4e62e7                ffc3                    INCL BX
  :0                    0x4e62e9                0fbae300                BTL $0x0, BX
  :0                    0x4e62ed                733a                    JAE 0x4e6329
  :0                    0x4e62ef                90                      NOPL
  :0                    0x4e62f0                bb01000000              MOVL $0x1, BX
  :0                    0x4e62f5                f00fc118                LOCK XADDL BX, 0(AX)
  :0                    0x4e62f9                ffc3                    INCL BX
  :0                    0x4e62fb                0fbae300                BTL $0x0, BX
  :0                    0x4e62ff                73d0                    JAE 0x4e62d1

After this CL:
  :0                    0x4e6361                48ffc2                  INCQ DX
  :0                    0x4e6364                48399110010000          CMPQ DX, 0x110(CX)
  :0                    0x4e636b                7e26                    JLE 0x4e6393
  :0                    0x4e636d                90                      NOPL
  :0                    0x4e636e                bb01000000              MOVL $0x1, BX
  :0                    0x4e6373                f00fc118                LOCK XADDL BX, 0(AX)
  :0                    0x4e6377                ffc3                    INCL BX
  :0                    0x4e6379                0fbae300                BTL $0x0, BX
  :0                    0x4e637d                733a                    JAE 0x4e63b9
  :0                    0x4e637f                90                      NOPL
  :0                    0x4e6380                bb01000000              MOVL $0x1, BX
  :0                    0x4e6385                f00fc118                LOCK XADDL BX, 0(AX)
  :0                    0x4e6389                ffc3                    INCL BX
  :0                    0x4e638b                0fbae300                BTL $0x0, BX
  :0                    0x4e638f                73d0                    JAE 0x4e6361

PiperOrigin-RevId: 329754148
2020-09-02 11:37:31 -07:00
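
For readers unfamiliar with sequence counters, the following is a minimal sketch of the pattern that the commit above optimizes. It is not gVisor's `sync.SeqCount`: it spins with `runtime.Gosched` rather than runtime-guided spinning, it assumes writers are serialized by the caller, and the `point` type is a hypothetical example of protected data. The epoch is a plain integer type here, echoing the first bullet of the commit message.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// SeqCountEpoch is a plain integer rather than a struct.
type SeqCountEpoch uint32

// SeqCount is odd while a writer is inside its critical section.
type SeqCount struct{ epoch uint32 }

// BeginRead waits until no writer is active and returns the epoch seen.
func (s *SeqCount) BeginRead() SeqCountEpoch {
	for {
		e := atomic.LoadUint32(&s.epoch)
		if e&1 == 0 {
			return SeqCountEpoch(e)
		}
		runtime.Gosched()
	}
}

// ReadOk reports whether no writer ran since BeginRead returned e.
func (s *SeqCount) ReadOk(e SeqCountEpoch) bool {
	return atomic.LoadUint32(&s.epoch) == uint32(e)
}

// BeginWrite makes the epoch odd. Writers must be serialized externally.
func (s *SeqCount) BeginWrite() {
	if atomic.AddUint32(&s.epoch, 1)&1 == 0 {
		panic("BeginWrite during a writer critical section")
	}
}

// EndWrite makes the epoch even again.
func (s *SeqCount) EndWrite() {
	if atomic.AddUint32(&s.epoch, 1)&1 != 0 {
		panic("EndWrite outside a writer critical section")
	}
}

// point is example data protected by a SeqCount. The 64-bit fields come
// first so they stay aligned for atomic access on 32-bit platforms.
type point struct {
	x, y int64
	seq  SeqCount
}

func (p *point) set(x, y int64) {
	p.seq.BeginWrite()
	atomic.StoreInt64(&p.x, x)
	atomic.StoreInt64(&p.y, y)
	p.seq.EndWrite()
}

// get retries until it observes x and y from the same write.
func (p *point) get() (int64, int64) {
	for {
		e := p.seq.BeginRead()
		x, y := atomic.LoadInt64(&p.x), atomic.LoadInt64(&p.y)
		if p.seq.ReadOk(e) {
			return x, y
		}
	}
}

func main() {
	var p point
	p.set(3, 4)
	fmt.Println(p.get())
}
```

Readers never block writers in this scheme; a reader that overlaps a write simply retries.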
Zach Koopmans b9b6660dc4 Add Docs to nginx benchmark.
Adds docs to nginx and refactors both Httpd and Nginx benchmarks.

Key changes:
- Add docs and make nginx tests the same as httpd (reverse, all docs, etc.).
- Make requests scale on c * b.N -> a request per thread. This works well
  with both --test.benchtime=10m (do a run that lasts at least 10m) and
  --test.benchtime=10x (do b.N = 10).
- Remove a doc from both tests (1000Kb) as 1024Kb exists.

PiperOrigin-RevId: 329751091
2020-09-02 11:22:17 -07:00
Ayush Ranjan 8ab08cdc01 [runtime tests] Exclude flaky nodejs test
PiperOrigin-RevId: 329749191
2020-09-02 11:13:02 -07:00
gVisor bot a0e4310384 Merge pull request #3822 from btw616:fix/issue-3821
PiperOrigin-RevId: 329710371
2020-09-02 07:42:19 -07:00
Zach Koopmans 563f28b7d5 Fix statfs test for opensource.
PiperOrigin-RevId: 329638946
2020-09-01 21:03:48 -07:00
Fabricio Voznika 37a217aca4 Implement setattr+clunk in 9P
This is to cover the common pattern: open->read/write->close,
where SetAttr needs to be called to update atime/mtime before
the file is closed.

Benchmark results:

BM_OpenReadClose/10240 CPU
setattr+clunk: 63783 ns
VFS2:          68109 ns
VFS1:          72507 ns

Updates #1198

PiperOrigin-RevId: 329628461
2020-09-01 19:22:12 -07:00
Mithun Iyer 40faeaa180 Fix handling of unacceptable ACKs during close.
On receiving an ACK with an unacceptable ACK number in a closing state,
TCP needs to reply with an ACK carrying the correct seq and ack numbers and
remain in the same state. This change is as per RFC793 page 37, but with the
difference that it does not apply to the ESTABLISHED state, just as in Linux.
Also add more tests to check for OTW sequence numbers and unacceptable
ack numbers in these states.

Fixes #3785

PiperOrigin-RevId: 329616283
2020-09-01 17:45:04 -07:00
Dean Deng c67d8ece09 Test opening file handles with different permissions.
These were problematic for vfs2 gofers before correctly implementing separate
read/write handles.

PiperOrigin-RevId: 329613261
2020-09-01 17:21:22 -07:00
Ayush Ranjan 2eaf54dd59 Refactor tty codebase to use master-replica terminology.
Updates #2972

PiperOrigin-RevId: 329584905
2020-09-01 14:43:41 -07:00
Nayana Bidari 04c284f8c2 Fix panic when calling dup2().
PiperOrigin-RevId: 329572337
2020-09-01 13:41:01 -07:00
Ayush Ranjan 723fb5c116 [go-marshal] Enable auto-marshalling for fs/tty.
PiperOrigin-RevId: 329564614
2020-09-01 13:02:17 -07:00
Fabricio Voznika 71589b7f7e Let flags be overridden from OCI annotations
This allows runsc flags to be set per sandbox instance. For
example, K8s pod annotations can be used to enable
--debug for a single pod, making troubleshooting much easier.
Similarly, features like --vfs2 can be enabled for
experimentation without affecting other pods in the node.

Closes #3494

PiperOrigin-RevId: 329542815
2020-09-01 11:12:19 -07:00
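
One way the per-sandbox override described in the commit above could work is sketched here with the standard `flag` package. The annotation prefix `dev.gvisor.flag.` and the helper `applyAnnotations` are assumptions for illustration; the commit message does not state the annotation key format, and this is not the runsc implementation.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// annotationPrefix is an assumed key prefix for flag-carrying annotations.
const annotationPrefix = "dev.gvisor.flag."

// applyAnnotations overrides flags in fs from matching annotations and
// ignores annotations without the prefix. Unknown flag names are errors.
func applyAnnotations(fs *flag.FlagSet, annotations map[string]string) error {
	for key, val := range annotations {
		if !strings.HasPrefix(key, annotationPrefix) {
			continue
		}
		name := strings.TrimPrefix(key, annotationPrefix)
		if err := fs.Set(name, val); err != nil {
			return fmt.Errorf("setting flag %q from annotation: %w", name, err)
		}
	}
	return nil
}

// demo builds a flag set with two boolean flags and applies annotations.
func demo(annotations map[string]string) (debug, vfs2 bool, err error) {
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	d := fs.Bool("debug", false, "enable debug logging")
	v := fs.Bool("vfs2", false, "enable VFS2")
	err = applyAnnotations(fs, annotations)
	return *d, *v, err
}

func main() {
	debug, vfs2, err := demo(map[string]string{
		"dev.gvisor.flag.debug": "true",
		"unrelated/annotation":  "ignored",
	})
	fmt.Println(debug, vfs2, err)
}
```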
Nayana Bidari 0eae08bc9e Automated rollback of changelist 328350576
PiperOrigin-RevId: 329526153
2020-09-01 09:54:55 -07:00
Ian Lewis f4be726fde Use 1080p background image.
This makes the background image on the top page 1/3 as big and allows it to
load in roughly half the time.

PiperOrigin-RevId: 329462030
2020-09-01 01:28:19 -07:00
Tiwei Bie 66ee7c0e98 Dup stdio FDs for VFS2 when starting a child container
Currently the stdio FDs are not dupped and will be closed
unexpectedly in VFS2 when starting a child container. This
patch fixes this issue.

Fixes: #3821

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
2020-09-01 15:51:08 +08:00
Zach Koopmans 6748438493 Fix bug in bazel build benchmark.
PiperOrigin-RevId: 329409802
2020-08-31 17:17:09 -07:00
Adin Scannell 101c97d6f8 Change nogo failures to test failures, instead of build failures.
PiperOrigin-RevId: 329408633
2020-08-31 17:09:20 -07:00
Jay Zhuang 170560cec0 Set errno on response when syscall actually fails
This prevents setting stale errno on responses.

Also fixes TestDiscardsUDPPacketsWithMcastSourceAddressV6 to use correct
multicast addresses in test.

Fixes #3793

PiperOrigin-RevId: 329391155
2020-08-31 15:31:42 -07:00
Jamie Liu 6cdfa4fee0 Don't use read-only host FD for writable gofer dentries in VFS2.
As documented for gofer.dentry.hostFD.

PiperOrigin-RevId: 329372319
2020-08-31 13:57:19 -07:00
Tamir Duberstein 9d0d82088a Remove __fuchsia__ defines
These mostly guard linux-only headers; check for linux instead.

PiperOrigin-RevId: 329362762
2020-08-31 13:10:56 -07:00
gVisor bot 911cecaa34 Implement walk in gvisor verity fs
Implement directory walking in the gVisor verity file system. For each step,
the child dentry is verified against a verified parent root hash.

PiperOrigin-RevId: 329358747
2020-08-31 12:52:21 -07:00
Ting-Yu Wang ba25485d96 stateify: Bring back struct field and type names in pretty print
PiperOrigin-RevId: 329349158
2020-08-31 12:06:00 -07:00