Some CPUs(eg: ampere-emag) can speculate past an ERET instruction and potentially perform
speculative accesses to memory before processing the exception return.
Since the register state is often controlled by a lower privilege level
at the point of an ERET, this could potentially be used as part of a
side-channel attack.
Signed-off-by: Bin Lu <bin.lu@arm.com>
Also move VFS.MakeSyntheticMountpoint() (which is a utility wrapper around
VFS.MkdirAllAt(), itself a utility wrapper around VFS.MkdirAt()) to not be in
the middle of the implementation of these proc files.
Fixes#3878
PiperOrigin-RevId: 330843106
This feature is too expensive for runsc, even with setattrclunk, because
fsgofer.localFile.SetAttr() ends up needing to call reopenProcFD(), incurring
two string allocations for the FD pathname, an fd.FD allocation, and two calls
to runtime.SetFinalizer() when the fd.FD is created and closed respectively
(b/133767962) (plus the actual cost of the syscalls, which is negligible).
PiperOrigin-RevId: 330843012
Adds a Docker Compose tutorial to the website that shows how to start a
Wordpress site and includes information about how to get DNS working.
Fixes#115
PiperOrigin-RevId: 330652842
- BindSocketThenOpen test was expecting the incorrect error when opening
a socket. Fixed that.
- VirtualFilesystem.BindEndpointAt should not require pop.Path.Begin.Ok()
because the filesystem implementations do not need to walk to the parent
dentry. This check also exists for MknodAt, MkdirAt, RmdirAt, SymlinkAt and
UnlinkAt but those filesystem implementations also need to walk to the parent
denty. So that check is valid. Added some syscall tests to test this.
PiperOrigin-RevId: 330625220
The check in verity walk returns error for non ENOENT cases, and all
ENOENT results should be checked. This case was missing.
PiperOrigin-RevId: 330604771
ioctl with FS_IOC_ENABLE_VERITY is added to verity file system to enable
a file as verity file. For a file, a Merkle tree is built with its data.
For a directory, a Merkle tree is built with the root hashes of its
children.
PiperOrigin-RevId: 330604368
overlay/filesystem.go:lookupLocked() did not DecRef the VD on some error paths
when it would not end up saving or using the VD.
PiperOrigin-RevId: 330589742
Fixes pkg/tcpip/stack:stack_test flake experienced while running
TestCacheResolution with gotsan. This occurs when the test-runner takes longer
than the resolution timeout to call linkAddrCache.get.
In this test we don't care about the resolution timeout, so set it to the
maximum and rely on test-runner timeouts to avoid deadlocks.
PiperOrigin-RevId: 330566250
The existing implementation for TransportProtocol.{Set}Option take
arguments of an empty interface type which all types (implicitly)
implement; any type may be passed to the functions.
This change introduces marker interfaces for transport protocol options
that may be set or queried which transport protocol option types
implement to ensure that invalid types are caught at compile time.
Different interfaces are used to allow the compiler to enforce read-only
or set-only socket options.
RELNOTES: n/a
PiperOrigin-RevId: 330559811
The args.MountNamespaceVFS2 is used again after the nil check,
instead, mntnsVFS2 which holds the expected reference should be
used. This patch fixes this issue.
Fixes: #3855
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
This change makes the following fixes:
- When creating a test repo.key, create a secret keyring as other workflows
also use secret keyrings only.
- We should not be using both --keyring and --secret-keyring options. Just use
--secret-keyring.
- Pass homedir to all gpg commands. dpkg-sig takes an arg -g which stands for
gpgopts. So we need to pass the homedir there too.
PiperOrigin-RevId: 330443280
VFS1 and VFS2 host FDs have different dupping behavior,
making error prone to code for both. Change the contract
so that FDs are released as they are used, so the caller
can simple defer a block that closes all remaining files.
This also addresses handling of partial failures.
With this fix, more VFS2 tests can be enabled.
Updates #1487
PiperOrigin-RevId: 330112266
stack.cleanupEndpoints is protected by the stack.mu but that can cause
contention as the stack mutex is already acquired in a lot of hot paths during
new endpoint creation /cleanup etc. Moving this to a fine grained mutex should
reduce contention on the stack.mu.
PiperOrigin-RevId: 330026151
b/166980357#comment56 shows:
- 837 goroutines blocked in:
gvisor/pkg/sync/sync.(*RWMutex).Lock
gvisor/pkg/tcpip/stack/stack.(*Stack).StartTransportEndpointCleanup
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).cleanupLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).completeWorkerLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop.func1
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop
- 695 goroutines blocked in:
gvisor/pkg/sync/sync.(*RWMutex).Lock
gvisor/pkg/tcpip/stack/stack.(*Stack).CompleteTransportEndpointCleanup
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).cleanupLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).completeWorkerLocked
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop.func1
gvisor/pkg/tcpip/transport/tcp/tcp.(*endpoint).protocolMainLoop
- 3882 goroutines blocked in:
gvisor/pkg/sync/sync.(*RWMutex).Lock
gvisor/pkg/tcpip/stack/stack.(*Stack).GetTCPProbe
gvisor/pkg/tcpip/transport/tcp/tcp.newEndpoint
gvisor/pkg/tcpip/transport/tcp/tcp.(*protocol).NewEndpoint
gvisor/pkg/tcpip/stack/stack.(*Stack).NewEndpoint
All of these are contending on Stack.mu. Stack.StartTransportEndpointCleanup()
and Stack.CompleteTransportEndpointCleanup() insert/delete TransportEndpoints
in a map (Stack.cleanupEndpoints), and the former also does endpoint
unregistration while holding Stack.mu, so it's not immediately clear how
feasible it is to replace the map with a mutex-less implementation or how much
doing so would help. However, Stack.GetTCPProbe() just reads a function object
(Stack.tcpProbeFunc) that is almost always nil (as far as I can tell,
Stack.AddTCPProbe() is only called in tests), and it's called for every new TCP
endpoint. So converting it to an atomic.Value should significantly reduce
contention on Stack.mu, improving TCP endpoint creation latency and allowing
TCP endpoint cleanup to proceed.
PiperOrigin-RevId: 330004140
Update the cniVersion used in the CNI tutorial so that it works with
containerd 1.2. Containerd 1.2 includes a version of the cri plugin
(release/1.2) that, in turn, includes a version of the
cni library (0.6.0) that only supports up to 0.3.1.
https://github.com/containernetworking/cni/blob/v0.6.0/pkg/version/version.go#L38
PiperOrigin-RevId: 329837188
Accept on gVisor will return an error if a socket in the accept queue was closed
before Accept() was called. Linux will return the new fd even if the returned
socket is already closed by the peer say due to a RST being sent by the peer.
This seems to be intentional in linux more details on the github issue.
Fixes#3780
PiperOrigin-RevId: 329828404
Adds docs to nginx and refactors both Httpd and Nginx benchmarks.
Key changes:
- Add docs and make nginx tests the same as httpd (reverse, all docs, etc.).
- Make requests scale on c * b.N -> a request per thread. This works well
with both --test.benchtime=10m (do a run that lasts at least 10m) and
--test.benchtime=10x (do b.N = 10).
-- Remove a doc from both tests (1000Kb) as 1024Kb exists.
PiperOrigin-RevId: 329751091
This is to cover the common pattern: open->read/write->close,
where SetAttr needs to be called to update atime/mtime before
the file is closed.
Benchmark results:
BM_OpenReadClose/10240 CPU
setattr+clunk: 63783 ns
VFS2: 68109 ns
VFS1: 72507 ns
Updates #1198
PiperOrigin-RevId: 329628461
On receiving an ACK with unacceptable ACK number, in a closing state,
TCP, needs to reply back with an ACK with correct seq and ack numbers and
remain in same state. This change is as per RFC793 page 37, but with a
difference that it does not apply to ESTABLISHED state, just as in Linux.
Also add more tests to check for OTW sequence number and unacceptable
ack numbers in these states.
Fixes#3785
PiperOrigin-RevId: 329616283
This allows runsc flags to be set per sandbox instance. For
example, K8s pod annotations can be used to enable
--debug for a single pod, making troubleshoot much easier.
Similarly, features like --vfs2 can be enabled for
experimentation without affecting other pods in the node.
Closes#3494
PiperOrigin-RevId: 329542815