Commit Graph

32 Commits

Author SHA1 Message Date
Zach Koopmans e3fdd15932 [syserror] Update syserror to linuxerr for more errors.
Update the following from syserror to the linuxerr equivalent:
EEXIST
EFAULT
ENOTDIR
ENOTTY
EOPNOTSUPP
ERANGE
ESRCH

PiperOrigin-RevId: 384329869
2021-07-12 15:26:20 -07:00
Zach Koopmans 590b8d3e99 [syserror] Update several syserror errors to linuxerr equivalents.
Update/remove most syserror errors to linuxerr equivalents. For list
of removed errors, see //pkg/syserror/syserror.go.

PiperOrigin-RevId: 382574582
2021-07-01 12:05:19 -07:00
Zach Koopmans 6ef2684096 [syserror] Update syserror to linuxerr for EACCES, EBADF, and EPERM.
Update all instances of the above errors to the faster linuxerr implementation.
With the temporary linuxerr.Equals(), no logical changes are made.

PiperOrigin-RevId: 382306655
2021-06-30 08:18:59 -07:00
Zach Koopmans 54b71221c0 [syserror] Change syserror to linuxerr for E2BIG, EADDRINUSE, and EINVAL
Remove three syserror entries duplicated in linuxerr. Because of the
linuxerr.Equals method, this is a mere change of return values from
syserror to linuxerr definitions.

Done with only these three errnos as CLs removing all grow to a significantly
large size.

PiperOrigin-RevId: 382173835
2021-06-29 15:08:46 -07:00
Dean Deng 894187b2c6 Resolve remaining O_PATH TODOs.
O_PATH is now implemented in vfs2.

Fixes #2782.

PiperOrigin-RevId: 373861410
2021-05-14 14:04:46 -07:00
Rahat Mahmood 932c8abd0f Implement cgroupfs.
A skeleton implementation of cgroupfs. It supports trivial cpu and
memory controllers with no support for hierarchies.

PiperOrigin-RevId: 366561126
2021-04-02 21:10:44 -07:00
Bhasker Hariharan e7ca2a51a8 Add POLLRDNORM/POLLWRNORM support.
On Linux these are meant to be equivalent to POLLIN/POLLOUT. Rather
than hack these on in sys_poll etc it felt cleaner to just cleanup
the call sites to notify for both events. This is what linux does
as well.

Fixes #5544

PiperOrigin-RevId: 364859977
2021-03-24 12:11:44 -07:00
Dean Deng f6de413c39 Add cleanup TODO for integer-based proc files.
PiperOrigin-RevId: 356645022
2021-02-09 19:18:09 -08:00
Dean Deng f52f0101bb Implement F_GETLK fcntl.
Fixes #5113.

PiperOrigin-RevId: 353313374
2021-01-22 13:58:16 -08:00
Dean Deng 55332aca95 Move Lock/UnlockPOSIX into LockFD util.
PiperOrigin-RevId: 352904728
2021-01-20 16:55:07 -08:00
Adin Scannell 0a7075f38a Add basic stateify annotations.
Updates #1663

PiperOrigin-RevId: 333539293
2020-09-24 10:13:04 -07:00
Dean Deng 319d1b8ba0 Complete vfs2 implementation of fallocate.
This change includes overlay, special regular gofer files, and hostfs.

Fixes #3589.

PiperOrigin-RevId: 332330860
2020-09-17 15:38:44 -07:00
Ayush Ranjan d84ec6c42b [vfs] Capitalize x in the {Get/Set/Remove/List}xattr functions.
PiperOrigin-RevId: 330554450
2020-09-08 11:51:39 -07:00
Zach Koopmans 6a90c88b97 Port fallocate to VFS2.
PiperOrigin-RevId: 319283715
2020-07-01 13:14:44 -07:00
Dean Deng b5e814445a Fix procfs bugs in vfs2.
- Support writing on proc/[pid]/{uid,gid}map
- Return EIO for writing to static files.

Updates #2923.

PiperOrigin-RevId: 318188503
2020-06-24 19:22:12 -07:00
Fabricio Voznika 96519e2c9d Implement POSIX locks
- Change FileDescriptionImpl Lock/UnlockPOSIX signature to
  take {start,length,whence}, so the correct offset can be
  calculated in the implementations.
- Create PosixLocker interface to make it possible to share
  the same locking code from different implementations.

Closes #1480

PiperOrigin-RevId: 316910286
2020-06-17 10:04:26 -07:00
Fabricio Voznika 67565078bb Implement flock(2) in VFS2
LockFD is the generic implementation that can be embedded in
FileDescriptionImpl implementations. Unique lock ID is
maintained in vfs.FileDescription and is created on demand.

Updates #1480

PiperOrigin-RevId: 315604825
2020-06-09 18:46:42 -07:00
Dean Deng 09ddb5a426 Port extended attributes to VFS2.
As in VFS1, we only support the user.* namespace. Plumbing is added to tmpfs
and goferfs.
Note that because of the slightly different order of checks between VFS2 and
Linux, one of the xattr tests needs to be relaxed slightly.

Fixes #2363.

PiperOrigin-RevId: 305985121
2020-04-10 19:02:55 -07:00
Fabricio Voznika 2a6c4369be Enforce file size rlimits in VFS2
Updates #1035

PiperOrigin-RevId: 301255357
2020-03-16 16:00:49 -07:00
Dean Deng 960f6a975b Add plumbing for importing fds in VFS2, along with non-socket, non-TTY impl.
In VFS2, imported file descriptors are stored in a kernfs-based filesystem.
Upon calling ImportFD, the host fd can be accessed in two ways:
1. a FileDescription that can be added to the FDTable, and
2. a Dentry in the host.filesystem mount, which we will want to access through
magic symlinks in /proc/[pid]/fd/.

An implementation of the kernfs.Inode interface stores a unique host fd. This
inode can be inserted into file descriptions as well as dentries.

This change also plumbs in three FileDescriptionImpls corresponding to fds for
sockets, TTYs, and other files (only the latter is implemented here).
These implementations will mostly make corresponding syscalls to the host.
Where possible, the logic is ported over from pkg/sentry/fs/host.

Updates #1672

PiperOrigin-RevId: 299417263
2020-03-06 12:59:49 -08:00
Dean Deng 148fda60e8 Add plumbing for file locks in VFS2.
Updates #1480

PiperOrigin-RevId: 292180192
2020-01-29 11:39:28 -08:00
Fabricio Voznika 396c574db2 Add support for WritableSource in DynamicBytesFileDescriptionImpl
WritableSource is a convenience interface used for files that can
be written to, e.g. /proc/net/ipv4/tpc_sack. It reads max of 4KB
and only from offset 0 which should cover most cases. It can be
extended as neeed.

Updates #1195

PiperOrigin-RevId: 292056924
2020-01-28 18:31:28 -08:00
Adin Scannell 0e2f1b7abd Update package locations.
Because the abi will depend on the core types for marshalling (usermem,
context, safemem, safecopy), these need to be flattened from the sentry
directory. These packages contain no sentry-specific details.

PiperOrigin-RevId: 291811289
2020-01-27 15:31:32 -08:00
Ian Gudger 27500d529f New sync package.
* Rename syncutil to sync.
* Add aliases to sync types.
* Replace existing usage of standard library sync package.

This will make it easier to swap out synchronization primitives. For example,
this will allow us to use primitives from github.com/sasha-s/go-deadlock to
check for lock ordering violations.

Updates #1472

PiperOrigin-RevId: 289033387
2020-01-09 22:02:24 -08:00
Jamie Liu 1f384ac42b Add VFS2 support for device special files.
- Add FileDescriptionOptions.UseDentryMetadata, which reduces the amount of
  boilerplate needed for device FDs and the like between filesystems.

- Switch back to having FileDescription.Init() take references on the Mount and
  Dentry; otherwise managing refcounts around failed calls to
  OpenDeviceSpecialFile() / Device.Open() is tricky.

PiperOrigin-RevId: 287575574
2019-12-30 11:36:41 -08:00
Fabricio Voznika 3c125eb219 Initial procfs implementation in VFSv2
Updates #1195

PiperOrigin-RevId: 287227722
2019-12-26 14:45:35 -08:00
Jamie Liu 744401297a Add VFS2 plumbing for extended attributes.
PiperOrigin-RevId: 286281274
2019-12-18 15:48:45 -08:00
Jamie Liu 93d429d5b1 Implement memmap.MappingIdentity for vfs.FileDescription.
PiperOrigin-RevId: 285255855
2019-12-12 13:19:33 -08:00
Jamie Liu 0457a4c4cb Minor vfs.FileDescriptionImpl fixes.
- Pass context.Context to OnClose().

- Pass memmap.MMapOpts to ConfigureMMap() by pointer so that implementations
  can actually mutate it as required.

PiperOrigin-RevId: 274934967
2019-10-15 18:40:45 -07:00
Ayush Ranjan 4bab7d7f08 vfs: Remove vfs.DefaultDirectoryFD from embedding vfs.DefaultFD.
This fixes the implementation ambiguity issues when a filesystem
implementation embeds vfs.DefaultDirectoryFD to its directory FD along
with an internal common fileDescription utility.

For similar reasons also removes FileDescriptionDefaultImpl from
DynamicBytesFileDescriptionImpl.

PiperOrigin-RevId: 263795513
2019-08-16 10:20:11 -07:00
Jamie Liu cee044c2ab Add vfs.DynamicBytesFileDescriptionImpl.
This replaces fs/proc/seqfile for vfs2-based filesystems.

PiperOrigin-RevId: 263254647
2019-08-13 17:54:24 -07:00
Jamie Liu 163ab5e9ba Sentry virtual filesystem, v2
Major differences from the current ("v1") sentry VFS:

- Path resolution is Filesystem-driven (FilesystemImpl methods call
vfs.ResolvingPath methods) rather than VFS-driven (fs package owns a
Dirent tree and calls fs.InodeOperations methods to populate it). This
drastically improves performance, primarily by reducing overhead from
inefficient synchronization and indirection. It also makes it possible
to implement remote filesystem protocols that translate FS system calls
into single RPCs, rather than having to make (at least) one RPC per path
component, significantly reducing the latency of remote filesystems
(especially during cold starts and for uncacheable shared filesystems).

- Mounts are correctly represented as a separate check based on
contextual state (current mount) rather than direct replacement in a
fs.Dirent tree. This makes it possible to support (non-recursive) bind
mounts and mount namespaces.

Included in this CL is fsimpl/memfs, an incomplete in-memory filesystem
that exists primarily to demonstrate intended filesystem implementation
patterns and for benchmarking:

BenchmarkVFS1TmpfsStat/1-6               3000000               497 ns/op
BenchmarkVFS1TmpfsStat/2-6               2000000               676 ns/op
BenchmarkVFS1TmpfsStat/3-6               2000000               904 ns/op
BenchmarkVFS1TmpfsStat/8-6               1000000              1944 ns/op
BenchmarkVFS1TmpfsStat/64-6               100000             14067 ns/op
BenchmarkVFS1TmpfsStat/100-6               50000             21700 ns/op
BenchmarkVFS2MemfsStat/1-6              10000000               197 ns/op
BenchmarkVFS2MemfsStat/2-6               5000000               233 ns/op
BenchmarkVFS2MemfsStat/3-6               5000000               268 ns/op
BenchmarkVFS2MemfsStat/8-6               3000000               477 ns/op
BenchmarkVFS2MemfsStat/64-6               500000              2592 ns/op
BenchmarkVFS2MemfsStat/100-6              300000              4045 ns/op
BenchmarkVFS1TmpfsMountStat/1-6          2000000               679 ns/op
BenchmarkVFS1TmpfsMountStat/2-6          2000000               912 ns/op
BenchmarkVFS1TmpfsMountStat/3-6          1000000              1113 ns/op
BenchmarkVFS1TmpfsMountStat/8-6          1000000              2118 ns/op
BenchmarkVFS1TmpfsMountStat/64-6                  100000             14251 ns/op
BenchmarkVFS1TmpfsMountStat/100-6                 100000             22397 ns/op
BenchmarkVFS2MemfsMountStat/1-6                  5000000               317 ns/op
BenchmarkVFS2MemfsMountStat/2-6                  5000000               361 ns/op
BenchmarkVFS2MemfsMountStat/3-6                  5000000               387 ns/op
BenchmarkVFS2MemfsMountStat/8-6                  3000000               582 ns/op
BenchmarkVFS2MemfsMountStat/64-6                  500000              2699 ns/op
BenchmarkVFS2MemfsMountStat/100-6                 300000              4133 ns/op

From this we can infer that, on this machine:

- Constant cost for tmpfs stat() is ~160ns in VFS2 and ~280ns in VFS1.

- Per-path-component cost is ~35ns in VFS2 and ~215ns in VFS1, a
difference of about 6x.

- The cost of crossing a mount boundary is about 80ns in VFS2
(MemfsMountStat/1 does approximately the same amount of work as
MemfsStat/2, except that it also crosses a mount boundary). This is an
inescapable cost of the separate mount lookup needed to support bind
mounts and mount namespaces.

PiperOrigin-RevId: 258853946
2019-07-18 15:10:29 -07:00