Commit Graph

224 Commits

Author SHA1 Message Date
Ayush Ranjan 6240abb205 Added boilerplate code for ext4 fs.
Initialized BUILD with license
Mount is still unimplemented and is not meant to be
part of this CL. Rest of the fs interface is implemented.
Referenced the Linux kernel appropriately when needed

PiperOrigin-RevId: 249741997
Change-Id: Id1e4c7c9e68b3f6946da39896fc6a0c3dcd7f98c
2019-05-23 16:55:42 -07:00
Fabricio Voznika 9006304dfe Initial support for bind mounts
Separate MountSource from Mount. This is needed to allow
mounts to be shared by multiple containers within the same
pod.

PiperOrigin-RevId: 249617810
Change-Id: Id2944feb7e4194951f355cbe6d4944ae3c02e468
2019-05-23 04:16:10 -07:00
Adin Scannell 21915eb58b Remove obsolete TODO.
There no obvious reason to require that BlockSize and StatFS
are MountSource operations. Today they are in INodeOperations,
and they can be moved elsewhere in the future as part of a
normal refactor process.

PiperOrigin-RevId: 249549982
Change-Id: Ib832e02faeaf8253674475df4e385bcc53d780f3
2019-05-22 17:00:36 -07:00
Adin Scannell 9cdae51fec Add basic plumbing for splice and stub implementation.
This does not actually implement an efficient splice or sendfile. Rather, it
adds a generic plumbing to the file internals so that this can be added. All
file implementations use the stub fileutil.NoSplice implementation, which
causes sendfile and splice to fall back to an internal copy.

A basic splice system call interface is added, along with a test.

PiperOrigin-RevId: 249335960
Change-Id: Ic5568be2af0a505c19e7aec66d5af2480ab0939b
2019-05-21 15:18:12 -07:00
Neel Natu adeb99709b Remove unused struct member.
Remove unused struct member.

PiperOrigin-RevId: 249300446
Change-Id: Ifb16538f684bc3200342462c3da927eb564bf52d
2019-05-21 12:20:19 -07:00
Michael Pratt 80cc2c78e5 Forward named pipe creation to the gofer
The backing 9p server must allow named pipe creation, which the runsc
fsgofer currently does not.

There are small changes to the overlay here. GetFile may block when
opening a named pipe, which can cause a deadlock:

1. open(O_RDONLY) -> copyMu.Lock() -> GetFile()
2. open(O_WRONLY) -> copyMu.Lock() -> Deadlock

A named pipe usable for writing must already be on the upper filesystem,
but we are still taking copyMu for write when checking for upper. That
can be changed to a read lock to fix the common case.

However, a named pipe on the lower filesystem would still deadlock in
open(O_WRONLY) when it tries to actually perform copy up (which would
simply return EINVAL). Move the copy up type check before taking copyMu
for write to avoid this.

p9 must be modified, as it was incorrectly removing the file mode when
sending messages on the wire.

PiperOrigin-RevId: 249154033
Change-Id: Id6637130e567b03758130eb6c7cdbc976384b7d6
2019-05-20 16:53:08 -07:00
Michael Pratt 6588427451 Fix incorrect tmpfs timestamp updates
* Creation of files, directories (and other fs objects) in a directory
  should always update ctime.
* Same for removal.
* atime should not be updated on lookup, only readdir.

I've also renamed some misleading functions that update mtime and ctime.

PiperOrigin-RevId: 249115063
Change-Id: I30fa275fa7db96d01aa759ed64628c18bb3a7dc7
2019-05-20 13:35:17 -07:00
Michael Pratt 4a842836e5 Return EPERM for mknod
This more directly matches what Linux does with unsupported
nodes.

PiperOrigin-RevId: 248780425
Change-Id: I17f3dd0b244f6dc4eb00e2e42344851b8367fbec
2019-05-17 13:47:40 -07:00
Michael Pratt 04105781ad Fix gofer rename ctime and cleanup stat_times test
There is a lot of redundancy that we can simplify in the stat_times
test. This will make it easier to add new tests. However, the
simplification reveals that cached uattrs on goferfs don't properly
update ctime on rename.

PiperOrigin-RevId: 248773425
Change-Id: I52662728e1e9920981555881f9a85f9ce04041cf
2019-05-17 13:05:47 -07:00
Andrei Vagin 2105158d4b gofer: don't call hostfile.Close if hostFile is nil
PiperOrigin-RevId: 248437159
Change-Id: Ife71f6ca032fca59ec97a82961000ed0af257101
2019-05-15 17:21:10 -07:00
Nicolas Lacasse dd153c014d Start of support for /proc/pid/cgroup file.
PiperOrigin-RevId: 248263378
Change-Id: Ic057d2bb0b6212110f43ac4df3f0ac9bf931ab98
2019-05-14 20:34:50 -07:00
Michael Pratt 330a1bbd04 Remove false comment
PiperOrigin-RevId: 248249285
Change-Id: I9b6d267baa666798b22def590ff20c9a118efd47
2019-05-14 18:06:14 -07:00
Fabricio Voznika 1bee43be13 Implement fallocate(2)
Closes #225

PiperOrigin-RevId: 247508791
Change-Id: I04f47cf2770b30043e5a272aba4ba6e11d0476cc
2019-05-09 15:35:49 -07:00
Nicolas Lacasse bfd9f75ba4 Set the FilesytemType in MountSource from the Filesystem.
And stop storing the Filesystem in the MountSource.

This allows us to decouple the MountSource filesystem type from the name of the
filesystem.

PiperOrigin-RevId: 247292982
Change-Id: I49cbcce3c17883b7aa918ba76203dfd6d1b03cc8
2019-05-08 14:35:06 -07:00
Fabricio Voznika e5432fa1b3 Remove defers from gofer.contextFile
Most are single line methods in hot paths.

PiperOrigin-RevId: 247050267
Change-Id: I428d78723fe00b57483185899dc8fa9e1f01e2ea
2019-05-07 10:55:09 -07:00
Andrei Vagin 24d8656585 gofer: don't leak file descriptors
Fixes #219

PiperOrigin-RevId: 246568639
Change-Id: Ic7afd15dde922638d77f6429c508d1cbe2e4288a
2019-05-03 14:01:50 -07:00
Michael Pratt 23ca9886c6 Update reference to old type
PiperOrigin-RevId: 246036806
Change-Id: I5554a43a1f8146c927402db3bf98488a2da0fbe7
2019-04-30 15:42:39 -07:00
Jamie Liu 8bfb83d0ac Implement async MemoryFile eviction, and use it in CachingInodeOperations.
This feature allows MemoryFile to delay eviction of "optional"
allocations, such as unused cached file pages.

Note that this incidentally makes CachingInodeOperations writeback
asynchronous, in the sense that it doesn't occur until eviction; this is
necessary because between when a cached page becomes evictable and when
it's evicted, file writes (via CachingInodeOperations.Write) may dirty
the page.

As currently implemented, this feature won't meaningfully impact
steady-state memory usage or caching; the reclaimer goroutine will
schedule eviction as soon as it runs out of other work to do. Future CLs
increase caching by adding constraints on when eviction is scheduled.

PiperOrigin-RevId: 246014822
Change-Id: Ia85feb25a2de92a48359eb84434b6ec6f9bea2cb
2019-04-30 13:56:41 -07:00
Ian Gudger 81ecd8b6ea Implement the MSG_CTRUNC msghdr flag for Unix sockets.
Updates google/gvisor#206

PiperOrigin-RevId: 245880573
Change-Id: Ifa715e98d47f64b8a32b04ae9378d6cd6bd4025e
2019-04-29 21:21:08 -07:00
Michael Pratt 4d52a55201 Change copyright notice to "The gVisor Authors"
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.

1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.

Fixes #209

PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29 14:26:23 -07:00
Nicolas Lacasse f4ce43e1f4 Allow and document bug ids in gVisor codebase.
PiperOrigin-RevId: 245818639
Change-Id: I03703ef0fb9b6675955637b9fe2776204c545789
2019-04-29 14:04:14 -07:00
Kevin Krakauer 5f13338d30 Fix reference counting bug in /proc/PID/fdinfo/.
PiperOrigin-RevId: 245452217
Change-Id: I7164d8f57fe34c17e601079eb9410a6d95af1869
2019-04-26 11:09:55 -07:00
Jamie Liu 6b76c172b4 Don't enforce NAME_MAX in fs.Dirent.walk().
Maximum filename length is filesystem-dependent, and obtained via
statfs::f_namelen. This limit is usually 255 bytes (NAME_MAX), but not
always. For example, VFAT supports filenames of up to 255... UCS-2
characters, which Linux conservatively takes to mean UTF-8-encoded
bytes: fs/fat/inode.c:fat_statfs(), FAT_LFN_LEN * NLS_MAX_CHARSET_SIZE.
As a result, Linux's VFS does not enforce NAME_MAX:

$ rg --maxdepth=1 '\WNAME_MAX\W' fs/ include/linux/
fs/libfs.c
38:     buf->f_namelen = NAME_MAX;
64:     if (dentry->d_name.len > NAME_MAX)

include/linux/relay.h
74:     char base_filename[NAME_MAX];   /* saved base filename */

include/linux/fscrypt.h
149: * filenames up to NAME_MAX bytes, since base64 encoding expands the length.

include/linux/exportfs.h
176: *    understanding that it is already pointing to a a %NAME_MAX+1 sized

Remove this check from core VFS, and add it to ramfs (and by extension
tmpfs), where it is actually applicable:
mm/shmem.c:shmem_dir_inode_operations.lookup == simple_lookup *does*
enforce NAME_MAX.

PiperOrigin-RevId: 245324748
Change-Id: I17567c4324bfd60e31746a5270096e75db963fac
2019-04-25 16:05:13 -07:00
Michael Pratt d6aac9387f Fix doc typo
PiperOrigin-RevId: 244773890
Change-Id: I2d0cd7789771276ba545b38efff6d3e24133baaa
2019-04-22 18:22:19 -07:00
Fabricio Voznika c8cee7108f Use FD limit and file size limit from host
FD limit and file size limit is read from the host, instead
of using hard-coded defaults, given that they effect the sandbox
process. Also limit the direct cache to use no more than half
if the available FDs.

PiperOrigin-RevId: 244050323
Change-Id: I787ad0fdf07c49d589e51aebfeae477324fe26e6
2019-04-17 12:57:40 -07:00
Jamie Liu 4209edafb6 Use open fids when fstat()ing gofer files.
PiperOrigin-RevId: 243018347
Change-Id: I1e5b80607c1df0747482abea61db7fcf24536d37
2019-04-11 00:43:04 -07:00
Nicolas Lacasse d93d19fd4e Fix uses of RootFromContext.
RootFromContext can return a dirent with reference taken, or nil. We must call
DecRef if (and only if) a real dirent is returned.

PiperOrigin-RevId: 242965515
Change-Id: Ie2b7b4cb19ee09b6ccf788b71f3fd7efcdf35a11
2019-04-10 16:36:28 -07:00
Yong He 89cc8eef9b DATA RACE in fs.(*Dirent).fullName
add renameMu.Lock when oldParent == newParent
in order to avoid data race in following report:

WARNING: DATA RACE
Read at 0x00c000ba2160 by goroutine 405:
  gvisor.googlesource.com/gvisor/pkg/sentry/fs.(*Dirent).fullName()
      pkg/sentry/fs/dirent.go:246 +0x6c
  gvisor.googlesource.com/gvisor/pkg/sentry/fs.(*Dirent).FullName()
      pkg/sentry/fs/dirent.go:356 +0x8b
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*FDMap).String()
      pkg/sentry/kernel/fd_map.go:135 +0x1e0
  fmt.(*pp).handleMethods()
      GOROOT/src/fmt/print.go:603 +0x404
  fmt.(*pp).printArg()
      GOROOT/src/fmt/print.go:686 +0x255
  fmt.(*pp).doPrintf()
      GOROOT/src/fmt/print.go:1003 +0x33f
  fmt.Fprintf()
      GOROOT/src/fmt/print.go:188 +0x7f
  gvisor.googlesource.com/gvisor/pkg/log.(*Writer).Emit()
      pkg/log/log.go:121 +0x89
  gvisor.googlesource.com/gvisor/pkg/log.GoogleEmitter.Emit()
      pkg/log/glog.go:162 +0x1acc
  gvisor.googlesource.com/gvisor/pkg/log.(*GoogleEmitter).Emit()
      <autogenerated>:1 +0xe1
  gvisor.googlesource.com/gvisor/pkg/log.(*BasicLogger).Debugf()
      pkg/log/log.go:177 +0x111
  gvisor.googlesource.com/gvisor/pkg/log.Debugf()
      pkg/log/log.go:235 +0x66
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).Debugf()
      pkg/sentry/kernel/task_log.go:48 +0xfe
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).DebugDumpState()
      pkg/sentry/kernel/task_log.go:66 +0x11f
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*runApp).execute()
      pkg/sentry/kernel/task_run.go:272 +0xc80
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).run()
      pkg/sentry/kernel/task_run.go:91 +0x24b

Previous write at 0x00c000ba2160 by goroutine 423:
  gvisor.googlesource.com/gvisor/pkg/sentry/fs.Rename()
      pkg/sentry/fs/dirent.go:1628 +0x61f
  gvisor.googlesource.com/gvisor/pkg/sentry/syscalls/linux.renameAt.func1.1()
      pkg/sentry/syscalls/linux/sys_file.go:1864 +0x1f8
  gvisor.googlesource.com/gvisor/pkg/sentry/syscalls/linux.fileOpAt(  gvisor.googlesource.com/g/linux/sys_file.go:51 +0x20f
  gvisor.googlesource.com/gvisor/pkg/sentry/syscalls/linux.renameAt.func1()
      pkg/sentry/syscalls/linux/sys_file.go:1852 +0x218
  gvisor.googlesource.com/gvisor/pkg/sentry/syscalls/linux.fileOpAt()
      pkg/sentry/syscalls/linux/sys_file.go:51 +0x20f
  gvisor.googlesource.com/gvisor/pkg/sentry/syscalls/linux.renameAt()
      pkg/sentry/syscalls/linux/sys_file.go:1840 +0x180
  gvisor.googlesource.com/gvisor/pkg/sentry/syscalls/linux.Rename()
      pkg/sentry/syscalls/linux/sys_file.go:1873 +0x60
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).executeSyscall()
      pkg/sentry/kernel/task_syscall.go:165 +0x17a
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).doSyscallInvoke()
      pkg/sentry/kernel/task_syscall.go:283 +0xb4
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).doSyscallEnter()
      pkg/sentry/kernel/task_syscall.go:244 +0x10c
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).doSyscall()
      pkg/sentry/kernel/task_syscall.go:219 +0x1e3
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*runApp).execute()
      pkg/sentry/kernel/task_run.go:215 +0x15a9
  gvisor.googlesource.com/gvisor/pkg/sentry/kernel.(*Task).run()
      pkg/sentry/kernel/task_run.go:91 +0x24b

Reported-by: syzbot+e1babbf756fab380dfff@syzkaller.appspotmail.com
Change-Id: Icd2620bb3ea28b817bf0672d454a22b9d8ee189a
PiperOrigin-RevId: 242938741
2019-04-10 14:17:33 -07:00
Nicolas Lacasse 0a0619216e Start saving MountSource.DirentCache.
DirentCache is already a savable type, and it ensures that it is empty at the
point of Save.  There is no reason not to save it along with the MountSource.

This did uncover an issue where not all MountSources were properly flushed
before Save.  If a mount point has an open file and is then unmounted, we save
the MountSource without flushing it first.  This CL also fixes that by flushing
all MountSources for all open FDs on Save.

PiperOrigin-RevId: 242906637
Change-Id: I3acd9d52b6ce6b8c989f835a408016cb3e67018f
2019-04-10 11:27:16 -07:00
Shiva Prasanth 7140b1fdca Fixed /proc/cpuinfo permissions
This also applies these permissions to other static proc files.

Change-Id: I4167e585fed49ad271aa4e1f1260babb3239a73d
PiperOrigin-RevId: 242898575
2019-04-10 10:49:43 -07:00
Jamie Liu 9471c01348 Export kernel.SignalInfoPriv.
Also add kernel.SignalInfoNoInfo, and use it in RLIMIT_FSIZE checks.

PiperOrigin-RevId: 242562428
Change-Id: I4887c0e1c8f5fddcabfe6d4281bf76d2f2eafe90
2019-04-08 16:32:11 -07:00
Nicolas Lacasse 70906f1d24 Intermediate ram fs dirs should be writable.
We construct a ramfs tree of "scaffolding" directories for all mount points, so
that a directory exists that each mount point can be mounted over.

We were creating these directories without write permissions, which meant that
they were not wribable even when underlayed under a writable filesystem. They
should be writable.

PiperOrigin-RevId: 242507789
Change-Id: I86645e35417560d862442ff5962da211dbe9b731
2019-04-08 11:56:38 -07:00
Nicolas Lacasse ee7e6d33b2 Use string type for extended attribute values, instead of []byte.
Strings are a better fit for this usage because they are immutable in Go, and
can contain arbitrary bytes. It also allows us to avoid casting bytes to string
(and the associated allocation) in the hot path when checking for overlay
whiteouts.

PiperOrigin-RevId: 242208856
Change-Id: I7699ae6302492eca71787dd0b72e0a5a217a3db2
2019-04-05 15:49:39 -07:00
Andrei Vagin 88409e983c gvisor: Add support for the MS_NOEXEC mount option
https://github.com/google/gvisor/issues/145

PiperOrigin-RevId: 242044115
Change-Id: I8f140fe05e32ecd438b6be218e224e4b7fe05878
2019-04-04 17:43:53 -07:00
Nicolas Lacasse 61d8c361c6 Don't release d.mu in checks for child-existence.
Dirent.exists() is called in Create to check whether a child with the given
name already exists.

Dirent.exists() calls walk(), and before this CL allowed walk() to drop d.mu
while calling d.Inode.Lookup. During this existence check, a racing Rename()
can acquire d.mu and create a new child of the dirent with the same name.
(Note that the source and destination of the rename must be in the same
directory, otherwise renameMu will be taken preventing the race.) In this
case, d.exists() can return false, even though a child with the same name
actually does exist.

This CL changes d.exists() so that it does not release d.mu while walking, thus
preventing the race with Rename.

It also adds comments noting that lockForRename may not take renameMu if the
source and destination are in the same directory, as this is a bit surprising
(at least it was to me).

PiperOrigin-RevId: 241842579
Change-Id: I56524870e39dfcd18cab82054eb3088846c34813
2019-04-03 17:53:56 -07:00
Kevin Krakauer 82529becae Fix index out of bounds in tty implementation.
The previous implementation revolved around runes instead of bytes, which caused
weird behavior when converting between the two. For example, peekRune would read
the byte 0xff from a buffer, convert it to a rune, then return it. As rune is an
alias of int32, 0xff was 0-padded to int32(255), which is the hex code point for
?. However, peekRune also returned the length of the byte (1). When calling
utf8.EncodeRune, we only allocated 1 byte, but tried the write the 2-byte
character ?.

tl;dr: I apparently didn't understand runes when I wrote this.

PiperOrigin-RevId: 241789081
Change-Id: I14c788af4d9754973137801500ef6af7ab8a8727
2019-04-03 13:00:34 -07:00
Kevin Krakauer c79e81bd27 Addresses data race in tty implementation.
Also makes the safemem reading and writing inline, as it makes it easier to see
what locks are held.

PiperOrigin-RevId: 241775201
Change-Id: Ib1072f246773ef2d08b5b9a042eb7e9e0284175c
2019-04-03 11:49:55 -07:00
Nicolas Lacasse 1776ab28f0 Add test that symlinking over a directory returns EEXIST.
Also remove comments in InodeOperations that required that implementation of
some Create* operations ensure that the name does not already exist, since
these checks are all centralized in the Dirent.

PiperOrigin-RevId: 241637335
Change-Id: Id098dc6063ff7c38347af29d1369075ad1e89a58
2019-04-02 17:28:36 -07:00
Wei Zhang 1fcd40719d device: fix device major/minor
Current gvisor doesn't give devices a right major and minor number.

When testing golang supporting of gvisor, I run the test case below:

```
$ docker run -ti --runtime runsc golang:1.12.1 bash -c "cd /usr/local/go/src && ./run.bash "
```

And it reports some errors, one of them is:

"--- FAIL: TestDevices (0.00s)
    --- FAIL: TestDevices//dev/null_1:3 (0.00s)
        dev_linux_test.go:45: for /dev/null Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/null Minor(0x0) == 0, want 3
        dev_linux_test.go:51: for /dev/null Mkdev(1, 3) == 0x103, want 0x0
    --- FAIL: TestDevices//dev/zero_1:5 (0.00s)
        dev_linux_test.go:45: for /dev/zero Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/zero Minor(0x0) == 0, want 5
        dev_linux_test.go:51: for /dev/zero Mkdev(1, 5) == 0x105, want 0x0
    --- FAIL: TestDevices//dev/random_1:8 (0.00s)
        dev_linux_test.go:45: for /dev/random Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/random Minor(0x0) == 0, want 8
        dev_linux_test.go:51: for /dev/random Mkdev(1, 8) == 0x108, want 0x0
    --- FAIL: TestDevices//dev/full_1:7 (0.00s)
        dev_linux_test.go:45: for /dev/full Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/full Minor(0x0) == 0, want 7
        dev_linux_test.go:51: for /dev/full Mkdev(1, 7) == 0x107, want 0x0
    --- FAIL: TestDevices//dev/urandom_1:9 (0.00s)
        dev_linux_test.go:45: for /dev/urandom Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/urandom Minor(0x0) == 0, want 9
        dev_linux_test.go:51: for /dev/urandom Mkdev(1, 9) == 0x109, want 0x0
"

So I think we'd better assign to them correct major/minor numbers following linux spec.

Signed-off-by: Wei Zhang <zhangwei198900@gmail.com>
Change-Id: I4521ee7884b4e214fd3a261929e3b6dac537ada9
PiperOrigin-RevId: 241609021
2019-04-02 14:51:07 -07:00
Andrei Vagin a4b34e2637 gvisor: convert ilist to ilist:generic_list
ilist:generic_list works faster (cl/240185278) and
the code looks cleaner without type casting.
PiperOrigin-RevId: 241381175
Change-Id: I8487ab1d73637b3e9733c253c56dce9e79f0d35f
2019-04-01 12:53:27 -07:00
Jamie Liu 69afd0438e Return srclen in proc.idMapFileOperations.Write.
PiperOrigin-RevId: 241037926
Change-Id: I4b0381ac1c7575e8b861291b068d3da22bc03850
2019-03-29 13:16:46 -07:00
Googler e373d3642e Internal change.
PiperOrigin-RevId: 240842801
Change-Id: Ibbd6f849f9613edc1b1dd7a99a97d1ecdb6e9188
2019-03-28 13:43:47 -07:00
Jamie Liu f005350c93 Clean up gofer handle caching.
- Document fsutil.CachedFileObject.FD() requirements on access
permissions, and change gofer.inodeFileState.FD() to honor them.
Fixes #147.

- Combine gofer.inodeFileState.readonly and
gofer.inodeFileState.readthrough, and simplify handle caching logic.

- Inline gofer.cachePolicy.cacheHandles into
gofer.inodeFileState.setSharedHandles, because users with access to
gofer.inodeFileState don't necessarily have access to the fs.Inode
(predictably, this is a save/restore problem).

Before this CL:

$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@34d51017ed67:/# /root/repro/runsc-b147
mmap: 0x7f3c01e45000
Segmentation fault

After this CL:

$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@d3c3cb56bbf9:/# /root/repro/runsc-b147
mmap: 0x7f78987ec000
o
PiperOrigin-RevId: 240818413
Change-Id: I49e1d4a81a0cb9177832b0a9f31a10da722a896b
2019-03-28 11:43:51 -07:00
Nicolas Lacasse 9c18897887 Add rsslim field in /proc/pid/stat.
PiperOrigin-RevId: 240681675
Change-Id: Ib214106e303669fca2d5c744ed5c18e835775161
2019-03-27 17:44:38 -07:00
Nicolas Lacasse 2d355f0e8f Add start time to /proc/<pid>/stat.
The start time is the number of clock ticks between the boot time and
application start time.

PiperOrigin-RevId: 240619475
Change-Id: Ic8bd7a73e36627ed563988864b0c551c052492a5
2019-03-27 12:41:27 -07:00
Nicolas Lacasse 645af7cdd8 Dev device methods should take pointer receiver.
PiperOrigin-RevId: 240600504
Change-Id: I7dd5f27c8da31f24b68b48acdf8f1c19dbd0c32d
2019-03-27 11:08:50 -07:00
Rahat Mahmood 06ec97a3f8 Implement memfd_create.
Memfds are simply anonymous tmpfs files with no associated
mounts. Also implementing file seals, which Linux only implements for
memfds at the moment.

PiperOrigin-RevId: 240450031
Change-Id: I31de78b950101ae8d7a13d0e93fe52d98ea06f2f
2019-03-26 16:16:57 -07:00
Jamie Liu f3723f8059 Call memmap.Mappable.Translate with more conservative usermem.AccessType.
MM.insertPMAsLocked() passes vma.maxPerms to memmap.Mappable.Translate
(although it unsets AccessType.Write if the vma is private). This
somewhat simplifies handling of pmas, since it means only COW-break
needs to replace existing pmas. However, it also means that a MAP_SHARED
mapping of a file opened O_RDWR dirties the file, regardless of the
mapping's permissions and whether or not the mapping is ever actually
written to with I/O that ignores permissions (e.g.
ptrace(PTRACE_POKEDATA)).

To fix this:

- Change the pma-getting path to request only the permissions that are
required for the calling access.

- Change memmap.Mappable.Translate to take requested permissions, and
return allowed permissions. This preserves the existing behavior in the
common cases where the memmap.Mappable isn't
fsutil.CachingInodeOperations and doesn't care if the translated
platform.File pages are written to.

- Change the MM.getPMAsLocked path to support permission upgrading of
pmas outside of copy-on-write.

PiperOrigin-RevId: 240196979
Change-Id: Ie0147c62c1fbc409467a6fa16269a413f3d7d571
2019-03-25 12:42:43 -07:00
Kevin Krakauer 0cd5f20044 Replace manual pty copies to/from userspace with safemem operations.
Also, changing queue.writeBuf from a buffer.Bytes to a [][]byte should reduce
copying and reallocating of slices.

PiperOrigin-RevId: 239713547
Change-Id: I6ee5ff19c3ee2662f1af5749cae7b73db0569e96
2019-03-21 18:05:07 -07:00
Andrei Vagin 87cce0ec08 netstack: reduce MSS from SYN to account tcp options
See: https://tools.ietf.org/html/rfc6691#section-2
PiperOrigin-RevId: 239305632
Change-Id: Ie8eb912a43332e6490045dc95570709c5b81855e
2019-03-19 17:33:20 -07:00