Commit Graph

1119 Commits

Author SHA1 Message Date
Kevin Krakauer c79e81bd27 Addresses data race in tty implementation.
Also makes the safemem reading and writing inline, as it makes it easier to see
what locks are held.

PiperOrigin-RevId: 241775201
Change-Id: Ib1072f246773ef2d08b5b9a042eb7e9e0284175c
2019-04-03 11:49:55 -07:00
Ian Lewis 77f01ee3c7 Add syscall annotations for unimplemented syscalls
Added syscall annotations for unimplemented syscalls for later generation into
reference docs. Annotations are of the form:
@Syscall(<name>, <key:value>, ...)

Supported args and values are:

- arg: A syscall option. This entry only applies to the syscall when given this
       option.
- support: Indicates support level
  - UNIMPLEMENTED: Unimplemented (implies returns:ENOSYS)
  - PARTIAL: Partial support. Details should be provided in note.
  - FULL: Full support
- returns: Indicates a known return value. Values are
           syscall errors. This is treated as a string so you can use something
           like "returns:EPERM or ENOSYS".
- issue: A Github issue number.
- note: A note

Example:
// @Syscall(mmap, arg:MAP_PRIVATE, support:FULL, note:Private memory fully supported)
// @Syscall(mmap, arg:MAP_SHARED, support:UNIMPLEMENTED, issue:123, note:Shared memory not supported)
// @Syscall(setxattr, returns:ENOTSUP, note:Requires file system support)

Annotations should be placed as close to their implementation as possible
(preferrably as part of a supporting function's Godoc) and should be updated as
syscall support changes.

PiperOrigin-RevId: 241697482
Change-Id: I7a846135db124e1271dc5057d788cba82ca312d4
2019-04-03 03:10:23 -07:00
Jamie Liu c4caccd540 Set options on the correct Task in PTRACE_SEIZE.
$ docker run --rm --runtime=runsc -it --cap-add=SYS_PTRACE debian bash -c "apt-get update && apt-get install strace && strace ls"
...
Setting up strace (4.15-2) ...
execve("/bin/ls", ["ls"], [/* 6 vars */]) = 0
brk(NULL)                               = 0x5646d8c1e000
uname({sysname="Linux", nodename="114ef93d2db3", ...}) = 0
...

PiperOrigin-RevId: 241643321
Change-Id: Ie4bce27a7fb147eef07bbae5895c6ef3f529e177
2019-04-02 18:13:19 -07:00
Kevin Krakauer 5c465603b6 Add build rule for raw socket tests so they are runnable via:
bazel test test/syscalls:raw_socket_ipv4_test_{native,runsc_ptrace,runsc_kvm}

PiperOrigin-RevId: 241640049
Change-Id: Iac4dbdd7fd1827399a472059ac7d85fb6b506577
2019-04-02 17:48:33 -07:00
Nicolas Lacasse 1776ab28f0 Add test that symlinking over a directory returns EEXIST.
Also remove comments in InodeOperations that required that implementation of
some Create* operations ensure that the name does not already exist, since
these checks are all centralized in the Dirent.

PiperOrigin-RevId: 241637335
Change-Id: Id098dc6063ff7c38347af29d1369075ad1e89a58
2019-04-02 17:28:36 -07:00
Kevin Krakauer f9431fb20f Remove obsolete TODO.
PiperOrigin-RevId: 241637164
Change-Id: I65476a739cf38f1818dc47f6ce60638dec8b77a8
2019-04-02 17:27:05 -07:00
Rahat Mahmood d14a7de658 Fix more data races in shm debug messages.
PiperOrigin-RevId: 241630409
Change-Id: Ie0df5f5a2f20c2d32e615f16e2ba43c88f963181
2019-04-02 16:46:32 -07:00
Wei Zhang 1fcd40719d device: fix device major/minor
Current gvisor doesn't give devices a right major and minor number.

When testing golang supporting of gvisor, I run the test case below:

```
$ docker run -ti --runtime runsc golang:1.12.1 bash -c "cd /usr/local/go/src && ./run.bash "
```

And it reports some errors, one of them is:

"--- FAIL: TestDevices (0.00s)
    --- FAIL: TestDevices//dev/null_1:3 (0.00s)
        dev_linux_test.go:45: for /dev/null Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/null Minor(0x0) == 0, want 3
        dev_linux_test.go:51: for /dev/null Mkdev(1, 3) == 0x103, want 0x0
    --- FAIL: TestDevices//dev/zero_1:5 (0.00s)
        dev_linux_test.go:45: for /dev/zero Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/zero Minor(0x0) == 0, want 5
        dev_linux_test.go:51: for /dev/zero Mkdev(1, 5) == 0x105, want 0x0
    --- FAIL: TestDevices//dev/random_1:8 (0.00s)
        dev_linux_test.go:45: for /dev/random Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/random Minor(0x0) == 0, want 8
        dev_linux_test.go:51: for /dev/random Mkdev(1, 8) == 0x108, want 0x0
    --- FAIL: TestDevices//dev/full_1:7 (0.00s)
        dev_linux_test.go:45: for /dev/full Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/full Minor(0x0) == 0, want 7
        dev_linux_test.go:51: for /dev/full Mkdev(1, 7) == 0x107, want 0x0
    --- FAIL: TestDevices//dev/urandom_1:9 (0.00s)
        dev_linux_test.go:45: for /dev/urandom Major(0x0) == 0, want 1
        dev_linux_test.go:48: for /dev/urandom Minor(0x0) == 0, want 9
        dev_linux_test.go:51: for /dev/urandom Mkdev(1, 9) == 0x109, want 0x0
"

So I think we'd better assign to them correct major/minor numbers following linux spec.

Signed-off-by: Wei Zhang <zhangwei198900@gmail.com>
Change-Id: I4521ee7884b4e214fd3a261929e3b6dac537ada9
PiperOrigin-RevId: 241609021
2019-04-02 14:51:07 -07:00
Kevin Krakauer a40ee4f4b8 Change bug number for duplicate bug.
PiperOrigin-RevId: 241567897
Change-Id: I580eac04f52bb15f4aab7df9822c4aa92e743021
2019-04-02 11:28:06 -07:00
Kevin Krakauer 52a51a8e20 Add a raw socket transport endpoint and use it for raw ICMP sockets.
Having raw socket code together will make it easier to add support for other raw
network protocols. Currently, only ICMP uses the raw endpoint. However, adding
support for other protocols such as UDP shouldn't be much more difficult than
adding a few switch cases.

PiperOrigin-RevId: 241564875
Change-Id: I77e03adafe4ce0fd29ba2d5dfdc547d2ae8f25bf
2019-04-02 11:13:49 -07:00
Fabricio Voznika 1df3fa6997 Automated rollback of changelist 240657604
PiperOrigin-RevId: 241434161
Change-Id: I9ec734e50cef5b39203e8bf37de2d91d24943f1e
2019-04-01 17:30:11 -07:00
Adin Scannell 7543e9ec20 Add release hook and version flag
PiperOrigin-RevId: 241421671
Change-Id: Ic0cebfe3efd458dc42c49f7f812c13318705199a
2019-04-01 16:18:43 -07:00
Rahat Mahmood 7cff746ef2 Save/restore simple devices.
We weren't saving simple devices' last allocated inode numbers, which
caused inode number reuse across S/R.

PiperOrigin-RevId: 241414245
Change-Id: I964289978841ef0a57d2fa48daf8eab7633c1284
2019-04-01 15:39:16 -07:00
Jamie Liu 1a02ba3e6e Trim trailing newline when reading /proc/[pid]/{uid,gid}_map in test.
This reveals a bug in the tests that require CAP_SET{UID,GID}: After the
child process enters the new user namespace, it ceases to have the
relevant capability in the parent user namespace, so the privileged
write must be done by the parent process. Change tests accordingly.

PiperOrigin-RevId: 241412765
Change-Id: I587c1f24aa6f2180fb2e5e5c0162691ba5bac1bc
2019-04-01 15:31:37 -07:00
Liu Hua 33c644bc0b gofer: ignore unsupported files
'ls' will hang if there is any FIFO in this path. So
return EPERM if unsupported file occurs and add NONBLOCK flag
when opening file to avoid blocking on FIFO read.

Signed-off-by: Liu Hua <sdu.liu@huawei.com>
Change-Id: I8b9a2a48322118d8ad531dd226395438123eb047
PiperOrigin-RevId: 241406726
2019-04-01 15:01:53 -07:00
Jamie Liu b4006686d2 Don't expand COW-break on executable VMAs.
PiperOrigin-RevId: 241403847
Change-Id: I4631ca05734142da6e80cdfa1a1d63ed68aa05cc
2019-04-01 14:47:31 -07:00
Andrei Vagin a4b34e2637 gvisor: convert ilist to ilist:generic_list
ilist:generic_list works faster (cl/240185278) and
the code looks cleaner without type casting.
PiperOrigin-RevId: 241381175
Change-Id: I8487ab1d73637b3e9733c253c56dce9e79f0d35f
2019-04-01 12:53:27 -07:00
Googler 0327931ca4 Internal change.
PiperOrigin-RevId: 241350917
Change-Id: Ieacaa9ce2e41e22f1bae8900170879f549606782
2019-04-01 10:29:20 -07:00
Jamie Liu 60efd53822 Fix MemfdTest_OtherProcessCanOpenFromProcfs.
- Make the body of InForkedProcess async-signal-safe.

- Pass the correct path to open().

PiperOrigin-RevId: 241348774
Change-Id: I753dfa36e4fb05521e659c173e3b7db0c7fc159b
2019-04-01 10:18:36 -07:00
Andrei Vagin a046054ba3 gvisor/runsc: enable generic segmentation offload (GSO)
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 241072403
Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
2019-03-29 16:27:38 -07:00
Jamie Liu 26e8d9981f Use kernel.Task.CopyScratchBuffer in syscalls/linux where possible.
PiperOrigin-RevId: 241072126
Change-Id: Ib4d9f58f550732ac4c5153d3cf159a5b1a9749da
2019-03-29 16:25:33 -07:00
Nicolas Lacasse dcf6613331 Set container.CreatedAt in Create().
PiperOrigin-RevId: 241056805
Change-Id: I13ea8f5dbfb01ca02a3b0ab887b8c3bdf4d556a6
2019-03-29 14:55:22 -07:00
Nicolas Lacasse e8fef3d873 Treat fsync errors during save as SaveRejection errors.
PiperOrigin-RevId: 241055485
Change-Id: I70259e9fef59bdf9733b35a2cd3319359449dd45
2019-03-29 14:48:16 -07:00
Michael Pratt d11ef20a93 Drop reference on shared anon mappable
We call NewSharedAnonMappable simply to use it for Mappable/MappingIdentity for
shared anon mmap. From MMapOpts.MappingIdentity: "If MMapOpts is used to
successfully create a memory mapping, a reference is taken on MappingIdentity."

mm.createVMALocked (below) takes this additional reference, so we don't need
the reference returned by NewSharedAnonMappable. Holding it leaks the mappable.

PiperOrigin-RevId: 241038108
Change-Id: I78ee3af78e0cc7aac4063b274b30d0e41eb5677d
2019-03-29 13:17:56 -07:00
Jamie Liu 69afd0438e Return srclen in proc.idMapFileOperations.Write.
PiperOrigin-RevId: 241037926
Change-Id: I4b0381ac1c7575e8b861291b068d3da22bc03850
2019-03-29 13:16:46 -07:00
Nicolas Lacasse ed23f54709 Treat ENOSPC as a state-file error during save.
PiperOrigin-RevId: 241028806
Change-Id: I770bf751a2740869a93c3ab50370a727ae580470
2019-03-29 12:26:25 -07:00
Bhasker Hariharan 45c54b1f4e Fix incorrect checksums in TCP and UDP tests.
PiperOrigin-RevId: 241025361
Change-Id: I292e7aea9a4b294b11e4f736e107010d9524586b
2019-03-29 12:05:43 -07:00
Bhasker Hariharan cc0e96a4bd Fix Panic in SACKScoreboard.Delete.
The panic was caused by modifying the tree while iterating which invalidated the
iterator.

Also fixes another bug in SACKScoreboard.Insert() which was causing blocks to be
merged incorrectly.

PiperOrigin-RevId: 240895053
Change-Id: Ia72b8244297962df5c04283346da5226434740af
2019-03-28 18:18:39 -07:00
chris.zn 31c2236e97 set task's name when fork
When fork a child process, the name filed of TaskContext is not set.
It results in that when we cat /proc/{pid}/status, the name filed is
null.

Like this:
Name:
State:  S (sleeping)
Tgid:   28
Pid:    28
PPid:   26
TracerPid:      0
FDSize: 8
VmSize: 89712 kB
VmRSS:  6648 kB
Threads:        1
CapInh: 00000000a93d35fb
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 00000000a93d35fb
Seccomp:        0
Change-Id: I5d469098c37cedd19da16b7ffab2e546a28a321e
PiperOrigin-RevId: 240893304
2019-03-28 18:05:42 -07:00
Nicolas Lacasse 99195b0e16 Setting timestamps should trigger an inotify event.
PiperOrigin-RevId: 240850187
Change-Id: I1458581b771a1031e47bba439e480829794927b8
2019-03-28 14:15:23 -07:00
Bert Muthalaly f2e5dcf21c Add ICMP stats
PiperOrigin-RevId: 240848882
Change-Id: I23dd4599f073263437aeab357c3f767e1a432b82
2019-03-28 14:09:20 -07:00
Googler e373d3642e Internal change.
PiperOrigin-RevId: 240842801
Change-Id: Ibbd6f849f9613edc1b1dd7a99a97d1ecdb6e9188
2019-03-28 13:43:47 -07:00
Jamie Liu f005350c93 Clean up gofer handle caching.
- Document fsutil.CachedFileObject.FD() requirements on access
permissions, and change gofer.inodeFileState.FD() to honor them.
Fixes #147.

- Combine gofer.inodeFileState.readonly and
gofer.inodeFileState.readthrough, and simplify handle caching logic.

- Inline gofer.cachePolicy.cacheHandles into
gofer.inodeFileState.setSharedHandles, because users with access to
gofer.inodeFileState don't necessarily have access to the fs.Inode
(predictably, this is a save/restore problem).

Before this CL:

$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@34d51017ed67:/# /root/repro/runsc-b147
mmap: 0x7f3c01e45000
Segmentation fault

After this CL:

$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@d3c3cb56bbf9:/# /root/repro/runsc-b147
mmap: 0x7f78987ec000
o
PiperOrigin-RevId: 240818413
Change-Id: I49e1d4a81a0cb9177832b0a9f31a10da722a896b
2019-03-28 11:43:51 -07:00
Liu Hua 1d7e2bc377 gofer: some fixs in setupRootFS
1.use root instead of spec.Root.path as mountpoint
2.put remount readonly logic ahead to avoid device busy errors

Signed-off-by: Liu Hua <sdu.liu@huawei.com>
Change-Id: I9222b4695f917136a97b0898ac6f75fcff296e5d
PiperOrigin-RevId: 240818182
2019-03-28 11:42:41 -07:00
Andrei Vagin f4105ac21a netstack/fdbased: add generic segmentation offload (GSO) support
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-28 11:03:41 -07:00
Nicolas Lacasse 9c18897887 Add rsslim field in /proc/pid/stat.
PiperOrigin-RevId: 240681675
Change-Id: Ib214106e303669fca2d5c744ed5c18e835775161
2019-03-27 17:44:38 -07:00
Fabricio Voznika 6cb0b1881a Automated rollback of changelist 240502097
PiperOrigin-RevId: 240657604
Change-Id: Ida15dee83337867c560427eae0b4b9ce1051dbb8
2019-03-27 15:46:49 -07:00
Tamir Duberstein 8406504817 Avoid mutating memory passed to DeliverTransportPacket
PiperOrigin-RevId: 240642903
Change-Id: I16625015123a827d267d60b328a202057264bbd6
2019-03-27 14:36:57 -07:00
Nicolas Lacasse 2d355f0e8f Add start time to /proc/<pid>/stat.
The start time is the number of clock ticks between the boot time and
application start time.

PiperOrigin-RevId: 240619475
Change-Id: Ic8bd7a73e36627ed563988864b0c551c052492a5
2019-03-27 12:41:27 -07:00
Andrei Vagin 5d94c893ae gvisor/runsc: address typos from github
Fixes: https://github.com/google/gvisor/issues/143
Fixes #143
PiperOrigin-RevId: 240600719
Change-Id: Id1731b9969f98e32e52e144a6643e12b0b70f168
2019-03-27 11:10:15 -07:00
Nicolas Lacasse 645af7cdd8 Dev device methods should take pointer receiver.
PiperOrigin-RevId: 240600504
Change-Id: I7dd5f27c8da31f24b68b48acdf8f1c19dbd0c32d
2019-03-27 11:08:50 -07:00
Googler 66181f3de9 Add //tools/cpp:cc_flags to the toolchains attribute.
This is so that CC_FLAGS will be resolved properly.

After the --incompatible_disable_genrule_cc_toolchain_dependency flag is
flipped, Bazel will no longer be providing CC_FLAGS to genrule by default.

PiperOrigin-RevId: 240595715
Change-Id: I067334051e89f7ec006a6b6b3d2f4188911ac2db
2019-03-27 10:50:02 -07:00
Jamie Liu 26583e413e Convert []byte to string without copying in usermem.CopyStringIn.
This is the same technique used by Go's strings.Builder
(https://golang.org/src/strings/builder.go#L45), and for the same
reason. (We can't just use strings.Builder because there's no way to get
the underlying []byte to pass to usermem.IO.CopyIn.)

PiperOrigin-RevId: 240594892
Change-Id: Ic070e7e480aee53a71289c7c120850991358c52c
2019-03-27 10:46:28 -07:00
Fabricio Voznika beb71ab681 Merge fsgofer 'controlFile' and 'openedFile'
This reduces the number of FDs used for writable files.

#149

PiperOrigin-RevId: 240502097
Change-Id: Ib44489f65bce23dd1a995f620d69e65dce003f7c
2019-03-26 23:44:34 -07:00
Tamir Duberstein 9c20a88bd7 Remove polling from ICMP test
PiperOrigin-RevId: 240483396
Change-Id: Ie75d3ae38af83f1d92f167ff9ba58fa10f5b372b
2019-03-26 20:20:52 -07:00
Michael Pratt e9152d4a62 Automated rollback of changelist 234892473
PiperOrigin-RevId: 240462667
Change-Id: I3d1c5c0d80a3badced963ae1d450c20ed8a767ed
2019-03-26 17:27:48 -07:00
Andrei Vagin 654e878abb netstack: Don't exclude length when a pseudo-header checksum is calculated
This is a preparation for GSO changes (cl/234508902).

RELNOTES[gofers]: Refactor checksum code to include length, which
it already did, but in a convoluted way. Should be a no-op.

PiperOrigin-RevId: 240460794
Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2
2019-03-26 17:15:13 -07:00
Rahat Mahmood 06ec97a3f8 Implement memfd_create.
Memfds are simply anonymous tmpfs files with no associated
mounts. Also implementing file seals, which Linux only implements for
memfds at the moment.

PiperOrigin-RevId: 240450031
Change-Id: I31de78b950101ae8d7a13d0e93fe52d98ea06f2f
2019-03-26 16:16:57 -07:00
Andrei Vagin 79aca14a0c Use toolchain configs from bazel_0.23.0
bazel 0.24.0 isn't compatible with bazel_0.20.0 configs:
(10:32:27) ERROR:
bazel_toolchains/configs/ubuntu16_04_clang/1.1/bazel_0.20.0/default/BUILD:57:1:
no such attribute 'dynamic_runtime_libs' in 'cc_toolchain' rule

PiperOrigin-RevId: 240436868
Change-Id: Iee68c9b79d907ca2bdd124386aaa77c786e089ce
2019-03-26 15:10:49 -07:00
Tamir Duberstein 9cd2b66f10 Remove echoReplier
Mirror the ICMPv6 echo implementation in ICMPv4 echo. This removes
unnecessary asynchrony, reduces copying, and reduces complexity.

PiperOrigin-RevId: 240394525
Change-Id: If8f53254154f86772f5e51159765aa23b3b328b8
2019-03-26 11:45:01 -07:00