Commit Graph

1106 Commits

Author SHA1 Message Date
Jamie Liu 1a02ba3e6e Trim trailing newline when reading /proc/[pid]/{uid,gid}_map in test.
This reveals a bug in the tests that require CAP_SET{UID,GID}: After the
child process enters the new user namespace, it ceases to have the
relevant capability in the parent user namespace, so the privileged
write must be done by the parent process. Change tests accordingly.

PiperOrigin-RevId: 241412765
Change-Id: I587c1f24aa6f2180fb2e5e5c0162691ba5bac1bc
2019-04-01 15:31:37 -07:00
Liu Hua 33c644bc0b gofer: ignore unsupported files
'ls' hangs if there is any FIFO in the path. Return EPERM when an
unsupported file is encountered, and add the NONBLOCK flag when
opening files to avoid blocking on FIFO reads.

Signed-off-by: Liu Hua <sdu.liu@huawei.com>
Change-Id: I8b9a2a48322118d8ad531dd226395438123eb047
PiperOrigin-RevId: 241406726
2019-04-01 15:01:53 -07:00
Jamie Liu b4006686d2 Don't expand COW-break on executable VMAs.
PiperOrigin-RevId: 241403847
Change-Id: I4631ca05734142da6e80cdfa1a1d63ed68aa05cc
2019-04-01 14:47:31 -07:00
Andrei Vagin a4b34e2637 gvisor: convert ilist to ilist:generic_list
ilist:generic_list works faster (cl/240185278) and
the code looks cleaner without type casting.
PiperOrigin-RevId: 241381175
Change-Id: I8487ab1d73637b3e9733c253c56dce9e79f0d35f
2019-04-01 12:53:27 -07:00
Googler 0327931ca4 Internal change.
PiperOrigin-RevId: 241350917
Change-Id: Ieacaa9ce2e41e22f1bae8900170879f549606782
2019-04-01 10:29:20 -07:00
Jamie Liu 60efd53822 Fix MemfdTest_OtherProcessCanOpenFromProcfs.
- Make the body of InForkedProcess async-signal-safe.

- Pass the correct path to open().

PiperOrigin-RevId: 241348774
Change-Id: I753dfa36e4fb05521e659c173e3b7db0c7fc159b
2019-04-01 10:18:36 -07:00
Andrei Vagin a046054ba3 gvisor/runsc: enable generic segmentation offload (GSO)
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 241072403
Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
2019-03-29 16:27:38 -07:00
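
As a rough illustration of why segmenting to 64K rather than the MTU may help, the sketch below counts how many packets are needed for a given payload at each segment size. The helper `packetsNeeded` and the 1 MiB payload are assumptions chosen for the example; header overhead is ignored.

```go
package main

import "fmt"

// packetsNeeded returns how many packets are required to carry payloadBytes
// when each packet holds at most maxPacketBytes (header overhead ignored).
func packetsNeeded(payloadBytes, maxPacketBytes int) int {
	return (payloadBytes + maxPacketBytes - 1) / maxPacketBytes
}

func main() {
	const payload = 1 << 20 // 1 MiB, an arbitrary illustrative size
	fmt.Println("MTU 1500:", packetsNeeded(payload, 1500))
	fmt.Println("GSO 64K: ", packetsNeeded(payload, 65536))
}
```

Fewer, larger packets mean fewer per-packet operations on the path to the host, which is presumably where the throughput gain in the numbers above comes from.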
Jamie Liu 26e8d9981f Use kernel.Task.CopyScratchBuffer in syscalls/linux where possible.
PiperOrigin-RevId: 241072126
Change-Id: Ib4d9f58f550732ac4c5153d3cf159a5b1a9749da
2019-03-29 16:25:33 -07:00
Nicolas Lacasse dcf6613331 Set container.CreatedAt in Create().
PiperOrigin-RevId: 241056805
Change-Id: I13ea8f5dbfb01ca02a3b0ab887b8c3bdf4d556a6
2019-03-29 14:55:22 -07:00
Nicolas Lacasse e8fef3d873 Treat fsync errors during save as SaveRejection errors.
PiperOrigin-RevId: 241055485
Change-Id: I70259e9fef59bdf9733b35a2cd3319359449dd45
2019-03-29 14:48:16 -07:00
Michael Pratt d11ef20a93 Drop reference on shared anon mappable
We call NewSharedAnonMappable simply to use it for Mappable/MappingIdentity for
shared anon mmap. From MMapOpts.MappingIdentity: "If MMapOpts is used to
successfully create a memory mapping, a reference is taken on MappingIdentity."

mm.createVMALocked (below) takes this additional reference, so we don't need
the reference returned by NewSharedAnonMappable. Holding it leaks the mappable.

PiperOrigin-RevId: 241038108
Change-Id: I78ee3af78e0cc7aac4063b274b30d0e41eb5677d
2019-03-29 13:17:56 -07:00
Jamie Liu 69afd0438e Return srclen in proc.idMapFileOperations.Write.
PiperOrigin-RevId: 241037926
Change-Id: I4b0381ac1c7575e8b861291b068d3da22bc03850
2019-03-29 13:16:46 -07:00
Nicolas Lacasse ed23f54709 Treat ENOSPC as a state-file error during save.
PiperOrigin-RevId: 241028806
Change-Id: I770bf751a2740869a93c3ab50370a727ae580470
2019-03-29 12:26:25 -07:00
Bhasker Hariharan 45c54b1f4e Fix incorrect checksums in TCP and UDP tests.
PiperOrigin-RevId: 241025361
Change-Id: I292e7aea9a4b294b11e4f736e107010d9524586b
2019-03-29 12:05:43 -07:00
Bhasker Hariharan cc0e96a4bd Fix Panic in SACKScoreboard.Delete.
The panic was caused by modifying the tree while iterating which invalidated the
iterator.

Also fixes another bug in SACKScoreboard.Insert() which was causing blocks to be
merged incorrectly.

PiperOrigin-RevId: 240895053
Change-Id: Ia72b8244297962df5c04283346da5226434740af
2019-03-28 18:18:39 -07:00
chris.zn 31c2236e97 set task's name when fork
When forking a child process, the name field of TaskContext is not set.
As a result, when we cat /proc/{pid}/status, the Name field is
empty.

Like this:
Name:
State:  S (sleeping)
Tgid:   28
Pid:    28
PPid:   26
TracerPid:      0
FDSize: 8
VmSize: 89712 kB
VmRSS:  6648 kB
Threads:        1
CapInh: 00000000a93d35fb
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 00000000a93d35fb
Seccomp:        0
Change-Id: I5d469098c37cedd19da16b7ffab2e546a28a321e
PiperOrigin-RevId: 240893304
2019-03-28 18:05:42 -07:00
Nicolas Lacasse 99195b0e16 Setting timestamps should trigger an inotify event.
PiperOrigin-RevId: 240850187
Change-Id: I1458581b771a1031e47bba439e480829794927b8
2019-03-28 14:15:23 -07:00
Bert Muthalaly f2e5dcf21c Add ICMP stats
PiperOrigin-RevId: 240848882
Change-Id: I23dd4599f073263437aeab357c3f767e1a432b82
2019-03-28 14:09:20 -07:00
Googler e373d3642e Internal change.
PiperOrigin-RevId: 240842801
Change-Id: Ibbd6f849f9613edc1b1dd7a99a97d1ecdb6e9188
2019-03-28 13:43:47 -07:00
Jamie Liu f005350c93 Clean up gofer handle caching.
- Document fsutil.CachedFileObject.FD() requirements on access
permissions, and change gofer.inodeFileState.FD() to honor them.
Fixes #147.

- Combine gofer.inodeFileState.readonly and
gofer.inodeFileState.readthrough, and simplify handle caching logic.

- Inline gofer.cachePolicy.cacheHandles into
gofer.inodeFileState.setSharedHandles, because users with access to
gofer.inodeFileState don't necessarily have access to the fs.Inode
(predictably, this is a save/restore problem).

Before this CL:

$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@34d51017ed67:/# /root/repro/runsc-b147
mmap: 0x7f3c01e45000
Segmentation fault

After this CL:

$ docker run --runtime=runsc-d -v $(pwd)/gvisor/repro:/root/repro -it ubuntu bash
root@d3c3cb56bbf9:/# /root/repro/runsc-b147
mmap: 0x7f78987ec000
o
PiperOrigin-RevId: 240818413
Change-Id: I49e1d4a81a0cb9177832b0a9f31a10da722a896b
2019-03-28 11:43:51 -07:00
Liu Hua 1d7e2bc377 gofer: some fixes in setupRootFS
1. Use root instead of spec.Root.path as the mountpoint.
2. Move the remount-readonly logic earlier to avoid device-busy errors.

Signed-off-by: Liu Hua <sdu.liu@huawei.com>
Change-Id: I9222b4695f917136a97b0898ac6f75fcff296e5d
PiperOrigin-RevId: 240818182
2019-03-28 11:42:41 -07:00
Andrei Vagin f4105ac21a netstack/fdbased: add generic segmentation offload (GSO) support
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 240809103
Change-Id: I2637f104db28b5d4c64e1e766c610162a195775a
2019-03-28 11:03:41 -07:00
Nicolas Lacasse 9c18897887 Add rsslim field in /proc/pid/stat.
PiperOrigin-RevId: 240681675
Change-Id: Ib214106e303669fca2d5c744ed5c18e835775161
2019-03-27 17:44:38 -07:00
Fabricio Voznika 6cb0b1881a Automated rollback of changelist 240502097
PiperOrigin-RevId: 240657604
Change-Id: Ida15dee83337867c560427eae0b4b9ce1051dbb8
2019-03-27 15:46:49 -07:00
Tamir Duberstein 8406504817 Avoid mutating memory passed to DeliverTransportPacket
PiperOrigin-RevId: 240642903
Change-Id: I16625015123a827d267d60b328a202057264bbd6
2019-03-27 14:36:57 -07:00
Nicolas Lacasse 2d355f0e8f Add start time to /proc/<pid>/stat.
The start time is the number of clock ticks between the boot time and
application start time.

PiperOrigin-RevId: 240619475
Change-Id: Ic8bd7a73e36627ed563988864b0c551c052492a5
2019-03-27 12:41:27 -07:00
Andrei Vagin 5d94c893ae gvisor/runsc: address typos from github
Fixes: https://github.com/google/gvisor/issues/143
Fixes #143
PiperOrigin-RevId: 240600719
Change-Id: Id1731b9969f98e32e52e144a6643e12b0b70f168
2019-03-27 11:10:15 -07:00
Nicolas Lacasse 645af7cdd8 Dev device methods should take pointer receiver.
PiperOrigin-RevId: 240600504
Change-Id: I7dd5f27c8da31f24b68b48acdf8f1c19dbd0c32d
2019-03-27 11:08:50 -07:00
Googler 66181f3de9 Add //tools/cpp:cc_flags to the toolchains attribute.
This is so that CC_FLAGS will be resolved properly.

After the --incompatible_disable_genrule_cc_toolchain_dependency flag is
flipped, Bazel will no longer be providing CC_FLAGS to genrule by default.

PiperOrigin-RevId: 240595715
Change-Id: I067334051e89f7ec006a6b6b3d2f4188911ac2db
2019-03-27 10:50:02 -07:00
Jamie Liu 26583e413e Convert []byte to string without copying in usermem.CopyStringIn.
This is the same technique used by Go's strings.Builder
(https://golang.org/src/strings/builder.go#L45), and for the same
reason. (We can't just use strings.Builder because there's no way to get
the underlying []byte to pass to usermem.IO.CopyIn.)

PiperOrigin-RevId: 240594892
Change-Id: Ic070e7e480aee53a71289c7c120850991358c52c
2019-03-27 10:46:28 -07:00
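
The conversion technique referenced in 26583e413e can be sketched as follows. This is a minimal standalone illustration, not the code from usermem.CopyStringIn; the helper name `bytesToString` is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying the bytes.
// The slice header's pointer and length fields line up with the string
// header, so the cast yields a string sharing b's backing array.
// The caller must not modify b afterwards, because Go code assumes
// strings are immutable.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte("hello")
	s := bytesToString(buf)
	fmt.Println(s, len(s))
}
```

The trade-off is that safety now depends on the caller: the byte slice must be treated as read-only once the string exists.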
Fabricio Voznika beb71ab681 Merge fsgofer 'controlFile' and 'openedFile'
This reduces the number of FDs used for writable files.

#149

PiperOrigin-RevId: 240502097
Change-Id: Ib44489f65bce23dd1a995f620d69e65dce003f7c
2019-03-26 23:44:34 -07:00
Tamir Duberstein 9c20a88bd7 Remove polling from ICMP test
PiperOrigin-RevId: 240483396
Change-Id: Ie75d3ae38af83f1d92f167ff9ba58fa10f5b372b
2019-03-26 20:20:52 -07:00
Michael Pratt e9152d4a62 Automated rollback of changelist 234892473
PiperOrigin-RevId: 240462667
Change-Id: I3d1c5c0d80a3badced963ae1d450c20ed8a767ed
2019-03-26 17:27:48 -07:00
Andrei Vagin 654e878abb netstack: Don't exclude length when a pseudo-header checksum is calculated
This is a preparation for GSO changes (cl/234508902).

RELNOTES[gofers]: Refactor checksum code to include length, which
it already did, but in a convoluted way. Should be a no-op.

PiperOrigin-RevId: 240460794
Change-Id: I537381bc670b5a9f5d70a87aa3eb7252e8f5ace2
2019-03-26 17:15:13 -07:00
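
For context on 654e878abb, the sketch below computes an IPv4 pseudo-header checksum with the transport length included alongside the addresses and protocol number. It is a standalone illustration rather than netstack's code; the function names `checksum` and `pseudoHeaderChecksum` are chosen for this example.

```go
package main

import "fmt"

// checksum adds the 16-bit big-endian words of buf to initial using
// one's complement addition and returns the folded (not inverted) sum.
func checksum(buf []byte, initial uint16) uint16 {
	sum := uint32(initial)
	for i := 0; i+1 < len(buf); i += 2 {
		sum += uint32(buf[i])<<8 | uint32(buf[i+1])
	}
	if len(buf)%2 == 1 {
		// Pad a trailing odd byte with a zero low byte.
		sum += uint32(buf[len(buf)-1]) << 8
	}
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}

// pseudoHeaderChecksum sums the IPv4 pseudo-header fields: source address,
// destination address, a zero byte plus protocol, and the transport length.
func pseudoHeaderChecksum(proto uint8, src, dst [4]byte, totalLen uint16) uint16 {
	sum := checksum(src[:], 0)
	sum = checksum(dst[:], sum)
	sum = checksum([]byte{0, proto}, sum)
	return checksum([]byte{byte(totalLen >> 8), byte(totalLen)}, sum)
}

func main() {
	src := [4]byte{10, 0, 0, 1}
	dst := [4]byte{10, 0, 0, 2}
	fmt.Printf("%#04x\n", pseudoHeaderChecksum(6, src, dst, 20))
}
```

The result can then be used as the initial value when checksumming the transport header and payload.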
Rahat Mahmood 06ec97a3f8 Implement memfd_create.
Memfds are simply anonymous tmpfs files with no associated
mounts. Also implementing file seals, which Linux only implements for
memfds at the moment.

PiperOrigin-RevId: 240450031
Change-Id: I31de78b950101ae8d7a13d0e93fe52d98ea06f2f
2019-03-26 16:16:57 -07:00
Andrei Vagin 79aca14a0c Use toolchain configs from bazel_0.23.0
bazel 0.24.0 isn't compatible with bazel_0.20.0 configs:
(10:32:27) ERROR:
bazel_toolchains/configs/ubuntu16_04_clang/1.1/bazel_0.20.0/default/BUILD:57:1:
no such attribute 'dynamic_runtime_libs' in 'cc_toolchain' rule

PiperOrigin-RevId: 240436868
Change-Id: Iee68c9b79d907ca2bdd124386aaa77c786e089ce
2019-03-26 15:10:49 -07:00
Tamir Duberstein 9cd2b66f10 Remove echoReplier
Mirror the ICMPv6 echo implementation in ICMPv4 echo. This removes
unnecessary asynchrony, reduces copying, and reduces complexity.

PiperOrigin-RevId: 240394525
Change-Id: If8f53254154f86772f5e51159765aa23b3b328b8
2019-03-26 11:45:01 -07:00
Tamir Duberstein 23a5306b5c Resolve stringer TODO
PiperOrigin-RevId: 240224782
Change-Id: Iab4e4e7047b2d022f15e807c2348685d8e972020
2019-03-25 14:59:58 -07:00
Jamie Liu f3723f8059 Call memmap.Mappable.Translate with more conservative usermem.AccessType.
MM.insertPMAsLocked() passes vma.maxPerms to memmap.Mappable.Translate
(although it unsets AccessType.Write if the vma is private). This
somewhat simplifies handling of pmas, since it means only COW-break
needs to replace existing pmas. However, it also means that a MAP_SHARED
mapping of a file opened O_RDWR dirties the file, regardless of the
mapping's permissions and whether or not the mapping is ever actually
written to with I/O that ignores permissions (e.g.
ptrace(PTRACE_POKEDATA)).

To fix this:

- Change the pma-getting path to request only the permissions that are
required for the calling access.

- Change memmap.Mappable.Translate to take requested permissions, and
return allowed permissions. This preserves the existing behavior in the
common cases where the memmap.Mappable isn't
fsutil.CachingInodeOperations and doesn't care if the translated
platform.File pages are written to.

- Change the MM.getPMAsLocked path to support permission upgrading of
pmas outside of copy-on-write.

PiperOrigin-RevId: 240196979
Change-Id: Ie0147c62c1fbc409467a6fa16269a413f3d7d571
2019-03-25 12:42:43 -07:00
Andrei Vagin ddc05e3053 epoll: use ilist:generic_list instead of ilist:ilist
ilist:generic_list works faster than ilist:ilist.

Here is a benchmark test to measure performance of epoll_wait, when readyList
isn't empty. It shows about 30% better performance with these changes.

Benchmark           Time(ns)        CPU(ns)     Iterations
Before:
BM_EpollAllEvents      46725          46899          14286

After:
BM_EpollAllEvents      33167          33300          18919
PiperOrigin-RevId: 240185278
Change-Id: I3e33f9b214db13ab840b91613400525de5b58d18
2019-03-25 11:41:50 -07:00
Nicolas Lacasse b81bfd6013 lstat should resolve the final path component if it ends in a slash.
PiperOrigin-RevId: 239896221
Change-Id: I0949981fe50c57131c5631cdeb10b225648575c0
2019-03-22 17:38:13 -07:00
Jamie Liu 3d0b960112 Implement PTRACE_SEIZE, PTRACE_INTERRUPT, and PTRACE_LISTEN.
PiperOrigin-RevId: 239803092
Change-Id: I42d612ed6a889e011e8474538958c6de90c6fcab
2019-03-22 08:55:44 -07:00
Yong He 45ba52f824 Allow BP and OF to be triggered from user space
Change the DPL from 0 to 3 for Breakpoint and Overflow,
so that user space can trigger Breakpoint and Overflow
as expected.

Change-Id: Ibead65fb8c98b32b7737f316db93b3a8d9dcd648
PiperOrigin-RevId: 239736648
2019-03-21 22:04:50 -07:00
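
To illustrate the DPL change in 45ba52f824, the sketch below builds the type/attribute byte of an x86 IDT gate descriptor for DPL 0 and DPL 3. The helper `gateAttr` and the use of an interrupt gate type (0xE) are assumptions for the example; the commit does not say which gate type gVisor uses.

```go
package main

import "fmt"

const (
	gatePresent       = 1 << 7 // P bit: descriptor is present
	gateTypeInterrupt = 0xE    // interrupt gate type (assumed for this sketch)
)

// gateAttr builds the type/attribute byte of an IDT gate descriptor:
// bit 7 = present, bits 6-5 = DPL, bit 4 = 0, bits 3-0 = gate type.
func gateAttr(dpl uint8) uint8 {
	return gatePresent | (dpl&0x3)<<5 | gateTypeInterrupt
}

func main() {
	fmt.Printf("DPL 0: %#x\n", gateAttr(0))
	fmt.Printf("DPL 3: %#x\n", gateAttr(3))
}
```

For software-generated interrupts the CPU compares the gate's DPL with the current privilege level, so a gate with DPL 0 faults when invoked from ring 3, while DPL 3 lets user code reach the handler.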
Ian Gudger 7d0227ff16 Add test for short recvmsg iovec length.
PiperOrigin-RevId: 239718991
Change-Id: Idc78557a8e9bfdd3cb7d8ec4db708364652640a4
2019-03-21 18:53:17 -07:00
Ian Gudger 125d3a19e3 Test TCP sockets with MSG_TRUNC|MSG_PEEK.
PiperOrigin-RevId: 239714368
Change-Id: I35860b880a1d8885eb8c2d4ff267caaf72d91088
2019-03-21 18:11:22 -07:00
Kevin Krakauer 0cd5f20044 Replace manual pty copies to/from userspace with safemem operations.
Also, changing queue.writeBuf from a buffer.Bytes to a [][]byte should reduce
copying and reallocating of slices.

PiperOrigin-RevId: 239713547
Change-Id: I6ee5ff19c3ee2662f1af5749cae7b73db0569e96
2019-03-21 18:05:07 -07:00
Ian Gudger ba828233b9 Clear msghdr flags on successful recvmsg.
.NET sets these flags to -1 and then uses their result, expecting it to be
zero.

Does not set actual flags (e.g. MSG_TRUNC), but setting to zero is more correct
than what we did before.

PiperOrigin-RevId: 239657951
Change-Id: I89c5f84bc9b94a2cd8ff84e8ecfea09e01142030
2019-03-21 13:19:11 -07:00
Kevin Krakauer ba937d74f9 Address typos from github.
https://github.com/google/gvisor/pull/132

PiperOrigin-RevId: 239641377
Change-Id: I7ba6b57730800cc98496c83cb643e70ec902ed3d
2019-03-21 11:54:10 -07:00
Andrei Vagin 064fda1a75 gvisor: don't allocate a new credential object on fork
A credential object is immutable, so we don't need to copy it for a new
task.

PiperOrigin-RevId: 239519266
Change-Id: I0632f641fdea9554779ac25d84bee4231d0d18f2
2019-03-20 18:41:00 -07:00
Rahat Mahmood 81f4829d11 Record sockets created during accept(2) for all families.
Track new sockets created during accept(2) in the socket table for all
families. Previously we were only doing this for unix domain sockets.

PiperOrigin-RevId: 239475550
Change-Id: I16f009f24a06245bfd1d72ffd2175200f837c6ac
2019-03-20 14:31:16 -07:00