Commit Graph

27 Commits

Author SHA1 Message Date
Bhasker Hariharan ae15d90436 FIFO QDisc implementation
Updates #231

PiperOrigin-RevId: 309323808
2020-04-30 16:41:00 -07:00
gVisor bot 4a73bae269 Initial network namespace support.
TCP/IP will work with netstack networking. hostinet doesn't work, and sockets
will have the same behavior as it is now.

Before the userspace is able to create device, the default loopback device can
be used to test.

/proc/net and /sys/net will still be connected to the root network stack; this
is the same behavior now.

Issue #1833

PiperOrigin-RevId: 296309389
2020-02-20 15:20:40 -08:00
gVisor bot b8e22e241c Disallow duplicate NIC names.
PiperOrigin-RevId: 294500858
2020-02-11 12:59:11 -08:00
Bhasker Hariharan f874723e64 Bump SO_SNDBUF for fdbased endpoint used by runsc.
Updates #231

PiperOrigin-RevId: 289897881
2020-01-15 11:19:06 -08:00
Bhasker Hariharan b9aa62b9f9 Enable IPv6 in runsc
Fixes #1341

PiperOrigin-RevId: 285108973
2019-12-11 19:14:26 -08:00
Andrei Vagin 8720bd643e netstack/tcp: software segmentation offload
Right now, we send each tcp packet separately, we call one system
call per-packet. This patch allows to generate multiple tcp packets
and send them by sendmmsg.

The arguable part of this CL is a way how to handle multiple headers.
This CL adds the next field to the Prepandable buffer.

Nginx test results:

Server Software:        nginx/1.15.9
Server Hostname:        10.138.0.2
Server Port:            8080

Document Path:          /10m.txt
Document Length:        10485760 bytes

w/o gso:
Concurrency Level:      5
Time taken for tests:   5.491 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      1048600200 bytes
HTML transferred:       1048576000 bytes
Requests per second:    18.21 [#/sec] (mean)
Time per request:       274.525 [ms] (mean)
Time per request:       54.905 [ms] (mean, across all concurrent requests)
Transfer rate:          186508.03 [Kbytes/sec] received

sw-gso:

Concurrency Level:      5
Time taken for tests:   3.852 seconds
Complete requests:      100
Failed requests:        0
Total transferred:      1048600200 bytes
HTML transferred:       1048576000 bytes
Requests per second:    25.96 [#/sec] (mean)
Time per request:       192.576 [ms] (mean)
Time per request:       38.515 [ms] (mean, across all concurrent requests)
Transfer rate:          265874.92 [Kbytes/sec] received

w/o gso:
$ ./tcp_benchmark --client --duration 15  --ideal
[SUM]  0.0-15.1 sec  2.20 GBytes  1.25 Gbits/sec

software gso:
$ tcp_benchmark --client --duration 15  --ideal --gso $((1<<16)) --swgso
[SUM]  0.0-15.1 sec  3.99 GBytes  2.26 Gbits/sec

PiperOrigin-RevId: 276112677
2019-10-22 11:55:56 -07:00
Tamir Duberstein 573e6e4bba Use tcpip.Subnet in tcpip.Route
This is the first step in replacing some of the redundant types with the
standard library equivalents.

PiperOrigin-RevId: 264706552
2019-08-21 15:31:18 -07:00
Ian Lewis 885e17f890 Remove unused const variables
PiperOrigin-RevId: 260824989
2019-07-30 16:56:23 -07:00
Adin Scannell add40fd6ad Update canonical repository.
This can be merged after:
https://github.com/google/gvisor-website/pull/77
  or
https://github.com/google/gvisor-website/pull/78

PiperOrigin-RevId: 253132620
2019-06-13 16:50:15 -07:00
Fabricio Voznika 847c4b9759 Use net.HardwareAddr for FDBasedLink.LinkAddress
It prints formatted to the log.

PiperOrigin-RevId: 252699551
2019-06-11 14:31:46 -07:00
Bhasker Hariharan 85be01b42d Add multi-fd support to fdbased endpoint.
This allows an fdbased endpoint to have multiple underlying fd's from which
packets can be read and dispatched/written to.

This should allow for higher throughput as well as better scalability of the
network stack as number of connections increases.

Updates #231

PiperOrigin-RevId: 251852825
2019-06-06 08:07:02 -07:00
Andrei Vagin 85380ff03d gvisor/runsc: use a veth link address instead of generating a new one
PiperOrigin-RevId: 248367340
Change-Id: Id792afcfff9c9d2cfd62cae21048316267b4a924
2019-05-15 11:11:58 -07:00
Michael Pratt 4d52a55201 Change copyright notice to "The gVisor Authors"
Based on the guidelines at
https://opensource.google.com/docs/releasing/authors/.

1. $ rg -l "Google LLC" | xargs sed -i 's/Google LLC.*/The gVisor Authors./'
2. Manual fixup of "Google Inc" references.
3. Add AUTHORS file. Authors may request to be added to this file.
4. Point netstack AUTHORS to gVisor AUTHORS. Drop CONTRIBUTORS.

Fixes #209

PiperOrigin-RevId: 245823212
Change-Id: I64530b24ad021a7d683137459cafc510f5ee1de9
2019-04-29 14:26:23 -07:00
Bhasker Hariharan 228dc15fd1 Bump the AF_PACKET socket rcv buf size to 4MB by default.
Packet socket receive buffers default to the sysctl value of
net.core.rmem_default and are capped by net.core.rmem_max both
which are usually set to 208KB on most systems.

Since we can't expect every gVisor user to bump these we use
SO_RCVBUFFORCE to exceed the limit. This is possible as runsc runs
with CAP_NET_ADMIN outside the sandbox and can do this before
the FD is passed to the sentry inside the sandbox.

Updates #211

iperf output w/ 4MB buffer.

 iperf3 -c 172.17.0.2 -t 100
 Connecting to host 172.17.0.2, port 5201
 [  4] local 172.17.0.1 port 40378 connected to 172.17.0.2 port 5201
 [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
 [  4]   0.00-1.00   sec  1.15 GBytes  9.89 Gbits/sec    0   1.02 MBytes
 [  4]   1.00-2.00   sec  1.18 GBytes  10.2 Gbits/sec    0   1.02 MBytes
 [  4]   2.00-3.00   sec   965 MBytes  8.09 Gbits/sec    0   1.02 MBytes
 [  4]   3.00-4.00   sec   942 MBytes  7.90 Gbits/sec    0   1.02 MBytes
 [  4]   4.00-5.00   sec   952 MBytes  7.99 Gbits/sec    0   1.02 MBytes
 [  4]   5.00-6.00   sec  1.14 GBytes  9.81 Gbits/sec    0   1.02 MBytes
 [  4]   6.00-7.00   sec  1.13 GBytes  9.68 Gbits/sec    0   1.02 MBytes
 [  4]   7.00-8.00   sec   930 MBytes  7.80 Gbits/sec    0   1.02 MBytes
 [  4]   8.00-9.00   sec  1.15 GBytes  9.91 Gbits/sec    0   1.02 MBytes
 [  4]   9.00-10.00  sec   938 MBytes  7.87 Gbits/sec    0   1.02 MBytes
 [  4]  10.00-11.00  sec   737 MBytes  6.18 Gbits/sec    0   1.02 MBytes
 [  4]  11.00-12.00  sec  1.16 GBytes  9.93 Gbits/sec    0   1.02 MBytes
 [  4]  12.00-13.00  sec   917 MBytes  7.69 Gbits/sec    0   1.02 MBytes
 [  4]  13.00-14.00  sec  1.19 GBytes  10.2 Gbits/sec    0   1.02 MBytes
 [  4]  14.00-15.00  sec  1.01 GBytes  8.70 Gbits/sec    0   1.02 MBytes
 [  4]  15.00-16.00  sec  1.20 GBytes  10.3 Gbits/sec    0   1.02 MBytes
 [  4]  16.00-17.00  sec  1.14 GBytes  9.80 Gbits/sec    0   1.02 MBytes
 ^C[  4]  17.00-17.60  sec   718 MBytes  10.1 Gbits/sec    0   1.02 MBytes
 - - - - - - - - - - - - - - - - - - - - - - - - -
 [ ID] Interval           Transfer     Bandwidth       Retr
 [  4]   0.00-17.60  sec  18.4 GBytes  8.98 Gbits/sec    0             sender
 [  4]   0.00-17.60  sec  0.00 Bytes  0.00 bits/sec                  receiver

PiperOrigin-RevId: 245470590
Change-Id: I1c08c5ee8345de6ac070513656a4703312dc3c00
2019-04-26 12:52:02 -07:00
Fabricio Voznika 908edee04f Replace os.File with fd.FD in fsgofer
os.NewFile() accounts for 38% of CPU time in localFile.Walk().
This change switchs to use fd.FD which is much cheaper to create.
Now, fd.New() in localFile.Walk() accounts for only 4%.

PiperOrigin-RevId: 244944983
Change-Id: Ic892df96cf2633e78ad379227a213cb93ee0ca46
2019-04-23 16:10:54 -07:00
Andrei Vagin a046054ba3 gvisor/runsc: enable generic segmentation offload (GSO)
The linux packet socket can handle GSO packets, so we can segment packets to
64K instead of the MTU which is usually 1500.

Here are numbers for the nginx-1m test:
runsc:		579330.01 [Kbytes/sec] received
runsc-gso:	1794121.66 [Kbytes/sec] received
runc:		2122139.06 [Kbytes/sec] received

and for tcp_benchmark:

$ tcp_benchmark  --duration 15   --ideal
[  4]  0.0-15.0 sec  86647 MBytes  48456 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal
[  4]  0.0-15.0 sec  2173 MBytes  1214 Mbits/sec

$ tcp_benchmark --client --duration 15   --ideal --gso 65536
[  4]  0.0-15.0 sec  19357 MBytes  10825 Mbits/sec

PiperOrigin-RevId: 241072403
Change-Id: I20b03063a1a6649362b43609cbbc9b59be06e6d5
2019-03-29 16:27:38 -07:00
Shijiang Wei b44699c529 check isRootNS by ns inode
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
Change-Id: I032f834edae5c716fb2d3538285eec07aa11a902
PiperOrigin-RevId: 231318438
2019-01-28 17:20:20 -08:00
Fabricio Voznika c1be25b78d Scrub runsc error messages
Removed "error" and "failed to" prefix that don't add value
from messages. Adjusted a few other messages.  In particular,
when the container fail to start, the message returned is easier
for humans to read:

$ docker run --rm --runtime=runsc alpine foobar
docker: Error response from daemon: OCI runtime start failed: <path> did not terminate sucessfully: starting container: starting root container [foobar]: starting sandbox: searching for executable "foobar", cwd: "/", $PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin": no such file or directory

Closes #77

PiperOrigin-RevId: 230022798
Change-Id: I83339017c70dae09e4f9f8e0ea2e554c4d5d5cd1
2019-01-18 17:36:02 -08:00
Ian Gudger 8fce67af24 Use correct company name in copyright header
PiperOrigin-RevId: 217951017
Change-Id: Ie08bf6987f98467d07457bcf35b5f1ff6e43c035
2018-10-19 16:35:11 -07:00
Fabricio Voznika a2ad8fef13 Make multi-container the default mode for runsc
And remove multicontainer option.

PiperOrigin-RevId: 215236981
Change-Id: I9fd1d963d987e421e63d5817f91a25c819ced6cb
2018-10-01 10:31:17 -07:00
Fabricio Voznika efac28976c Enable network for multi-container
PiperOrigin-RevId: 211834411
Change-Id: I52311a6c5407f984e5069359d9444027084e4d2a
2018-09-06 11:00:08 -07:00
Fabricio Voznika db81c0b02f Put fsgofer inside chroot
Now each container gets its own dedicated gofer that is chroot'd to the
rootfs path. This is done to add an extra layer of security in case the
gofer gets compromised.

PiperOrigin-RevId: 210396476
Change-Id: Iba21360a59dfe90875d61000db103f8609157ca0
2018-08-27 11:10:14 -07:00
Fabricio Voznika bc9a1fca23 Tiny reordering to network code
PiperOrigin-RevId: 207581723
Change-Id: I6e4eb1227b5ed302de5e6c891040b670955f1eea
2018-08-06 11:48:29 -07:00
Lantao Liu f62d6dd453 runsc: copy gateway from the pod network interface.
PiperOrigin-RevId: 205334841
Change-Id: Ia60d486f9aae70182fdc4af50cf7c915986126d7
2018-07-19 18:09:56 -07:00
Fabricio Voznika 65dadc0029 Ignores IPv6 addresses when configuring network
Closes #60

PiperOrigin-RevId: 198887885
Change-Id: I9bf990ee3fde9259836e57d67257bef5b85c6008
2018-06-01 10:09:37 -07:00
Nicolas Lacasse 32cabad8da Use the containerd annotation instead of detecting the "pause" application.
FIXED=72380268
PiperOrigin-RevId: 195846596
Change-Id: Ic87fed1433482a514631e1e72f5ee208e11290d1
2018-05-08 11:11:50 -07:00
Googler d02b74a5dc Check in gVisor.
PiperOrigin-RevId: 194583126
Change-Id: Ica1d8821a90f74e7e745962d71801c598c652463
2018-04-28 01:44:26 -04:00