# Packetimpact

## What is packetimpact?

Packetimpact is a tool for platform-independent network testing. It is heavily
inspired by [packetdrill](https://github.com/google/packetdrill). It creates two
Docker containers connected by a network. One is for the test bench, which
operates the test. The other is for the device-under-test (DUT), which is the
software being tested. The test bench communicates over the network with the DUT
to check correctness of the network.

### Goals

Packetimpact aims to provide:

* A **multi-platform** solution that can test both Linux and gVisor.
* **Conciseness** on par with packetdrill scripts.
* **Control-flow** like for loops, conditionals, and variables.
* **Flexibility** to specify every byte in a packet or use multiple sockets.

## When to use packetimpact?

There are a few ways to write networking tests for gVisor currently:

* [Go unit tests](https://github.com/google/gvisor/tree/master/pkg/tcpip)
* [syscall tests](https://github.com/google/gvisor/tree/master/test/syscalls/linux)
* [packetdrill tests](https://github.com/google/gvisor/tree/master/test/packetdrill)
* packetimpact tests

The right choice depends on the needs of the test.

Feature       | Go unit test | syscall test | packetdrill | packetimpact
------------- | ------------ | ------------ | ----------- | ------------
Multiplatform | no           | **YES**      | **YES**     | **YES**
Concise       | no           | somewhat     | somewhat    | **VERY**
Control-flow  | **YES**      | **YES**      | no          | **YES**
Flexible      | **VERY**     | no           | somewhat    | **VERY**

### Go unit tests

If the test depends on the internals of gVisor and doesn't need to run on Linux
or other platforms for comparison purposes, a Go unit test can be appropriate.
They can observe internals of gVisor networking. The downside is that they are
**not concise** and **not multiplatform**. If you require insight on gVisor
internals, this is the right choice.

### Syscall tests

Syscall tests are **multiplatform** but cannot examine the internals of gVisor
networking. They are **concise**. They can use **control-flow** structures like
conditionals, for loops, and variables. However, they are limited to what the
POSIX interface provides, so they are **not flexible**. For example, you would
have difficulty writing a syscall test that intentionally sends a bad IP
checksum. Even if you did write that test with raw sockets, it would be very
**verbose** to intentionally send wrong checksums, wrong protocols, wrong
sequence numbers, etc.

### Packetdrill tests

Packetdrill tests are **multiplatform** and can run against both Linux and
gVisor. They are **concise** and use a special packetdrill scripting language.
They are **more flexible** than a syscall test in that they can send packets
that a syscall test would have difficulty sending, like a packet with a
calculated ACK number. But they are also somewhat limited in flexibility in that
they can't do tests with multiple sockets. They have **no control-flow** ability
like variables or conditionals. For example, it isn't possible to send a packet
that depends on the window size of a previous packet because the packetdrill
language can't express that. Nor could you branch based on whether or not the
other side supports window scaling.

### Packetimpact tests

Packetimpact tests are similar to packetdrill tests except that they are written
in Go instead of the packetdrill scripting language. That gives them all the
**control-flow** abilities of Go (loops, functions, variables, etc). They are
**multiplatform** in the same way as packetdrill tests but even more
**flexible** because Go is more expressive than the scripting language of
packetdrill. However, Go is **not as concise** as the packetdrill language. Many
design decisions below are made to mitigate that.

## How it works

```
+--------------+               +--------------+
|              |   TEST NET    |              |
|              | <===========> |    Device    |
|    Test      |               |    Under     |
|    Bench     |               |    Test      |
|              | <===========> |    (DUT)     |
|              |  CONTROL NET  |              |
+--------------+               +--------------+
```

Two Docker containers are created by a script, one for the test bench and the
other for the device under test (DUT). The script connects the two containers
with a control network and a test network. It also does some other tasks like
waiting until the DUT is ready before starting the test and disabling Linux
networking that would interfere with the test bench.

### DUT

The DUT container runs a program called the "posix_server". The posix_server is
written in C++ for maximum portability. It is compiled on the host. The script
that starts the containers copies it into the DUT's container and runs it. Its
job is to receive directions from the test bench on what actions to take. For
this, the posix_server does three steps in a loop:

1. Listen for a request from the test bench.
2. Execute a command.
3. Send the response back to the test bench.

The requests and responses are
[protobufs](https://developers.google.com/protocol-buffers) and the
communication is done with [gRPC](https://grpc.io/). The commands run are
[POSIX socket commands](https://en.wikipedia.org/wiki/Berkeley_sockets#Socket_API_functions),
with the inputs and outputs converted into protobuf requests and responses. All
communication is on the control network, so that the test network is unaffected
by extra packets.

For example, this is the request and response pair to call
[`socket()`](http://man7.org/linux/man-pages/man2/socket.2.html):

```protocol-buffer
message SocketRequest {
  int32 domain = 1;
  int32 type = 2;
  int32 protocol = 3;
}

message SocketResponse {
  int32 fd = 1;
  int32 errno_ = 2;
}
```

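The "execute a command" step for this message pair can be sketched in Go, even though the real posix_server is written in C++. This is only an illustration under stated assumptions: `SocketRequest` and `SocketResponse` are plain structs standing in for the generated protobuf types, `handleSocket` is a hypothetical name, and the `syscall` calls assume a Unix-like host.

```go
package main

import (
	"fmt"
	"syscall"
)

// SocketRequest and SocketResponse are plain-struct stand-ins for the
// protobuf messages above; real code would use the generated types.
type SocketRequest struct {
	Domain, Type, Protocol int32
}

type SocketResponse struct {
	FD    int32
	Errno int32
}

// handleSocket (hypothetical name) runs socket(2) with the requested
// arguments. On failure the fd is -1 and Errno carries the error number.
func handleSocket(req SocketRequest) SocketResponse {
	fd, err := syscall.Socket(int(req.Domain), int(req.Type), int(req.Protocol))
	if err != nil {
		errno, ok := err.(syscall.Errno)
		if !ok {
			errno = syscall.EINVAL // Assumed fallback for non-errno errors.
		}
		return SocketResponse{FD: -1, Errno: int32(errno)}
	}
	return SocketResponse{FD: int32(fd)}
}

func main() {
	resp := handleSocket(SocketRequest{Domain: syscall.AF_INET, Type: syscall.SOCK_STREAM})
	fmt.Printf("fd=%d errno=%d\n", resp.FD, resp.Errno)
	if resp.FD >= 0 {
		syscall.Close(int(resp.FD))
	}
}
```

The response carries the raw `errno` rather than a translated error, which matches the idea of keeping the DUT side thin and leaving interpretation to the test bench.
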
##### Alternatives considered

* We could have used JSON for communication instead. It would have been a
  lighter touch than protobuf, but protobuf handles all the data types and has
  strict typing to prevent a class of errors. The test bench could be written
  in other languages, too.
* Instead of mimicking the POSIX interfaces, arguments could have had a more
  natural form, like the `bind()` getting a string IP address instead of bytes
  in a `sockaddr_t`. However, conforming to the existing structures keeps more
  of the complexity in Go and keeps the posix_server simpler and thus more
  likely to compile everywhere.

### Test Bench

The test bench does most of the work in a test. It is a Go program that compiles
on the host and is copied by the script into the test bench's container. It is a
regular [Go unit test](https://golang.org/pkg/testing/) that imports the test
bench framework. The test bench framework is based on three basic utilities:

* Commanding the DUT to run POSIX commands and return responses.
* Sending raw packets to the DUT on the test network.
* Listening for raw packets from the DUT on the test network.

#### DUT commands

To keep the interface to the DUT consistent and easy to use, each POSIX command
supported by the posix_server is wrapped in functions with signatures similar to
the ones in the [Go unix package](https://godoc.org/golang.org/x/sys/unix). This
way all the details of endianness and (un)marshalling of Go structs such as
[unix.Timeval](https://godoc.org/golang.org/x/sys/unix#Timeval) are handled in
one place. This also makes it straightforward to convert tests that use `unix.`
or `syscall.` calls to `dut.` calls.

For example, creating a connection to the DUT and commanding it to make a socket
looks like this:

```go
dut := testbench.NewDUT(t)
fd, err := dut.SocketWithErrno(unix.AF_INET, unix.SOCK_STREAM, unix.IPPROTO_IP)
if fd < 0 {
	t.Fatalf(...)
}
```

Because the usual case is to fail the test when the DUT fails to create a
socket, there is a concise version of each of the `...WithErrno` functions that
does that:

```go
dut := testbench.NewDUT(t)
fd := dut.Socket(unix.AF_INET, unix.SOCK_STREAM, unix.IPPROTO_IP)
```

The DUT and other structs in the code store a `*testing.T` so that they can
provide versions of functions that call `t.Fatalf(...)`. This helps keep tests
concise.

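The pairing of a `...WithErrno` function with a concise wrapper might be structured roughly as follows. This is a sketch, not the framework's code: the `DUT` type here is a stand-in whose `SocketWithErrno` fakes the gRPC round trip, and the `fataler` interface and `recorder` type are hypothetical helpers that let the example run outside `go test`.

```go
package main

import (
	"errors"
	"fmt"
)

// fataler is the subset of *testing.T the sketch needs; *testing.T satisfies it.
type fataler interface {
	Fatalf(format string, args ...interface{})
}

// DUT is a hypothetical stand-in for the real test bench type.
type DUT struct {
	t      fataler
	nextFD int32 // Fake state replacing the gRPC call to the posix_server.
}

// SocketWithErrno returns the fd and error without judging them.
func (d *DUT) SocketWithErrno(domain, typ, proto int32) (int32, error) {
	if domain < 0 {
		return -1, errors.New("address family not supported")
	}
	d.nextFD++
	return d.nextFD, nil
}

// Socket fails the test on error, so callers need no error handling.
func (d *DUT) Socket(domain, typ, proto int32) int32 {
	fd, err := d.SocketWithErrno(domain, typ, proto)
	if fd < 0 {
		d.t.Fatalf("failed to create socket: %s", err)
	}
	return fd
}

// recorder implements fataler by collecting messages instead of aborting.
type recorder struct{ msgs []string }

func (r *recorder) Fatalf(format string, args ...interface{}) {
	r.msgs = append(r.msgs, fmt.Sprintf(format, args...))
}

func main() {
	rec := &recorder{}
	dut := &DUT{t: rec}
	fmt.Println(dut.Socket(2, 1, 0)) // Succeeds: no failure recorded.
	dut.Socket(-1, 1, 0)             // Fails: the wrapper reports it.
	fmt.Println(rec.msgs)
}
```

With a real `*testing.T`, `Fatalf` stops the test immediately, so the concise wrapper never returns a bad descriptor to the caller.
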
##### Alternatives considered

* Instead of mimicking the `unix.` Go interface, we could have invented a more
  natural one, like using `float64` instead of `Timeval`. However, using the
  same function signatures that `unix.` has makes it easier to convert code to
  `dut.`. Also, using an existing interface ensures that we don't invent an
  interface that isn't extensible. For example, had we invented a function for
  `bind()` that didn't support IPv6, we would later have had to add a second
  `bind6()` function.

#### Sending/Receiving Raw Packets

The framework wraps POSIX sockets for sending and receiving raw frames. Both
send and receive are synchronous commands.
[SO_RCVTIMEO](http://man7.org/linux/man-pages/man7/socket.7.html) is used to set
a timeout on the receive commands. For ease of use, these are wrapped in an
`Injector` and a `Sniffer`. They have functions:

```go
func (s *Sniffer) Recv(timeout time.Duration) []byte {...}
func (i *Injector) Send(b []byte) {...}
```

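A synchronous receive bounded by `SO_RCVTIMEO` could look roughly like the sketch below. It is not the framework's `Sniffer`: to stay runnable without privileges it uses a loopback UDP socket instead of a raw socket, `recvWithTimeout` and `newLoopbackUDP` are hypothetical names, and the `syscall` usage assumes Linux.

```go
package main

import (
	"fmt"
	"syscall"
	"time"
)

// recvWithTimeout sets SO_RCVTIMEO on fd and performs one blocking read.
// It returns nil, nil when the timeout elapses with no data. Note that a
// zero timeout means "block forever" for SO_RCVTIMEO.
func recvWithTimeout(fd int, timeout time.Duration) ([]byte, error) {
	tv := syscall.NsecToTimeval(timeout.Nanoseconds())
	if err := syscall.SetsockoptTimeval(fd, syscall.SOL_SOCKET, syscall.SO_RCVTIMEO, &tv); err != nil {
		return nil, err
	}
	buf := make([]byte, 65536)
	n, _, err := syscall.Recvfrom(fd, buf, 0)
	if err == syscall.EAGAIN || err == syscall.EWOULDBLOCK {
		return nil, nil // Timed out.
	}
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

// newLoopbackUDP opens a blocking UDP socket bound to 127.0.0.1 on a port
// chosen by the kernel. It stands in for the raw socket a sniffer would use.
func newLoopbackUDP() (int, error) {
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_DGRAM, 0)
	if err != nil {
		return -1, err
	}
	addr := &syscall.SockaddrInet4{Addr: [4]byte{127, 0, 0, 1}}
	if err := syscall.Bind(fd, addr); err != nil {
		syscall.Close(fd)
		return -1, err
	}
	return fd, nil
}

func main() {
	fd, err := newLoopbackUDP()
	if err != nil {
		fmt.Println("socket setup failed:", err)
		return
	}
	defer syscall.Close(fd)
	// Nothing is sent to the socket, so this read should time out.
	b, err := recvWithTimeout(fd, 100*time.Millisecond)
	fmt.Println(b == nil, err)
}
```

Because the timeout lives in the socket option, the caller sees an ordinary blocking call that simply returns empty-handed, with no goroutines or channels involved.
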
##### Alternatives considered

* [gopacket](https://github.com/google/gopacket) pcap has raw socket support
  but requires cgo. cgo is not guaranteed to be portable from the host to the
  container and in practice, the container doesn't recognize binaries built on
  the host if they use cgo.
* Both gVisor and gopacket have the ability to read and write pcap files
  without cgo but that is insufficient here.
* The sniffer and injector can't share a socket because they need to be bound
  differently.
* Sniffing could have been done asynchronously with channels, obviating the
  need for `SO_RCVTIMEO`. But that would introduce asynchronous complication.
  `SO_RCVTIMEO` is well supported on the test bench.

#### `Layer` struct

A large part of packetimpact tests is creating packets to send and comparing
received packets against expectations. To keep tests concise, it is useful to be
able to specify just the important parts of packets that need to be set. For
example, sending a packet with default values except for TCP Flags. And for
packets received, it's useful to be able to compare just the necessary parts of
received packets and ignore the rest.

To aid in both of those, Go structs with optional fields are created for each
encapsulation type, such as IPv4, TCP, and Ethernet. This is inspired by
[scapy](https://scapy.readthedocs.io/en/latest/). For example, here is the
struct for Ethernet:

```go
type Ether struct {
	LayerBase
	SrcAddr *tcpip.LinkAddress
	DstAddr *tcpip.LinkAddress
	Type    *tcpip.NetworkProtocolNumber
}
```

Each struct has the same fields as those in the
[gVisor headers](https://github.com/google/gvisor/tree/master/pkg/tcpip/header)
but with a pointer for each field that may be `nil`.

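To make the optional-field idea concrete, here is a minimal sketch of a struct whose `nil` fields fall back to defaults when serialized. It is not the framework's `Ether`: plain Go types replace the `tcpip` ones, the `ptr` helper is hypothetical (and needs Go 1.18+ generics), and the default addresses are invented for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Ether mirrors the struct above with plain types in place of the tcpip ones.
type Ether struct {
	SrcAddr *[6]byte
	DstAddr *[6]byte
	Type    *uint16
}

// ptr returns a pointer to v, so literals can be used for optional fields.
func ptr[T any](v T) *T { return &v }

// toBytes writes the 14-byte Ethernet header, substituting a default for
// every nil field. The defaults here are assumptions of this sketch.
func (e Ether) toBytes() []byte {
	dst := [6]byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff} // Broadcast.
	src := [6]byte{0x02, 0x00, 0x00, 0x00, 0x00, 0x01} // Locally administered.
	typ := uint16(0x0800)                               // IPv4.
	if e.DstAddr != nil {
		dst = *e.DstAddr
	}
	if e.SrcAddr != nil {
		src = *e.SrcAddr
	}
	if e.Type != nil {
		typ = *e.Type
	}
	b := make([]byte, 14)
	copy(b[0:6], dst[:])
	copy(b[6:12], src[:])
	binary.BigEndian.PutUint16(b[12:14], typ)
	return b
}

func main() {
	arp := Ether{Type: ptr(uint16(0x0806))} // Override only the EtherType.
	fmt.Printf("% x\n", arp.toBytes())
}
```

A test only names the fields it cares about, and everything else is filled in at serialization time.
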
##### Alternatives considered

* Just use `[]byte` like gVisor headers do. The drawback is that it makes the
  tests more verbose.
    * For example, there would be no way to call `Send(myBytes)` concisely and
      indicate if the checksum should be calculated automatically versus
      overridden. The only way would be to add lines to the test to calculate
      it before each Send, which is wordy. Or make multiple versions of Send:
      one that checksums IP, one that doesn't, one that checksums TCP, one
      that does both, etc. That would be many combinations.
    * Filtering inputs would become verbose. Either:
        * Large conditionals that need to be repeated in many places:
          `h[FlagOffset] == SYN && h[LengthOffset:LengthOffset+2] == ...` or
        * Many functions, one per field, like: `filterByFlag(myBytes, SYN)`,
          `filterByLength(myBytes, 20)`, `filterByNextProto(myBytes, 0x0800)`,
          etc.
    * Using pointers allows us to combine `Layer`s with a one-line call to
      `mergo.Merge(...)`. So the default `Layers` can be overridden by a
      `Layers` with just the TCP connection's src/dst which can be overridden
      by one with just a test-specific TCP window size. Each override is
      specified as just one call to `mergo.Merge`.
    * It's a proven way to separate the details of a packet from the byte
      format as shown by scapy's success.
* Use gopacket. It's more general than parsing packets with gVisor. However:
    * gopacket doesn't have optional fields so many of the above problems
      still apply.
    * It would be yet another dependency.
    * It's not as well known to engineers that are already writing gVisor
      code.
    * It might be a good candidate for replacing the parsing of packets into
      `Layer`s if all that parsing turns out to be more work than parsing by
      gopacket and converting *that* to `Layer`. gopacket has easier to use
      getters for the layers. This could be done later in a way that doesn't
      break tests.

#### `Layer` methods

The `Layer` structs provide a way to partially specify an encapsulation. They
also need methods for using those partially specified encapsulations, for
example to marshal them to bytes or compare them. For those, each encapsulation
implements the `Layer` interface:

```go
// Layer is the interface that all encapsulations must implement.
//
// A Layer is an encapsulation in a packet, such as TCP, IPv4, IPv6, etc. A
// Layer contains all the fields of the encapsulation. Each field is a pointer
// and may be nil.
type Layer interface {
	// toBytes converts the Layer into bytes. In places where the Layer's field
	// isn't nil, the value that is pointed to is used. When the field is nil, a
	// reasonable default for the Layer is used. For example, "64" for IPv4 TTL
	// and a calculated checksum for TCP or IP. Some layers require information
	// from the previous or next layers in order to compute a default, such as
	// TCP's checksum or Ethernet's type, so each Layer has a doubly-linked list
	// to the layer's neighbors.
	toBytes() ([]byte, error)

	// match checks if the current Layer matches the provided Layer. If either
	// Layer has a nil in a given field, that field is considered matching.
	// Otherwise, the values pointed to by the fields must match.
	match(Layer) bool

	// length in bytes of the current encapsulation
	length() int

	// next gets a pointer to the encapsulated Layer.
	next() Layer

	// prev gets a pointer to the Layer encapsulating this one.
	prev() Layer

	// setNext sets the pointer to the encapsulated Layer.
	setNext(Layer)

	// setPrev sets the pointer to the Layer encapsulating this one.
	setPrev(Layer)
}
```

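The "calculated checksum" default mentioned for `toBytes` refers to the standard Internet checksum: the one's-complement of the one's-complement sum of the data taken as 16-bit big-endian words. The sketch below shows that arithmetic in isolation; the function name `checksum` is an assumption, and a real TCP checksum would additionally cover a pseudo-header built from the IP layer, which is one reason a layer needs access to its neighbors.

```go
package main

import "fmt"

// checksum returns the 16-bit one's-complement Internet checksum of b.
func checksum(b []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(b); i += 2 {
		sum += uint32(b[i])<<8 | uint32(b[i+1])
	}
	if len(b)%2 == 1 {
		sum += uint32(b[len(b)-1]) << 8 // Pad the odd trailing byte with zero.
	}
	for sum>>16 != 0 {
		sum = sum&0xffff + sum>>16 // Fold the carries back into the low 16 bits.
	}
	return ^uint16(sum)
}

func main() {
	// A 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
	hdr := []byte{
		0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
		0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
		0xc0, 0xa8, 0x00, 0xc7,
	}
	c := checksum(hdr)
	fmt.Printf("checksum: %#04x\n", c)

	// Writing the checksum into the header makes the whole header sum to zero,
	// which is how a receiver validates it.
	hdr[10], hdr[11] = byte(c>>8), byte(c)
	fmt.Printf("verify:   %#04x\n", checksum(hdr))
}
```

A test that wants a deliberately wrong checksum would set the field explicitly, and one that leaves it `nil` would get a value computed along these lines.
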
For each `Layer` there is also a parsing function. For example, this one is for
Ethernet:

```go
func ParseEther(b []byte) (Layers, error)
```

The parsing function converts bytes received on the wire into a `Layer`
(actually `Layers`, see below) which has no `nil`s in it. By using
`match(Layer)` to compare against another `Layer` that *does* have `nil`s in it,
the received bytes can be partially compared. The `nil`s behave as
"don't-cares".

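The don't-care behavior of `match` can be illustrated with a cut-down layer. The `TCP` struct below has only three fields and `fieldMatches` is a hypothetical generic helper (Go 1.18+); the real layers have more fields and implement the `Layer` interface.

```go
package main

import "fmt"

// TCP is a cut-down, hypothetical version of the real layer.
type TCP struct {
	SrcPort *uint16
	DstPort *uint16
	Flags   *uint8
}

// fieldMatches reports whether two optional fields agree. A nil on either
// side is a don't-care and always matches.
func fieldMatches[T comparable](a, b *T) bool {
	return a == nil || b == nil || *a == *b
}

// match compares field by field, skipping any field that is nil on either side.
func (t TCP) match(other TCP) bool {
	return fieldMatches(t.SrcPort, other.SrcPort) &&
		fieldMatches(t.DstPort, other.DstPort) &&
		fieldMatches(t.Flags, other.Flags)
}

func main() {
	port, syn, synack := uint16(80), uint8(0x02), uint8(0x12)

	// A parsed packet has every field set.
	got := TCP{SrcPort: &port, DstPort: &port, Flags: &synack}

	fmt.Println(TCP{Flags: &synack}.match(got)) // Only Flags is compared.
	fmt.Println(TCP{Flags: &syn}.match(got))    // Flags differ.
	fmt.Println(TCP{}.match(got))               // Nothing specified matches anything.
}
```

An empty pattern matches every packet, so filters get stricter only as fields are filled in.
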
##### Alternatives considered

* Matching against `[]byte` instead of converting to `Layer` first.
    * The downside is that it precludes the use of a `cmp.Equal` one-liner to
      do comparisons.
    * It creates confusion in the code to deal with both representations at
      different times. For example, is the checksum calculated on `[]byte` or
      `Layer` when sending? What about when checking received packets?

#### `Layers`

```go
type Layers []Layer

func (ls *Layers) match(other Layers) bool {...}
func (ls *Layers) toBytes() ([]byte, error) {...}
```

`Layers` is a slice of `Layer`. It represents a stack of encapsulations, such
as `Layers{Ether{}, IPv4{}, TCP{}, Payload{}}`. It also has `toBytes()` and
`match(Layers)`, like `Layer`. The parse functions above actually return
`Layers` and not `Layer` because they know about the headers below and
sequentially call each parser on the remaining, encapsulated bytes.

All this leads to the ability to write concise packet processing. For example:

```go
etherType := tcpip.NetworkProtocolNumber(0x0800) // EtherType for IPv4.
flags := uint8(header.TCPFlagSyn | header.TCPFlagAck)
toMatch := Layers{Ether{Type: &etherType}, IPv4{}, TCP{Flags: &flags}}
for {
	recvBytes := sniffer.Recv(time.Second)
	if recvBytes == nil {
		println("Got no packet for 1 second")
		continue // Nothing to parse; wait for the next packet.
	}
	gotPacket, err := ParseEther(recvBytes)
	if err == nil && toMatch.match(gotPacket) {
		println("Got a TCP/IPv4/Eth packet with SYNACK")
	}
}
```

##### Alternatives considered

* Don't use previous and next pointers.
    * Each layer may need to be able to interrogate the layers around it, like
      for computing the next protocol number or total length. So *some*
      mechanism is needed for a `Layer` to see neighboring layers.
    * We could pass the entire `Layers` slice to the `toBytes()` function.
      Passing to a method a slice that includes the method's own receiver
      seems wrong.

#### Connections

Using `Layers` above, we can create connection structures to maintain state
about connections. For example, here is the `TCPIPv4` struct:

```go
type TCPIPv4 struct {
	outgoing     Layers
	incoming     Layers
	localSeqNum  uint32
	remoteSeqNum uint32
	sniffer      Sniffer
	injector     Injector
	t            *testing.T
}
```

`TCPIPv4` contains an `outgoing Layers` which holds the defaults for the
connection, such as the source and destination MACs, IPs, and ports. When
`outgoing.toBytes()` is called, a valid packet for this TCPIPv4 flow is built.

It also contains an `incoming Layers` which holds the filter for incoming
packets that belong to this flow. `incoming.match(Layers)` is used on received
bytes to check if they are part of the flow.

The `sniffer` and `injector` are for receiving and sending raw packet bytes. The
`localSeqNum` and `remoteSeqNum` are updated by `Send` and `Recv` so that
outgoing packets will have, by default, the correct sequence number and ack
number.

`TCPIPv4` provides some functions:

```go
func (conn *TCPIPv4) Send(tcp TCP) {...}
func (conn *TCPIPv4) Recv(timeout time.Duration) *TCP {...}
```

`Send(tcp TCP)` uses [mergo](https://github.com/imdario/mergo) to merge the
provided `TCP` (a `Layer`) into `outgoing`. This way the user can specify
concisely just which fields of `outgoing` to modify. The packet is sent using
the `injector`.

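The effect of merging an override into the connection defaults can be shown without the mergo dependency. The sketch below hand-writes the merge for a four-field `TCP` struct; `merge` is a hypothetical function standing in for the library call, and the struct is a cut-down version of the real layer.

```go
package main

import "fmt"

// TCP is a cut-down, hypothetical version of the real layer.
type TCP struct {
	SrcPort    *uint16
	DstPort    *uint16
	Flags      *uint8
	WindowSize *uint16
}

// merge returns base with every non-nil field of override applied on top.
// It is a hand-written stand-in for a library merge with override semantics.
func merge(base, override TCP) TCP {
	if override.SrcPort != nil {
		base.SrcPort = override.SrcPort
	}
	if override.DstPort != nil {
		base.DstPort = override.DstPort
	}
	if override.Flags != nil {
		base.Flags = override.Flags
	}
	if override.WindowSize != nil {
		base.WindowSize = override.WindowSize
	}
	return base
}

func main() {
	src, dst := uint16(40000), uint16(80)
	syn := uint8(0x02)

	outgoing := TCP{SrcPort: &src, DstPort: &dst}   // Connection defaults.
	packet := merge(outgoing, TCP{Flags: &syn})     // Test-specific override.

	fmt.Println(*packet.SrcPort, *packet.DstPort, *packet.Flags)
	fmt.Println(outgoing.Flags == nil) // base was passed by value, so defaults are untouched.
}
```

Because `nil` means "not specified", layering several overrides is just repeated merging, with the most specific one applied last.
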
`Recv(timeout time.Duration)` reads packets from the sniffer until either the
timeout has elapsed or a packet that matches `incoming` arrives.

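A receive loop with that behavior might be organized as below. This is a sketch under assumptions: `recvMatching` and `queueRecv` are hypothetical names, frames are raw byte slices rather than parsed `Layers`, and the fake receiver replays a fixed queue instead of reading a socket.

```go
package main

import (
	"fmt"
	"time"
)

// recvMatching calls recv until a frame satisfies match or the timeout
// elapses. Each recv call is given only the time remaining, so the total
// wait stays close to the timeout. It returns nil on timeout.
func recvMatching(recv func(time.Duration) []byte, match func([]byte) bool, timeout time.Duration) []byte {
	deadline := time.Now().Add(timeout)
	for {
		remaining := time.Until(deadline)
		if remaining <= 0 {
			return nil
		}
		b := recv(remaining)
		if b != nil && match(b) {
			return b
		}
	}
}

// queueRecv returns a fake receiver that replays frames in order and then
// behaves like a read that times out.
func queueRecv(frames [][]byte) func(time.Duration) []byte {
	return func(d time.Duration) []byte {
		if len(frames) == 0 {
			time.Sleep(d)
			return nil
		}
		b := frames[0]
		frames = frames[1:]
		return b
	}
}

func main() {
	// Three one-byte "frames" holding only TCP flags: SYN, FIN, SYN-ACK.
	recv := queueRecv([][]byte{{0x02}, {0x01}, {0x12}})
	isSynAck := func(b []byte) bool { return b[0] == 0x12 }

	fmt.Println(recvMatching(recv, isSynAck, time.Second))          // Skips the first two frames.
	fmt.Println(recvMatching(recv, isSynAck, 50*time.Millisecond)) // Queue is empty: times out.
}
```

Non-matching frames are silently dropped, which is what lets a test ignore unrelated traffic on the test network.
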
Using those, we can perform a TCP 3-way handshake without too much code:

```go
func (conn *TCPIPv4) Handshake() {
	syn := uint8(header.TCPFlagSyn)
	synack := uint8(header.TCPFlagSyn | header.TCPFlagAck)
	ack := uint8(header.TCPFlagAck)
	conn.Send(TCP{Flags: &syn}) // Send a packet with all defaults but set TCP-SYN.

	// Wait for the SYN-ACK response.
	want := TCP{Flags: &synack}
	for {
		newTCP := conn.Recv(time.Second) // This already filters by MAC, IP, and ports.
		// Assumes Recv returns nil when the timeout elapses.
		if newTCP != nil && want.match(newTCP) {
			break // Only if it's a SYN-ACK proceed.
		}
	}

	conn.Send(TCP{Flags: &ack}) // Send an ACK. The seq and ack numbers are set correctly.
}
```

The handshake code is part of the testbench utilities so tests can share this
common sequence, making tests even more concise.

##### Alternatives considered

* Instead of storing `outgoing` and `incoming`, store values.
    * There would be many more things to store instead, like `localMac`,
      `remoteMac`, `localIP`, `remoteIP`, `localPort`, and `remotePort`.
    * Construction of a packet would be many lines to copy each of these
      values into a `[]byte`. And there would be slight variations needed for
      each encapsulation stack, like TCPIPv6 and ARP.
    * Filtering incoming packets would be a long sequence:
        * Compare the MACs, then
        * Parse the next header, then
        * Compare the IPs, then
        * Parse the next header, then
        * Compare the TCP ports. Instead, it's all just one call to
          `cmp.Equal(...)`, for all sequences.
    * A TCPIPv6 connection could share most of the code. Only the type of the
      IP addresses is different. The types of `outgoing` and `incoming` would
      remain `Layers`.
    * An ARP connection could share all the Ethernet parts. The IP `Layer`
      could be factored out of `outgoing`. After that, the IPv4 and IPv6
      connections could implement one interface and a single TCP struct could
      have either network protocol through composition.

## Putting it all together

Here's what the start of a packetimpact unit test looks like. This test creates a
TCP connection with the DUT. There are added comments for explanation in this
document but a real test might not include them in order to stay even more
concise.

```go
func TestMyTcpTest(t *testing.T) {
	// Prepare a DUT for communication.
	dut := testbench.NewDUT(t)

	// This does:
	//   dut.Socket()
	//   dut.Bind()
	//   dut.Getsockname() to learn the new port number
	//   dut.Listen()
	listenFD, remotePort := dut.CreateListener(unix.SOCK_STREAM, unix.IPPROTO_TCP, 1)
	defer dut.Close(listenFD) // Tell the DUT to close the socket at the end of the test.

	// Monitor a new TCP connection with sniffer, injector, sequence number tracking,
	// and reasonable outgoing and incoming packet field default IPs, MACs, and port numbers.
	conn := testbench.NewTCPIPv4(t, dut, remotePort)

	// Perform a 3-way handshake: send SYN, expect SYNACK, send ACK.
	conn.Handshake()

	// Tell the DUT to accept the new connection on the listening socket.
	acceptFD := dut.Accept(listenFD)
	defer dut.Close(acceptFD) // Close the accepted socket at the end of the test, too.
}
```

## Other notes

* The time between receiving a SYN-ACK and replying with an ACK in `Handshake`
  is about 3ms. This is much slower than the native unix response, which is
  about 0.3ms. Packetdrill gets closer to 0.3ms. For tests where timing is
  crucial, packetdrill is faster and more precise.