Within gVisor, plumb new socket options to netstack.
Within netstack, fix GetSockOpt and SetSockOpt return value logic.
PiperOrigin-RevId: 226532229
Change-Id: If40734e119eed633335f40b4c26facbebc791c74
Previously, TCP_NODELAY was always enabled and we would lie about it being
configurable. TCP_NODELAY is now disabled by default (to match Linux) in the
socket layer so that non-gVisor users don't automatically start using this
questionable optimization.
PiperOrigin-RevId: 221368472
Change-Id: Ib0240f66d94455081f4e0ca94f09d9338b2c1356
This change also adds extensive testing to the p9 package via mocks. The sanity
checks and type checks are moved from the gofer into the core package, where
they can be more easily validated.
PiperOrigin-RevId: 218296768
Change-Id: I4fc3c326e7bf1e0e140a454cbacbcc6fd617ab55
* Integrate recvMsg and sendMsg functions into Recv and Send respectively as
they are no longer shared.
* Clean up partial read/write error handling code.
* Re-order code to make sense given that there is no longer a host.endpoint
type.
PiperOrigin-RevId: 217255072
Change-Id: Ib43fe9286452f813b8309d969be11f5fa40694cd
host.endpoint contained duplicated logic from the sockerpair implementation and
host.ConnectedEndpoint. Remove host.endpoint in favor of a
host.ConnectedEndpoint wrapped in a socketpair end.
PiperOrigin-RevId: 217240096
Change-Id: I4a3d51e3fe82bdf30e2d0152458b8499ab4c987c
Currently, in the face of FileMem fragmentation and a large sendmsg or
recvmsg call, host sockets may pass > 1024 iovecs to the host, which
will immediately cause the host to return EMSGSIZE.
When we detect this case, use a single intermediate buffer to pass to
the kernel, copying to/from the src/dst buffer.
To avoid creating unbounded intermediate buffers, enforce message size
checks and truncation w.r.t. the send buffer size. The same
functionality is added to netstack unix sockets for feature parity.
PiperOrigin-RevId: 216590198
Change-Id: I719a32e71c7b1098d5097f35e6daf7dd5190eff7
Previously, if address resolution for UDP or Ping sockets required sending
packets using Write in Transport layer, Resolve would return ErrWouldBlock
and Write would return ErrNoLinkAddress. Meanwhile startAddressResolution
would run in background. Further calls to Write using same address would also
return ErrNoLinkAddress until resolution has been completed successfully.
Since Write is not allowed to block and System Calls need to be
interruptible in System Call layer, the caller to Write is responsible for
blocking upon return of ErrWouldBlock.
Now, when startAddressResolution is called a notification channel for
the completion of the address resolution is returned.
The channel will traverse up to the calling function of Write as well as
ErrNoLinkAddress. Once address resolution is complete (success or not) the
channel is closed. The caller would call Write again to send packets and
check if address resolution was compeleted successfully or not.
Fixesgoogle/gvisor#5
Change-Id: Idafaf31982bee1915ca084da39ae7bd468cebd93
PiperOrigin-RevId: 214962200
From RFC7323#Section-4
The [RFC6298] RTT estimator has weighting factors, alpha and beta, based on an
implicit assumption that at most one RTTM will be sampled per RTT. When
multiple RTTMs per RTT are available to update the RTT estimator, an
implementation SHOULD try to adhere to the spirit of the history specified in
[RFC6298]. An implementation suggestion is detailed in Appendix G.
From RFC7323#appendix-G
Appendix G. RTO Calculation Modification
Taking multiple RTT samples per window would shorten the history calculated
by the RTO mechanism in [RFC6298], and the below algorithm aims to maintain a
similar history as originally intended by [RFC6298].
It is roughly known how many samples a congestion window worth of data will
yield, not accounting for ACK compression, and ACK losses. Such events will
result in more history of the path being reflected in the final value for
RTO, and are uncritical. This modification will ensure that a similar amount
of time is taken into account for the RTO estimation, regardless of how many
samples are taken per window:
ExpectedSamples = ceiling(FlightSize / (SMSS * 2))
alpha' = alpha / ExpectedSamples
beta' = beta / ExpectedSamples
Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs".
Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and
beta' instead:
RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'|
SRTT <- (1 - alpha') * SRTT + alpha' * R'
(for each sample R')
PiperOrigin-RevId: 213644795
Change-Id: I52278b703540408938a8edb8c38be97b37f4a10e
Makes it possible to avoid copying or allocating in cases where DeliverNetworkPacket (rx)
needs to turn around and call WritePacket (tx) with its VectorisedView.
Also removes the restriction on having VectorisedViews with multiple views in the write path.
PiperOrigin-RevId: 211728717
Change-Id: Ie03a65ecb4e28bd15ebdb9c69f05eced18fdfcff
Furthermore, allow for the specification of an ElementMapper. This allows a
single "Element" type to exist on multiple inline lists, and work without
having to embed the entry type.
This is a requisite change for supporting a per-Inode list of Dirents.
PiperOrigin-RevId: 211467497
Change-Id: If2768999b43e03fdaecf8ed15f435fe37518d163
This CL does NDP link-address discovery for IPv6.
It includes several small changes necessary to get linux to talk to
this implementation. In particular, a hop limit of 255 is necessary
for ICMPv6.
PiperOrigin-RevId: 211103930
Change-Id: If25370ab84c6b1decfb15de917f3b0020f2c4e0e
Data race is:
Read:
(*connectionlessEndpoint).UnidirectionalConnect:
writeQueue: e.receiver.(*queueReceiver).readQueue,
Write:
(*connectionlessEndpoint).Close:
e.receiver = nil
The problem is that (*connectionlessEndpoint).UnidirectionalConnect assumed
that baseEndpoint.receiver is immutable which is explicitly not the case.
Fixing this required two changes:
1. Add synchronization around access of baseEndpoint.receiver in
(*connectionlessEndpoint).UnidirectionalConnect.
2. Check for baseEndpoint.receiver being nil in
(*connectionlessEndpoint).UnidirectionalConnect.
PiperOrigin-RevId: 207984402
Change-Id: Icddeeb43805e777fa3ef874329fa704891d14181
This CL implements CUBIC as described in https://tools.ietf.org/html/rfc8312.
PiperOrigin-RevId: 207353142
Change-Id: I329cbf3277f91127e99e488f07d906f6779c6603