[noise] Negotiation and 0-RTT

Thu Jul 6 00:24:07 PDT 2017

The thread on a "1-RTT protocol strawman" proposed a simple
negotiation scheme.  Version negotiation is tricky, and handling it
with Noise is non-obvious, so this scheme might be independently
useful.

I'll discuss it in more detail, and consider how it can be extended to
support 0-RTT encryption.

Negotiation is tricky because the client has to start doing crypto
(hashing transcript, sending an ephemeral, zero-RTT encryption) before
the server has made its final choice.

So in Noise terms, what is the client doing in this initial message?
Is it executing some pre-Noise protocol?  Executing a partial Noise
protocol?  Executing multiple Noise protocols in parallel?

The proposal here is different from all of those:  The client starts
off executing an "initial" Noise protocol, and if the server prefers a
different protocol, then a "fallback" handshake occurs.  This
leverages our existing concepts: we'd always be executing some Noise
protocol, and the initial and fallback protocols can be chained
together by putting the initial protocol's handshake hash into the
fallback protocol's prologue.

In more detail:
 - Client starts executing an "initial" Noise protocol, and indicates
this protocol via a version number in the initial message header.
 - Client sends any advertisements about new versions in the cleartext
payload of this initial message (details up to the application).
 - Server peeks at this version number and initializes to the same
Noise protocol, then reads the initial message's cleartext payload.
 - If the server wishes to continue with this Noise protocol it echoes
server_version = client_version.  Otherwise, the server returns
server_version != client_version and switches to a "fallback" Noise
protocol, whose prologue is the initial protocol's handshake hash
along with the version numbers.
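
A rough sketch of the server's side of the steps above, in Python.
Everything named here is an assumption for illustration: the function
name, advertisements being a plain list of version numbers (the format
is really up to the application), the server's preference list, and
the prologue layout (handshake hash followed by both versions as
4-byte big-endian integers).

```python
import struct


def server_negotiate(client_version, advertised, server_preferences,
                     initial_handshake_hash):
    """Pick server_version; return (server_version, fallback_prologue).

    fallback_prologue is None when the server continues with the
    initial protocol (server_version == client_version).
    """
    for version in server_preferences:  # most preferred first
        if version == client_version:
            # Echo the client's version: no fallback needed.
            return version, None
        if version in advertised:
            # Switch to a fallback protocol, chained to the initial one
            # by putting its handshake hash (plus both version numbers)
            # into the fallback prologue.  Layout is an assumption.
            prologue = initial_handshake_hash + struct.pack(
                ">II", client_version, version)
            return version, prologue
    raise ValueError("no mutually supported version")
```

For example, a server preferring version 3 over version 1 would answer
a version-1 client that advertises 3 with server_version = 3 and a
non-empty fallback prologue.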

Some subtleties:

 * This is a different use of "fallback" patterns than we've
considered before.  In particular, the initial protocol is likely to
be an older version that all servers support, whereas the "fallback"
protocol is likely to be a newer protocol that the server opts into.
Which means "fallback" is perhaps not a great name here, but we might
be stuck with it.

 * Another subtlety is that while the client can send arbitrary
negotiation data in the initial cleartext payload, the server doesn't
have a cleartext payload available to it, so the server's entire
negotiation response has to be compressed into a single version
number.  It could be argued this is limiting, but it avoids parsing,
keeps things simple, and maybe enforcing terseness is not a bad thing.
To make this feasible I suggested generous-sized version numbers:

-> Message1
<- Message2

Message1 and Message2 headers:
 - 4 bytes: version
 - 2 bytes: noise_message_len
 - " bytes: noise_message

All other message headers:
 - 2 bytes: noise_message_len
 - " bytes: noise_message
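
As a sketch of how this framing could be packed and parsed (Python,
with hypothetical helper names; the layout above doesn't specify byte
order, so big-endian is an assumption here):

```python
import struct

# Message1 / Message2: 4-byte version, 2-byte noise_message_len.
FIRST_HEADER = struct.Struct(">IH")
# All other messages: 2-byte noise_message_len only.
OTHER_HEADER = struct.Struct(">H")


def pack_first(version, noise_message):
    """Frame Message1 or Message2."""
    if len(noise_message) > 0xFFFF:
        raise ValueError("noise_message too long for a 2-byte length")
    return FIRST_HEADER.pack(version, len(noise_message)) + noise_message


def unpack_first(data):
    """Parse Message1 or Message2 into (version, noise_message)."""
    version, length = FIRST_HEADER.unpack_from(data)
    body = data[FIRST_HEADER.size:FIRST_HEADER.size + length]
    if len(body) != length:
        raise ValueError("truncated noise_message")
    return version, body


def pack_other(noise_message):
    """Frame any later message."""
    if len(noise_message) > 0xFFFF:
        raise ValueError("noise_message too long for a 2-byte length")
    return OTHER_HEADER.pack(len(noise_message)) + noise_message
```

The server can peek at the version with unpack_first before deciding
which Noise protocol to initialize.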

 * We'd like to bind the version numbers in case they have meaning to
an application beyond just selecting a Noise protocol.  But the only
mechanism in Noise for hashing arbitrary data is the prologue.  Thus
we bind client_version in the initial protocol, and if server_version
!= client_version we need to do a fallback protocol to bind the
server_version in the fallback prologue.  It could be argued that if
we had an "h" token for hashing arbitrary data we'd have more options.
For example if server_version indicates the same Noise protocol as
client_version but with different application semantics, we
could avoid the fallback and just bind the server_version by calling
MixHash(server_version).  But I'm not sure that gains much, and I'd
prefer building this out of the features we have.
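
To make the two binding options concrete, here is a small sketch of
the hash chaining involved.  It assumes SHA-256 as the hash and a
4-byte big-endian encoding for version numbers; the helper names are
mine, and the final MixHash(server_version) step is the hypothetical
"h"-token variant, not something Noise offers today.

```python
import hashlib

HASHLEN = 32  # SHA-256 output length


def initialize_h(protocol_name):
    # h starts as the protocol name, zero-padded to HASHLEN, or its
    # hash if the name is longer than HASHLEN.
    if len(protocol_name) <= HASHLEN:
        return protocol_name.ljust(HASHLEN, b"\x00")
    return hashlib.sha256(protocol_name).digest()


def mix_hash(h, data):
    # MixHash(data): h = HASH(h || data)
    return hashlib.sha256(h + data).digest()


def encode_version(version):
    # Assumed encoding: 4 bytes, big-endian.
    return version.to_bytes(4, "big")


# What we have today: bind client_version through the prologue.
h = initialize_h(b"Noise_XX_25519_AESGCM_SHA256")
h = mix_hash(h, encode_version(1))  # prologue = client_version

# Hypothetical "h" token: bind server_version mid-handshake without
# switching to a fallback protocol.
h_bound = mix_hash(h, encode_version(2))
```

Either way, the two sides end up with matching handshake hashes only
if they agree on the version numbers that were mixed in.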

Extending this to 0-RTT encryption requires solving a couple problems:
 (a) The 0-RTT message should probably bind all the negotiation data
from the cleartext initial message.
 (b) If the 0-RTT case isn't chosen, we should probably bind the 0-RTT
message in the server's response.

Due to (a) we probably can't fit the 0-RTT message inside the
client's cleartext initial payload.  Instead we can send it in
parallel, with the initial message's handshake hash used as prologue
for the 0-RTT message.

To support (b), we could handle all non-0-RTT cases as fallback,
including server_version = client_version, which lets us use the 0-RTT
message's handshake hash as prologue to the fallback protocol.  (An
alternative design would handle client_version = server_version as
before, without fallback, but buffering the initial cleartext payload
and deferring initializing the initial SymmetricState until the
client_version = server_version case is chosen and the 0-RTT message
is determined; but it seems better to minimize memory requirements).

So the end result is that the client could optionally send a 0-RTT
encrypted message alongside the cleartext initial message.  If a 0-RTT
message is not present, things are processed as before.  If a 0-RTT
message is present, then a particular server_version (!=
client_version) accepts the 0-RTT protocol.  Any other server_version
(including = client_version) is handled as a fallback, and includes
the handshake_hash from initial_message as well as zerortt_message in
the fallback prologue.
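
The resulting case analysis could be sketched like this (Python; the
function name, the string labels, and the zerortt_accept_version
parameter standing for "the particular server_version that accepts
0-RTT" are all made up for illustration):

```python
def classify_response(client_version, server_version, zerortt_sent,
                      zerortt_accept_version=None):
    """Decide how the handshake proceeds after the server's reply."""
    if not zerortt_sent:
        # No 0-RTT message: processed as before.
        if server_version == client_version:
            return "continue"
        return "fallback"
    if zerortt_accept_version == client_version:
        raise ValueError("0-RTT acceptance needs its own version number")
    if server_version == zerortt_accept_version:
        return "zerortt"
    # Any other server_version, including one equal to client_version,
    # is a fallback that binds the initial and 0-RTT messages.
    return "fallback"
```

Note that once a 0-RTT message is sent, echoing client_version no
longer means "continue"; it is just another fallback.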

Message1 header:
 - 4 bytes: version
 - 2 bytes: zerortt_message_len
 - 2 bytes: initial_message_len
 - " bytes: initial_message
 - " bytes: zerortt_message

Message2 header:
 - 4 bytes: version
 - 2 bytes: noise_message_len
 - " bytes: noise_message

All other message headers:
 - 2 bytes: noise_message_len
 - " bytes: noise_message
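
A sketch of packing and parsing the extended Message1 (Python,
hypothetical helper names).  Two assumptions not settled above:
big-endian byte order, and zerortt_message_len = 0 meaning "no 0-RTT
message present".

```python
import struct

# 4-byte version, 2-byte zerortt_message_len, 2-byte initial_message_len.
MESSAGE1_HEADER = struct.Struct(">IHH")


def pack_message1(version, initial_message, zerortt_message=b""):
    """Frame Message1; the 0-RTT message is optional."""
    if len(initial_message) > 0xFFFF or len(zerortt_message) > 0xFFFF:
        raise ValueError("message too long for a 2-byte length")
    header = MESSAGE1_HEADER.pack(
        version, len(zerortt_message), len(initial_message))
    # Body order follows the layout: initial_message, then zerortt_message.
    return header + initial_message + zerortt_message


def unpack_message1(data):
    """Parse Message1 into (version, initial_message, zerortt_message)."""
    version, zlen, ilen = MESSAGE1_HEADER.unpack_from(data)
    start = MESSAGE1_HEADER.size
    initial = data[start:start + ilen]
    zerortt = data[start + ilen:start + ilen + zlen]
    if len(initial) != ilen or len(zerortt) != zlen:
        raise ValueError("truncated Message1")
    # Assumption: an empty 0-RTT field means none was sent.
    return version, initial, (zerortt or None)
```

Since both lengths come before both bodies, the server can read the
cleartext initial message first and decide afterwards whether the
0-RTT message is worth decrypting.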

One downside is that we'd have to add support for this into the Simple
1-RTT protocol now, if we want to be able to upgrade to 0-RTT support
later.

Is this making sense to people?  Anyone see better options?

Trevor