[noise] Simple 1-RTT protocol strawman

Sat Jun 24 22:12:19 PDT 2017

Hi all,

We've been tossing around ideas for a user-friendly 1-RTT protocol
based on Noise_XX.  I think the goals are:

 * An easy-to-use starting point for working with Noise.

 * An alternative to 1-RTT protocols like TLS and SSH that is simpler
and more efficient.

I think the unresolved issues are padding, versioning/negotiation, and
the name.  I'll suggest some resolutions, leading to a "strawman"
proposal as a basis for further discussion.

Padding
========
We've discussed whether padding should be a responsibility of the
application or this protocol.

I'll propose we handle it in the protocol, because:

 * Having an API will encourage application designers to use it.

 * If padding is required to be zero-filled by sender, and ignored by
recipient, then it becomes a reserved area which could be useful.
Suppose an application wants to transmit out-of-band data inside of
payloads (e.g. requesting rekey, or sending keep-alives).  If the
application has "painted itself into a corner" and didn't use an
extensible data format for its payloads (like JSON, Protobufs, etc),
then data can be stuffed into the "padding".

 * If we allow padding in the ClientHello's cleartext payload, then
the previous bullet could be particularly useful for
versioning/negotiation (see next section).

So every payload's plaintext - including the ClientHello cleartext -
could contain a "body" followed by padding:
 - 2 bytes: body_len
 - body_len bytes: body
 - ? bytes: padding
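
As a rough illustration of this layout, here is a minimal Python sketch of encoding and decoding a payload.  The function names are hypothetical, and the big-endian byte order for body_len is an assumption (the proposal above doesn't fix a byte order).  The sender zero-fills the padding and the recipient ignores it:

```python
import struct

def encode_payload(body: bytes, padding_len: int = 0) -> bytes:
    """body_len (2 bytes, big-endian assumed) || body || zero-filled padding."""
    if len(body) > 0xFFFF:
        raise ValueError("body too long for a 2-byte length field")
    return struct.pack(">H", len(body)) + body + b"\x00" * padding_len

def decode_payload(payload: bytes) -> bytes:
    """Return the body; anything after it is padding and is ignored."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the body_len field")
    (body_len,) = struct.unpack_from(">H", payload)
    if 2 + body_len > len(payload):
        raise ValueError("body_len exceeds payload size")
    return payload[2:2 + body_len]
```

Because the decoder never looks past the body, the padding bytes behave as the reserved area described above.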

Versioning and negotiation
===========================
If clients only support a single version, then versioning is easy:
clients just send a version indicator, and servers accept or reject.

Things get complicated when clients support multiple versions, or
multiple orthogonal options:

 * Clients have to advertise what they support.  This could be as
simple as (min_version, max_version), or as complicated as a list of
TLS-style extensions.

 * In addition to advertising what they support, clients might want to
send associated data.  Examples are:
   - Different ephemerals if the client is advertising different curves
   - 0-RTT encrypted payload(s)
   - The server name the client is contacting (like TLS SNI)

 * Servers have to indicate which client options they are accepting.
Again, this could be as simple as a single version field, or as
complicated as a list of extensions.

 * The simple protocol could choose not to have a special place for
advertisements / associated data, so the application has to transmit
them inside the initial cleartext payload.

So there are a lot of options, for example:

SINGLE VERSION
 C->S: version

SERVER CHOICE
 C->S: client_version
 C<-S: server_version

SERVER CHOICE WITH CLIENT RESERVED AREA
 C->S: client_version, client_reserved
 C<-S: server_version

CLIENT RANGE WITH SERVER CHOICE
 C->S: client_version, client_max_version
 C<-S: server_version

MUTUAL EXTENSIONS
 C->S: list of client_extensions
 C<-S: list of server_extensions

I think we can rule out SINGLE VERSION, because we want the ability
for clients to advertise and servers to choose.

I would also rule out MUTUAL EXTENSIONS.  I think that ends up in one
of two different but undesirable places: either a centrally-managed
set of extensions, like TLS, making the protocol difficult to evolve
and customize; or extension lists exposed through an API, which is
probably more parsing and API complexity than we want.

I'll argue for the middle option: SERVER CHOICE WITH CLIENT RESERVED
AREA.  This allows the client to easily send its minimum version in
client_version, and it can advertise and send associated data in the
cleartext payload.  If the application designer forgets to make the
cleartext payload extensible, they still can advertise via some sort
of client "reserved area".

Given this, what should versions and the reserved area look like?  I'd
suggest a generous 32-bit size for versions, and giving the
application full control over its contents.  The server_version might
have to encode a large number of choices, and it could be useful for
the application to subdivide client_version or server_version, e.g.
client_version = (max_version, min_version), or server_version = (dh,
cipher, hash).
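
To make the subdivision idea concrete, here is a small Python sketch.  The field widths (two 16-bit halves for the version range; one byte each for dh, cipher, and hash plus one unused byte) and the big-endian order are purely illustrative assumptions, since the application would control the contents:

```python
import struct

def pack_version_range(max_version: int, min_version: int) -> bytes:
    """client_version = (max_version, min_version), 16 bits each (assumed split)."""
    return struct.pack(">HH", max_version, min_version)

def unpack_version_range(client_version: bytes) -> tuple:
    """Inverse of pack_version_range; returns (max_version, min_version)."""
    return struct.unpack(">HH", client_version)

def pack_suite(dh: int, cipher: int, hash_id: int) -> bytes:
    """server_version = (dh, cipher, hash), one byte each plus a zero spare byte."""
    return struct.pack(">BBBB", dh, cipher, hash_id, 0)

def unpack_suite(server_version: bytes) -> tuple:
    """Inverse of pack_suite; returns (dh, cipher, hash_id)."""
    dh, cipher, hash_id, _spare = struct.unpack(">BBBB", server_version)
    return dh, cipher, hash_id
```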

The padding space in the cleartext payload could be used for the
client_reserved area.

It's a little weird to overload an encrypted-padding mechanism as
cleartext-reserved space.  But one problem with reserved / extension
mechanisms is that they don't get tested so don't work when you need
them.  The padding mechanism is more likely to be tested and
exercised.

Anyways, proposed message headers:

ClientHello and ServerAuth:
 - 4 bytes: version
 - 2 bytes: noise_message_len
 - noise_message_len bytes: noise_message

Other messages
 - 2 bytes: noise_message_len
 - noise_message_len bytes: noise_message
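
A minimal Python sketch of this framing follows.  The helper names are hypothetical and big-endian lengths are an assumption.  Reading the version without touching the Noise message is the kind of thing a "peek" at ServerAuth would do:

```python
import struct

def frame_handshake(version: bytes, noise_message: bytes) -> bytes:
    """ClientHello / ServerAuth: version || noise_message_len || noise_message."""
    if len(version) != 4:
        raise ValueError("version must be exactly 4 bytes")
    if len(noise_message) > 0xFFFF:
        raise ValueError("noise_message too long")
    return version + struct.pack(">H", len(noise_message)) + noise_message

def frame_other(noise_message: bytes) -> bytes:
    """All other messages: noise_message_len || noise_message."""
    if len(noise_message) > 0xFFFF:
        raise ValueError("noise_message too long")
    return struct.pack(">H", len(noise_message)) + noise_message

def peek_version(message: bytes) -> bytes:
    """Return the 4-byte version without processing the Noise message."""
    if len(message) < 4:
        raise ValueError("message too short")
    return message[:4]

def parse_handshake(message: bytes) -> tuple:
    """Split a ClientHello / ServerAuth into (version, noise_message)."""
    if len(message) < 6:
        raise ValueError("message too short")
    (n,) = struct.unpack_from(">H", message, 4)
    if len(message) != 6 + n:
        raise ValueError("noise_message_len does not match message size")
    return message[:4], message[6:]
```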

Versioning and fallback
========================
Suppose the client is initialized with Noise protocol X, then the
server chooses Noise protocol Y.  I think we'll want to handle this as
XXfallback, where the client uses the handshake hash from X as a
prologue for new protocol Y.  (The alternative is that the client
re-initializes with protocol Y and simulates sending the same initial
message, but that requires buffering the whole initial message,
instead of just the handshake hash).

So we'll have to reflect that in the API (see later).

Naming
=======
NoiseSocket, NoiseTransport, and NoiseLink have been proposed.

I think NoiseSocket implies a byte-stream API, but we should probably
have a message API, so that the caller can control buffering.  Also,
it's a better name for an API object, rather than a protocol.

NoiseTransport clashes with the Noise "transport phase" and "transport
messages".

We probably can't use names like "tubes" or "conduits", because
they're too close to "Pipe".

I'm liking "NoiseLink" because "link" is a small and simple word for a
small and simple protocol.  But Alexey thinks it sounds like a techno
group.

I'm still leaning towards it, but any other ideas?

Protocol
=========

Pulling this together, we have:

Noise_XX:
 -> e
 <- e, ee, s, es
 -> s, se

Message names:
 -> ClientHello
 <- ServerAuth
 -> ClientAuth

ClientHello and ServerAuth headers:
 - 4 bytes: version
 - 2 bytes: noise_message_len
 - noise_message_len bytes: noise_message

Other message headers:
 - 2 bytes: noise_message_len
 - noise_message_len bytes: noise_message

All payloads:
 - 2 bytes: body_len
 - body_len bytes: body
 - ? bytes: padding

Prologue is different for the initial versus fallback case:

"NoiseLinkInit" || client_version

"NoiseLinkReinit" || client_version || server_version ||
preceding_handshake_hash

We'll use the second (fallback) case whenever server_version != client_version.
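
The two prologues could be built like this in Python.  This is a sketch that assumes the labels are plain ASCII with no terminator and that the versions are the same 4-byte values that appear on the wire; the function names are hypothetical:

```python
def initial_prologue(client_version: bytes) -> bytes:
    """Prologue for the initial case."""
    return b"NoiseLinkInit" + client_version

def fallback_prologue(client_version: bytes, server_version: bytes,
                      preceding_handshake_hash: bytes) -> bytes:
    """Prologue for the fallback (reinitialized) case."""
    return (b"NoiseLinkReinit" + client_version + server_version
            + preceding_handshake_hash)

def choose_prologue(client_version: bytes, server_version: bytes,
                    preceding_handshake_hash: bytes) -> bytes:
    """Use the fallback prologue whenever server_version != client_version."""
    if server_version == client_version:
        return initial_prologue(client_version)
    return fallback_prologue(client_version, server_version,
                             preceding_handshake_hash)
```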

The recommended API has a "session object" with the following methods.
Note that the reserved/padding contents are not accessible through the
API.  We want to discourage their use, except for emergencies.

The 'padded_len' parameter specifies the simulated length that the
plaintext body will be padded to before encryption, so 65517 is the
max value: 65535 (noise_message_len) - 16 (for authentication tag) - 2
(for body_len)
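
The arithmetic can be checked with a short Python sketch.  It assumes padded_len counts body plus padding (excluding the 2-byte length field and the tag), and that a padded_len smaller than the body simply means no padding; both are my reading of the description rather than settled behavior:

```python
MAX_NOISE_MESSAGE_LEN = 0xFFFF   # largest value of noise_message_len
TAG_LEN = 16                     # authentication tag
LEN_FIELD = 2                    # 2-byte length field inside the payload

MAX_PADDED_LEN = MAX_NOISE_MESSAGE_LEN - TAG_LEN - LEN_FIELD

def padding_needed(body_len: int, padded_len: int = 0) -> int:
    """Number of zero bytes to append so body + padding reaches padded_len."""
    if body_len > MAX_PADDED_LEN or padded_len > MAX_PADDED_LEN:
        raise ValueError("length exceeds the maximum padded length")
    # padded_len == 0 (the default) or padded_len <= body_len: no padding.
    return max(0, padded_len - body_len)
```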

Client functions
-----------------
The client calls these in sequence.  If the client only supports a
single version, it skips PeekServerAuth and ReinitializeClient.  If it
supports multiple versions but PeekServerAuth returns a server_version
== client_version it can skip ReinitializeClient.  The client can use
a specified ephemeral key pair in ReinitializeClient, e.g. if the
client sends multiple different ephemerals in the cleartext body.

InitializeClient
 INPUT: client_version, dh, cipher, hash
 OUTPUT: session object

WriteClientHello
 INPUT: [cleartext_body]
   - cleartext_body is zero-length if omitted
 OUTPUT: client_hello_message

PeekServerAuth
 INPUT: server_auth_message
 OUTPUT: server_version

ReinitializeClient
 INPUT: server_version, dh, cipher, hash[, new_client_ephemeral_key_pair]
   - If server_version != client_version, fallback is used
 OUTPUT: updated session object

ReadServerAuth
 INPUT: server_auth_message
   - Errors if server_version does not match session version
 OUTPUT: server_public_key, server_auth_body

WriteClientAuth
 INPUT: [client_key_pair], [client_auth_body], [padded_len]
   - client_key_pair is randomly generated (dummy) if omitted
   - client_auth_body is zero-length if omitted
   - padded_len is zero (no padding) if omitted
 OUTPUT: client_auth_message
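
The client-side call order described above can be summarized in a few lines of Python.  This only models which functions get called, not the functions themselves; client_call_sequence and its parameters are hypothetical names for illustration:

```python
def client_call_sequence(multi_version: bool, client_version: int,
                         server_version: int) -> list:
    """Return the names of the client functions in the order they are called."""
    steps = ["InitializeClient", "WriteClientHello"]
    if multi_version:
        # Only a multi-version client needs to look at the server's choice.
        steps.append("PeekServerAuth")
        if server_version != client_version:
            # The server picked something else, so fallback is used.
            steps.append("ReinitializeClient")
    steps += ["ReadServerAuth", "WriteClientAuth"]
    return steps
```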

Server functions
-----------------
The server calls these in sequence.  If the server only supports a
single version, or ReadClientHello returns client_version ==
server_version, then it skips ReinitializeServer.  The server can use
a specified client ephemeral public key in ReinitializeServer, e.g. if
the client sends multiple different ephemerals in the cleartext_body.

InitializeServer
 INPUT: server_version, dh, cipher, hash
 OUTPUT: session object

ReadClientHello
 INPUT: client_hello_message
 OUTPUT: client_version, cleartext_body

ReinitializeServer
 INPUT: server_version, dh, cipher, hash[, new_client_ephemeral_public_key]
   - If server_version != client_version, fallback is used
 OUTPUT: updated session object

WriteServerAuth
 INPUT: [server_key_pair], [server_auth_body], [padded_len]
   - Errors if server_version != client_version and no re-initialization
   - server_key_pair is randomly generated (dummy) if omitted
   - server_auth_body is zero-length if omitted
   - padded_len is zero (no padding) if omitted
 OUTPUT: server_auth_message

ReadClientAuth
 INPUT: client_auth_message
 OUTPUT: client_public_key, client_auth_body

Functions for both
-------------------
After WriteClientAuth / ReadClientAuth, both parties can call Write and Read:

Write
 INPUT: transport_body[, padded_len]
   - padded_len is zero (no padding) if omitted
 OUTPUT: transport_message

Read
 INPUT: transport_message
 OUTPUT: transport_body

Thoughts?

Looking at the API, it's still kind of low-level.  But perhaps it's
easier to wrap into a high-level API than raw Noise?

Trevor