[messaging] Recognizing senders of metadata-hidden messages

Mon Jul 27 23:07:16 PDT 2015

On 7/24/15 1:09 AM, Jeff Burdges wrote:

> I believe the distinction in Vuvuzela between dialing contacts you
> haven't contacted recently, and conversations as bursts of messages
> should be relevant to token distribution.
> 
> I think Vuvuzela used a unique mailbox for each conversation, making
> trial decryption far more viable in that case.  I'd imagine one could
> evolve the conversation mailbox address with the ratchet state if
> desired.  

Ah, interesting, so there could be one protocol for creating a "room" or
a "mailbox", and a separate one for maintaining the tokens necessary for
a given room. There are probably fewer rooms than potential senders, so
trial decryption is easier. I'm guessing it'd feel more "connection
oriented": there's a distinct UI and/or network thing that happens when
you establish a new conversation.

Since Trevor suggested that we might combine the delivery token with the
ratchet, I went ahead and wrote up what an all-token protocol would look
like, where each (single-use) token is used both for mailbox delivery
and recipient key selection:

 http://www.lothar.com/blog/53-petmail-delivery/

It assumes Curve25519 keypairs are cheap, and tries to minimize each
side's obligation to model the other's ratchet state ("did they see my
new key yet? can I get rid of my old privkey yet?"). The basic flow
would be:

* Recipient maintains a set of a few thousand Curve25519 keypairs. For
  each one, they derive privkey -> pubkey -> HMAC key -> HKID (mostly by
  hashing), and remember a table that maps HKID->(senderid, privkey).
  Each is single-use, and the recipient creates more to replace them as
  they get used up.

* Recipient gives the HMAC keys to the Mailbox server, keeping it
  up-to-date as new ones are created. Mailbox maintains a table mapping
  HKID->(HMAC key, recipient).

* Recipient gives some pubkeys to each sender, keeping them stocked with
  maybe 20 at a time.

* Each time the sender creates a message, they derive the HMAC key and
  HKID, encrypt their message (with an ephemeral Curve25519 keypair,
  attaching the ephemeral pubkey to the message), append the HMAC tag,
  prepend the HKID, then send the result to the mailbox server. The
  mailbox looks up the HKID to get the HMAC key, validates the HMAC, and
  enqueues the whole message to the recipient. The recipient fetches the
  queued messages, looks up the HKID to find the sender and privkey,
  derives and validates the HMAC, then decrypts the message and destroys
  the (HKID, privkey) pair.
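
To make the flow above concrete, here's a rough sketch of the token
derivation and the mailbox-side check. It is not the Petmail code, and
several things are assumptions: the pubkey is just 32 random bytes
standing in for a real Curve25519 pubkey, the ciphertext is opaque (no
actual encryption happens here), the hash labels and the 32-byte field
sizes are made up, the HMAC is computed over the ciphertext only, and
the mailbox drops a token once it has been used.

```python
import hashlib
import hmac
import os

def hkid_from_key(hmac_key):
    # HMAC key -> HKID by hashing (the label is an assumption)
    return hashlib.sha256(b"hkid:" + hmac_key).digest()

def derive(pubkey):
    # pubkey -> HMAC key -> HKID, each step by hashing
    hmac_key = hashlib.sha256(b"hmac-key:" + pubkey).digest()
    return hmac_key, hkid_from_key(hmac_key)

def sender_wrap(pubkey, ciphertext):
    # Sender: append the HMAC tag, prepend the HKID.
    hmac_key, hkid = derive(pubkey)
    tag = hmac.new(hmac_key, ciphertext, hashlib.sha256).digest()
    return hkid + ciphertext + tag

class Mailbox:
    def __init__(self):
        self.tokens = {}   # HKID -> (HMAC key, recipient)
        self.queues = {}   # recipient -> list of queued messages

    def register(self, recipient, hmac_key):
        # Recipient hands over HMAC keys; the mailbox never sees privkeys.
        self.tokens[hkid_from_key(hmac_key)] = (hmac_key, recipient)

    def deliver(self, wire):
        hkid, body, tag = wire[:32], wire[32:-32], wire[-32:]
        entry = self.tokens.get(hkid)
        if entry is None:
            return False                      # unknown or already-used token
        hmac_key, recipient = entry
        expected = hmac.new(hmac_key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return False                      # bad HMAC, drop the message
        del self.tokens[hkid]                 # single-use (assumed enforced here)
        self.queues.setdefault(recipient, []).append(wire)
        return True

# Usage: one token, one message.
pubkey = os.urandom(32)                       # stand-in for a Curve25519 pubkey
hmac_key, hkid = derive(pubkey)
mailbox = Mailbox()
mailbox.register("recipient", hmac_key)
wire = sender_wrap(pubkey, b"opaque ciphertext + ephemeral pubkey")
accepted = mailbox.deliver(wire)
replayed = mailbox.deliver(wire)
```

The recipient side would do the same HKID lookup against its own
HKID->(senderid, privkey) table, re-check the HMAC, decrypt, and then
delete the entry.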

The mailbox server has to track a lot of tokens, but the actual code is
pretty simple, and the forward-security window is ideal (one key per
message, which is destroyed as soon as the message is received). There
are a lot of things you could tweak to reduce the admin overhead (maybe
only give a few tokens to idle senders, then ramp it up once the
conversation gets interesting, etc). On the downside, if your recipient
is offline for long enough, you lose the ability to send them more
messages, sort of like a voicemail box filling up and rejecting calls.
I'm not sure this is actually a bad thing, but it certainly feels weird.
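
The "voicemail box is full" behavior falls out of the sender holding a
finite stock of single-use pubkeys. A minimal sketch of that sender-side
stock, assuming a hypothetical SenderTokens helper and the "maybe 20"
target mentioned above (the names and the restock policy are made up for
illustration):

```python
from collections import deque

class SenderTokens:
    """Sender-side stock of single-use recipient pubkeys (hypothetical)."""

    def __init__(self, target=20):
        self.target = target
        self.pool = deque()

    def restock(self, fresh_pubkeys):
        # Recipient is online and hands over new pubkeys, up to the target.
        for pk in fresh_pubkeys:
            if len(self.pool) >= self.target:
                break
            self.pool.append(pk)

    def next_token(self):
        # One pubkey per message; None means the stock is exhausted
        # and nothing more can be sent until the recipient restocks.
        return self.pool.popleft() if self.pool else None
```

With the recipient offline, restock() never gets called, so after
`target` messages next_token() returns None and the sender has to wait.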

I think I'm going to implement this in Petmail and see how it feels once
I've got running code.

Oh, I've got another blog post about flavors of anonymity/linkability
too, in case anyone has feedback:

 http://www.lothar.com/blog/52-linkability/

cheers,
 -Brian