[messaging] Replacing group signatures with HMAC in Pond.

Mon May 26 17:28:36 PDT 2014

I've had too little time so far this year to work on Pond. Today is really
the first day in a quite a while that I've had to think about these things
and I'm afraid that I'm back to work tomorrow.

Since Pond isn't so small a code base these days, I can't quite pick it on a
whim and progress it. So I'm writing up extended TODOs in the hope that, by
caching my thought processes in a more concrete form, I can reduce my warmup
time and better use the small fragments of time that I have.

Replacing group signatures with HMAC:

Pond's very low bandwidth means that it's very vulnerable to spam and
flooding.  It seeks to solve this by requiring authentication before messages
can be sent to a recipient. This is implemented with BBS group signatures:

During the introduction process, each of the two parties in a contact
relationship end up with a member private key for the other's group. The home
server of a given user holds the group public key and checks the signature on
all attempted deliveries. The recipient user can use the group private key
to disclose the identity of any sender and so identify and revoke any
misbehaving contacts. The group signature also allows the recipient to
discover if their home server is accepting messages that it shouldn't be
(i.e. spamming the user).

There are a couple of drawbacks with this that were obvious from the start:

The BBS scheme is complex and uncommon. There are no canonical
implementations.  Pond uses a custom pairings and BBS implementation,
although can also be linked against dclxvi, a C/asm pairings implementation
by Naehrig, Niederhagen and Schwabe.

The BBS scheme is quite slow (although a fair bit of the blame falls on my
implementation). Since servers have to perform verification operations on
untrusted inputs from the network, this leads to a DDoS weakness.

The biggest drawback only became clear after some use:

Since Pond emphasises erasure, keeping member keys around after the user has
"deleted" a contact is unacceptable because it's evidence of communication
should the other party be compromised.

Thus, when removing a contact, their group member key needs to be revoked. If
member keys are left unaccounted for and untracked then any future abuse is
unstoppable since the member key is an input to the revocation calculation.

A "verifier local" revocation system would allow revocations to only involve
the revoking user and their home server, but would also allow the home server
to replay previous messages and link all those that were from the, now
revoked, contact. (Because the signatures no longer validate.) Thus a global
revocation system has to be used that involves all member key holders
updating their keys.

However, the rate of revocation in practice is significantly higher than I
guessed during design. People setup throwaway accounts for specific trips,
replace state files and rerun the introduction protocol often, etc. This has
introduced a lot of complexity and delays. Messages can bounce because of a
group signature failure from of a pending revocation. This has to be
handled and the message resigned. Because the number of revocation is so
high, they have had to be batched by the server to avoid multiple bounces for
the same message.

There is still an outstanding bug due to this in the code: if the
introduction protocol is run while revocations are pending then the new
contact gets a member key for the current group, not the group that the
server knows about.  Any messages sent by the contact before the revocation
updates reach the server are rejected. Solving this would involve even more
complexity: the client would have to keep a history of revocations until they
have been accepted by the server and would have to run the introduction
protocol with a historic group.

The HMAC design:

This is substantially thanks to Trevor, who wrote it up in
https://moderncrypto.org/mail-archive/messaging/2014/000262.html

Rather than using group signatures, a user and their home server share a HMAC
key, k. During the introduction protocol, a number of pairs are exchanged.
Each pair contains a private key, x, and HMAC(k, y), where y is the public
key for x.

Delivery attempts contain (msg, y, HMAC(k,y), sig(msg, x)). The server can
check the HMAC because it holds the HMAC key. The recipient can find who
sent a message because they know who a given public key was given to. A
sender's pool of pairs is refreshed in the normal exchange of messages.

Once accepted, a server records HMAC(k, y) and rejects it in the future.

Now a revocation involves sending a list of HMAC(k, y) values to the server
to be "crossed off". This storage grows without bound, but very slowly. All
the problems with messages bouncing and needing to be resigned are solved. So
is the problem of having pending revocations when performing the introduction
protocol.

Additionally, it's very cheap for the server to verify. It also handles the
problem of replays which, at the moment, is a case where the recipient can't
differentiate between contact and home server misbehaviour.

The new design also eliminates the ability for all contacts to observe
revocations, which is nice. Ambiguously, it stops a contact from sending
messages without limit because they will run out of pairs. There must be a
return flow of messages to refresh tokens.

Some details need to be handled:

It's possible for a delivery to be interrupted after the server has accepted
the message, but before the sender has received confirmation of that. The
sender will retry the transmission and, in the new scheme, will be rejected
by the server. I think that the server needs to keep two lists of "used"
HMACs: one for actually used HMACs and one for revoked HMACs. It's important
that a home server not mistakenly tell a contact that it has been revoked.

A client should know how many outstanding pairs each of its contacts has and
thus be able to maintain exactly the level it wishes. However, if messages
get lost (e.g. because a home server has a disk failure) then that estimate
may become permanently biased.

We have to be careful that any adjustment mechanism doesn't allow a contact
to slowly build up a large number of pairs. I think the best answer here
might be to rework the queues in the current code to ensure that messages are
not reordered. An out-of-order message is then a strong signal that a message
has been lost. A recipient can send the HMAC value for a "lost" message to
the server to be struck off (on the non-revoked list) and then the normal
refill process will fix the problem.

If a sender's entire pair supply is lost, they are stuck. They can't send
another message to prompt a refill. However, one assumes that this extreme
case only happens if a home server is being malicious and a malicious home
server can already drop messages.

Cheers

AGL