[messaging] DP5 - attempt at summary
Trevor Perrin
trevp at trevp.net
Fri Jun 13 01:06:54 PDT 2014
Great paper by Borisov, Danezis, and Goldberg:
http://cacr.uwaterloo.ca/techreports/2014/cacr2014-10.pdf
I'm going to attempt a summary, it's not very good but it might help
read the paper to know where it's going:
Imagine you had an unobservable chat system; maybe it runs over Tor so
an observer can't tell who you're talking to. You'd also like a
"presence system" that informs you when your friends are available to
chat. But you don't want to reveal to the presence system your list
of friends, or when you're online.
At a high level, this could be done if:
- Each pair of friends share symmetric keys.
- When users are online, they anonymously publish encrypted status
messages to each of their friends through some "Private Information
Retrieval" (PIR) system, tagging these messages with an ID derived
from the pairwise symmetric key and the current time period (or
"epoch").
- Users frequently query the PIR system to retrieve the presence
messages being sent from their friends. By definition of "PIR",
queries don't reveal which element is being queried, thus preserving
privacy.
DP5 show how to make this real by:
(1) Creating a practical PIR system for lookup of messages based on
their ID (i.e. a key:value store).
(2) Reducing traffic volume by using pairwise-encrypted messages to
distribute signature-verification keys for signed-and-broadcast
messages (long-term vs short-term "epochs")
(3) Using some sort of unlinkable signature scheme so that your
signed messages in each short-term epoch can be retrieved by your
friends but can't be linked by the server.
PIR
----
Section 4.3 discusses a practical and implemented multi-server PIR
system. (Recall that multi-server PIR means the client queries
several servers who each have the same database. None of the servers
learn which entry is being queried as long as one server is "honest"
and doesn't collude with the others. Single-server PIR would be
better, since it doesn't require an honest server, but multi-server
PIR is more practical).
Building from a simple Chor / Pynchon Gate-style multi-server PIR,
there's discussion of adding robustness against misbehaving servers,
then using the system as a hash table for (key:value) pairs.
Long-term / Short-term epochs
----
It's inefficient to publish a separately-encrypted message to each of
your friends every few minutes. Instead you could send each of them
your signature-verification key and a symmetric key, and then publish
a *single* signed-and-encrypted message, in each time period, that all
your friends can read.
But this loses the ability to revoke friends, so DP5 strikes a balance
- users send pairwise-encrypted messages to each friend perhaps once a
day. These messages establish the signature-verification and
symmetric keys for a "long-term epoch". Within the long-term epoch,
users publish signed-and-encrypted messages for each "short-term
epoch".
Using pairwise messages to establish keys for broadcast messages has
been proposed in other systems (mpOTR, [GRP]), and seems like a good
pattern. But...
Signatures
----
Some trickiness creeps in. The threat model here gets complicated,
but one goal is that you don't want people to interfere with your
messages by publishing their own messages with colliding IDs. Since
the long-term epoch messages are based on pairwise symmetric keys,
this is easy: you derive each message ID by hashing the shared key and
epoch number.
But the short-term messages are being broadcast to all your friends,
so you can't rely on a symmetric-key mechanism as any of your friends
could forge it. And you can't use the signature-verification key as
an ID either, as this would link your presence messages across
short-term epochs, so the server could see when you are online (even
if it can't tell exactly who you are).
The paper has two proposals:
(A) Pairing-based signatures. Roughly:
If g is a Diffie-Hellman generator, given g^a and g^b you shouldn't be
able to compute g^ab. But suppose g generates a special DH group,
such that h is a generator of some other group, and you can compute
the "pairing" function e(g^a, g^b) = h^ab.
Then you can do things like BLS signatures [BLS] where you sign
message m with private-key x by calculating m^x. Verification uses
public key g^x and checks that e(g, m^x) = e(g^x, m).
By setting m to the epoch number you can sign this number to get m^x,
send the signature to the server, and it will use e(g, m^x) as the
message ID. Your friends can calculate the message ID via your
public-key and epoch number = e(g^x, m), but they can't construct
interfering messages for new epochs, as that would require a signature
forgery.
The downside of this is that it only signs the epoch number, so the
signature authenticates your presence but not any "additional data"
you're sending in the message, which is only symmetrically
encrypted/authenticated.
(B) The alternative in Appendix A is to use a conventional
discrete-log signature (Ed25519 is suggested), but derive a "variant
of the public key" for which the signer can derive a corresponding
variant private key. The variant public key is then used as the
message ID, which allows the message to be signed.
The use of more normal cryptography and the more complete signature
makes (B) seem appealing to me, but I don't fully understand this
tradeoff.
Anyways, there's a lot more to this, so read the paper!
Trevor
[GRP] https://whispersystems.org/blog/private-groups/
[BLS] http://www.iacr.org/archive/asiacrypt2001/22480516.pdf
More information about the Messaging
mailing list