[messaging] MPHFs for public key lookup?

Daniel Kahn Gillmor dkg at fifthhorseman.net
Mon Jun 29 21:58:44 PDT 2015

On Mon 2015-06-29 16:32:34 -0400, Jason A. Donenfeld wrote:
> That's a little more efficient. But I don't like that memcmp in there.
> So let's throw everything in a hash table instead. But actually, we
> can optimize that even further: let's generate a function, mphf(key),
> that maps public keys 0..n to integers 0..n, in any order, but
> utilizing all the integers. This is called a minimal perfect hash
> function. The idea is we compute the mphf function once, and then we
> can simply do this for each incoming packet:
> index = mphf(msg.public_key);
> if (index == -1)
>     return NULL;
> client = clients[index];
> key = DH_POINT_MULTIPLY(server private key, clients[i].public_key,
> clients[i].some_other_shared_param);
> if (AEAD_DECRYPT(ciphertext, key) == success)
>     return client
> return NULL;
> So my question is: is anyone aware of this being done anyplace?
> (Anyone have a nice standalone C implementation of an MPHF generator?)
> Or is this sort of thing usually implemented with different tricks?

I'm not sure i understand the advantage here -- an properly-indexed
datastructure of all known keys should take something like O(log(n)) to
search efficiently.  how many keys are we talking about here?  If it's a
small number of keys, even the linear search you describe above seems
fine to me (these are not large objects), and if it's a large number of
keys, even log(n) comparisons seem unlikely to be terribly expensive.

Worse, the hypothetical MPHF you describe seems like it would make your
keyset rather brittle.  Adding a new key to the mix would require
reassigning roughly half the associations, which means re-shuffling your
list of clients.  If you've got N users where N is large (i can't
imagine caring about this sort of optimization if N is small), that
means each time a user adds or removes a key, or you add or remove a
user, there is a O(N) cost as you re-shuffle the table.

What is the case where this optimization is valuable?


More information about the Messaging mailing list