[messaging] Transparency for a key directory without publishing usernames

Wed Oct 8 15:16:42 PDT 2014

On Wed, Oct 8, 2014 at 4:26 AM, Joseph Bonneau <jbonneau at gmail.com> wrote:
>
> On Wed, Oct 8, 2014 at 2:58 AM, Trevor Perrin <trevp at trevp.net> wrote:
>>
>> Users could gossip the latest STH by including it in their encrypted
>> messages to each other.  Users would request proofs from the directory
>> that STHs are consistent, and that identity keys are consistent with
>> STHs.
>
>
> What would you check "consistency" with? I think the log can just publish
> signature(Merkle root of entire current data structure, epoch #) every
> epoch.

By consistency for STHs I just meant check that a newer STH is chained
from an older one (which might require requesting the intervening STHs
from the server).

That's not strictly necessary, if you assume that each epoch everyone
relying on a key will be connected to the parties monitoring that key
through gossip.

But if you check consistency for STHs, then if an attacker signs
different STHs for the same epoch they have to do this for all future
epochs, so gossip has a better chance of eventually detecting this.

> Users can request proofs on demand that their desired key material is
> correctly included every day at the hash of their username. Your software
> can automate this. Each proof would be 1 signature plus 256*256 bits of
> Merkle proof

I think it's log(# of users) * 256 bits, since the lower part of the
Merkle path / proof is constant.  So hopefully < 1 KB.

Leaking information about non-queried usernames is a concern.  A
Merkle path would reveal if there are usernames that exist with a hash
that shares a prefix of bits with you.  Maybe you could live with
that.

The sparse tree requires calculating 256 hashes to verify a Merkle
path, but most of them just chain the previous value with a constant.
Not a big deal, but seems like that could be optimized by just storing
each entry at a leaf determined by its minimum unique prefix?

> Your software can ping you if your data ever changes in a way it didn't
> expect. If you go offline, you can catch up (though you won't detect the
> attack until later obviously). Basically this is like monitoring except you
> can do it yourself locally. You could have an offline monitor check for you
> as well. The key point is that checking for N users takes work linear in N
> with no overhead from tracking the whole structure like you have in some
> other systems.

Yeah, I see that it's nice to monitor your entry just by doing lookups.

But with a more CT-like, publish-everything system, a 3rd-party
monitor could monitor a large N with potentially much less than N
work, since it only has to look at the delta of changed entries each
epoch.

So, that gets lost.

> 1) Self-enforcing security
>
> Digging into the Crypto Bag Of Tricks there's an interesting and potentially
> stronger approach: tagged one-time signatures [1]. With this approach, if
> the server ever signs two Merkle roots for the same epoch #, you can use the
> two signatures to re-construct their private key. You still need gossip to
> discover this, but this is a big disincentive to the server ever cheating.
>
> Perhaps this is a bad idea-it's a big loaded gun and the system will
> collapse spectacularly if the log screws up. Maybe we want the flexibility
> for a misbehaving log to explain itself publicly and be forgiven, or maybe
> we explicitly want a Sword of Damocles over their head to avoid them being
> able to do so.

Yeah, not sold on that.  You might want to give the log a chance to
explain itself.  The Sword of Damocles is always there anyways, we can
just stoping trusting the service.

You also want whoever detects the inconsistency to be incentivized to
publicize it, instead of keeping the (valuable!) private key for
themselves.

> 2) Key Directory vs. Short-lived certificates
>
> Two approaches are possible for privacy: in one version, the log functions
> as a key directory and will respond to anonymous queries for a given
> username. In this case you have to deal with DoS and people trying to
> brute-force to find usernames. Maybe privacy goes down.
>
> Alternately, it can function as a type of "certificate authority".

Makes sense.  But for asynchronous messaging you probably need a key
directory anyway.

> 3) Dealing with immediate issuance
>
> In the simple approach, you can't add new data to the log until the next
> epoch. Maybe we can drive this down to a matter of minutes (and users will
> accept data from a slightly out-of-date epoch). At some point the cost of
> downloading your proofs at every epoch becomes prohibitive, which probably
> means that you can't immediately download the chat app, make a new username,
> and start talking. That might be the "fatal flaw" in practice.

I think that's an issue with any transparency log-type system, not
just this approach.

So it probably always has to be a "post-facto" check.  I.e., you use
the key on day N, and lookup the proof on day N+1.  And since you're
recomputing a fairly large data structure, and you want people to
check consistency between STHs, I think it will be hard to make the
epochs too small, a day feels about right to me...

Trevor