[messaging] Encrypted Group Chats
infinity0 at pwned.gg
Thu Nov 27 20:08:50 PST 2014
On 28/11/14 02:28, steve at actor.im wrote:
> It doesn't feel to be secure to share one common key across all members of group. One of main plus of group is that we can easily check encryption key for group.
> Any ideas?
I'm going to attempt a summary of everything that's been discussed and developed over the past year. I've written up some of these things in more detail on github , but it's quite incomplete. If anything is unclear, please ask. I'm not the best at writing summaries, especially since the ideas themselves are *not exactly nailed down* at the moment. We will hopefully be writing these down in more detail, in various documents, over the next half-year or so, and also fleshing out more of these ideas.
There are two broad areas of work:
1. session establishment & membership control (key exchange)
2. session coherency (reliability, ordering, consistency)
(2) includes the idea that everyone should see the same set of messages. Naively, even if the group shares a single key: if Charlie controls the transport, then he can send different things to Alice and Bob. To prevent this, Alice and Bob need to check they each got the same thing. This can be thought of as "verifying recipients" - as opposed to "verifying the sender" in the normal authentication problem.
We'll work under the premise that everyone directly authenticates each other (assuming everyone knows valid long-term keys for everyone ), and does not need to "trust" another member to authenticate a member. This is basically "end-to-end" applied to *within* the group.
## Session establishment
We have been discussing two approaches to this - group key agreement, and pairwise key agreement (everyone runs a standard key agreement between themselves, i.e. n*(n-1)/2 agreements).
GKA is new and exciting, but all schemes proposed in the literature suffer from one disability - they do not explicitly nor efficiently handle two concurrent operations. That is, if the session has 3 members, and Alice and Bob each decide to invite new members at the same time, we need to detect this, co-ordinate this, and run two GKAs in sequence. This is unsuitable for asynchronous and high-latency scenarios like email/sms. To be clear, GKAs to tend to have a bunch of extra properties, but whether these are actually practically important to security, is dubious. (e.g. "key contributiveness")
Pairwise has the potential to support concurrent operation, and it's for this reason that Trevor and I have both been looking into pairwise schemes, and less into GKAs. No-one has made a firm proposal on the specifics yet, though we're getting there surely. (There have been rough drafts in other contexts, but nothing yet suitable to show publicly.)
Traditionally, pairwise has been disfavoured because it's viewed to be inefficient, but actually all the GKAs proposed so far have been as inefficient or worse. With pairwise, the inefficiency is O(n^2) in the size of the group membership, but unaffected by the size of the message: encrypt the message with a message-key, then encrypt the message key with each key you share pairwise with another member. With other tweaks, one can make this cost even less and we've discussed such schemes elsewhere, though nothing concrete has been set down in stone yet.
Since we are working under the "end-to-end" or "trust least" principle as mentioned before, it seems "intuitive" (hand-waving here, yes) that O(n^2) is fundamentally the best you can do, so GKAs seem less important under this argument. Perhaps someone will come up with a formal proof of this some day.
I am certain that one can do better than O(n^2), if you are happy to trust certain members of the group to do the authentication "via", but this is a separate problem for another time (and I don't know of anyone directly researching this).
Also, it may be that someone will propose a GKA in the future that *does* explicitly support concurrent operation. But this seems way more complex than just going with "pairwise" for the time being, so no-one has seriously tried it yet.
For reference, the most efficient GKA (that achieves forward secrecy and deniable authentication) that we have seen so far uses 3 rounds of broadcasts. By contrast, pairwise key agreement with Triple-DH is effectively 2 rounds of broadcasts. (Ignoring optimisation strategies like pre-keys and piggybacking the first message, which you can do with any key agreement scheme.)
## Session coherency
We have been discussing variations on one main approach - build up a partially-ordered tree of message nodes, based on hashing each message, and having each message explicitly point to immediate causal parents. This is similar to how git and other DVCS these days, represent their data.
As the session progresses, you check that everyone else is building up the same transcript graph that you're building (that their messages point to yours; essentially "Alice has seen this message" delivery receipts, only e2e authenticated). There are many more fiddly details to get right, which I can go into elsewhere including , but the fundamental idea is the transcript graph. With the upcoming eQualit.ie group IM protocol (that Trevor and I have also helped with), we assume that the transport is in the same global ordering for everyone. This makes the transcript data structure a little bit simpler, but effectively it is still a partial order graph of messages, with some additional structure that makes the graph "look simpler".
There is also the issue of how strongly we should adhere to this data structure - there was a thread a few weeks ago called "Group messaging consistency under resource constraints". The key issue is that, to build up the graph in the "most correct yet efficient" way involves sometimes delaying some messages (to preserve ordering) and relying on a recovery mechanism in the case of message loss. Some deem this to be against the principles of asynchronous messaging; I disagree for various reasons. We haven't yet reached consensus on this, but the last thread had a good collection of the arguments that we need to condense down and revisit at some point.
The other good thing about the transcript graph structure is that it gives you a very concrete tool to reason about ordering semantics, which I've found extremely helpful with e.g. the concurrent join problem, but that's not something I can communicate very efficiently about via email - I'd have to draw a load of diagrams. Maybe someday I will type up nice dot diagrams for all the corner cases...
I can give more details about any of this if requested, and/or I'll also be at 31C3 / RWC2015 to talk about this stuff in person.
 IMO this is a separate problem, to be solved outside of the messaging layer.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 819 bytes
Desc: OpenPGP digital signature
More information about the Messaging