[messaging] Encrypted Group Chats

Thu Nov 27 20:24:48 PST 2014

This all side steps the core contention. The more interlocutors, the larger
profile for endpoint based attacks, and therefore the less security. If I
can break one device then all communications are compromised for the entire
group.

How is this supposed to be dealt with? Is this an intrinsic constraint? If
so, this needs to be communicated appropriately.
On 28/11/14 02:28, steve at actor.im wrote:
> It doesn't feel to be secure to share one common key across all members
of group. One of main plus of group is that we can easily check encryption
key for group.
>
> [..]
>
> Any ideas?

I'm going to attempt a summary of everything that's been discussed and
developed over the past year. I've written up some of these things in more
detail on github [1], but it's quite incomplete. If anything is unclear,
please ask. I'm not the best at writing summaries, especially since the
ideas themselves are *not exactly nailed down* at the moment. We will
hopefully be writing these down in more detail, in various documents, over
the next half-year or so, and also fleshing out more of these ideas.

There are two broad areas of work:

1. session establishment & membership control (key exchange)
2. session coherency (reliability, ordering, consistency)

(2) includes the idea that everyone should see the same set of messages.
Naively, even if the group shares a single key: if Charlie controls the
transport, then he can send different things to Alice and Bob. To prevent
this, Alice and Bob need to check they each got the same thing. This can be
thought of as "verifying recipients" - as opposed to "verifying the sender"
in the normal authentication problem.

We'll work under the premise that everyone directly authenticates each
other (assuming everyone knows valid long-term keys for everyone [2]), and
does not need to "trust" another member to authenticate a member. This is
basically "end-to-end" applied to *within* the group.

## Session establishment

We have been discussing two approaches to this - group key agreement, and
pairwise key agreement (everyone runs a standard key agreement between
themselves, i.e. n*(n-1)/2 agreements).

GKA is new and exciting, but all schemes proposed in the literature suffer
from one disability - they do not explicitly nor efficiently handle two
concurrent operations. That is, if the session has 3 members, and Alice and
Bob each decide to invite new members at the same time, we need to detect
this, co-ordinate this, and run two GKAs in sequence. This is unsuitable
for asynchronous and high-latency scenarios like email/sms. To be clear,
GKAs to tend to have a bunch of extra properties, but whether these are
actually practically important to security, is dubious. (e.g. "key
contributiveness")

Pairwise has the potential to support concurrent operation, and it's for
this reason that Trevor and I have both been looking into pairwise schemes,
and less into GKAs. No-one has made a firm proposal on the specifics yet,
though we're getting there surely. (There have been rough drafts in other
contexts, but nothing yet suitable to show publicly.)

Traditionally, pairwise has been disfavoured because it's viewed to be
inefficient, but actually all the GKAs proposed so far have been as
inefficient or worse. With pairwise, the inefficiency is O(n^2) in the size
of the group membership, but unaffected by the size of the message: encrypt
the message with a message-key, then encrypt the message key with each key
you share pairwise with another member. With other tweaks, one can make
this cost even less and we've discussed such schemes elsewhere, though
nothing concrete has been set down in stone yet.

Since we are working under the "end-to-end" or "trust least" principle as
mentioned before, it seems "intuitive" (hand-waving here, yes) that O(n^2)
is fundamentally the best you can do, so GKAs seem less important under
this argument. Perhaps someone will come up with a formal proof of this
some day.

I am certain that one can do better than O(n^2), if you are happy to trust
certain members of the group to do the authentication "via", but this is a
separate problem for another time (and I don't know of anyone directly
researching this).

Also, it may be that someone will propose a GKA in the future that *does*
explicitly support concurrent operation. But this seems way more complex
than just going with "pairwise" for the time being, so no-one has seriously
tried it yet.

For reference, the most efficient GKA (that achieves forward secrecy and
deniable authentication) that we have seen so far uses 3 rounds of
broadcasts. By contrast, pairwise key agreement with Triple-DH is
effectively 2 rounds of broadcasts. (Ignoring optimisation strategies like
pre-keys and piggybacking the first message, which you can do with any key
agreement scheme.)

## Session coherency

We have been discussing variations on one main approach - build up a
partially-ordered tree of message nodes, based on hashing each message, and
having each message explicitly point to immediate causal parents. This is
similar to how git and other DVCS these days, represent their data.

As the session progresses, you check that everyone else is building up the
same transcript graph that you're building (that their messages point to
yours; essentially "Alice has seen this message" delivery receipts, only
e2e authenticated). There are many more fiddly details to get right, which
I can go into elsewhere including [1], but the fundamental idea is the
transcript graph. With the upcoming eQualit.ie group IM protocol (that
Trevor and I have also helped with), we assume that the transport is in the
same global ordering for everyone. This makes the transcript data structure
a little bit simpler, but effectively it is still a partial order graph of
messages, with some additional structure that makes the graph "look
simpler".

There is also the issue of how strongly we should adhere to this data
structure - there was a thread a few weeks ago called "Group messaging
consistency under resource constraints". The key issue is that, to build up
the graph in the "most correct yet efficient" way involves sometimes
delaying some messages (to preserve ordering) and relying on a recovery
mechanism in the case of message loss. Some deem this to be against the
principles of asynchronous messaging; I disagree for various reasons. We
haven't yet reached consensus on this, but the last thread had a good
collection of the arguments that we need to condense down and revisit at
some point.

The other good thing about the transcript graph structure is that it gives
you a very concrete tool to reason about ordering semantics, which I've
found extremely helpful with e.g. the concurrent join problem, but that's
not something I can communicate very efficiently about via email - I'd have
to draw a load of diagrams. Maybe someday I will type up nice dot diagrams
for all the corner cases...

I can give more details about any of this if requested, and/or I'll also be
at 31C3 / RWC2015 to talk about this stuff in person.

X

[1] https://github.com/infinity0/msg-notes
[2] IMO this is a separate problem, to be solved outside of the messaging
layer.

--
GPG: 4096R/1318EFAC5FBBDBCE
git://github.com/infinity0/pubkeys.git

_______________________________________________
Messaging mailing list
Messaging at moderncrypto.org
https://moderncrypto.org/mailman/listinfo/messaging
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://moderncrypto.org/mail-archive/messaging/attachments/20141127/ac9b799b/attachment.html>