[messaging] Matrix.org: Decentralized group chat
erik at matrix.org
Wed Mar 11 04:22:40 PDT 2015
On Sunday, March 8, 2015 00:13 GMT, infinity0 at pwned.gg (Ximin Luo) wrote:
> For a private dynamic group chat, the merging requirements are not exactly the same as in a collaborative environment like Google Wave:
> - for a basic application, one does not *need* to merge arbitrary state. The only thing you need to merge is an unordered set (the membership set).
> - merging unordered sets is easy and a solved problem if you have full history. However, the fact that this set represents "who has permission to see what history" makes the problem hard again, in the context of the other constraints (secure, casually-consistent, decentralised). Merge algorithms require history, but if someone can't see past history (e.g. after joining) then they don't have enough information to execute the merge algorithm, nor to verify others' claimed merges. (This is not necessarily an impossible problem, and indeed is the "last missing piece" in the bundle of ideas I haven't yet written down.)
> Briefly skimming over their spec: http://matrix.org/docs/spec/ "The state of the room at a given point is calculated by considering all events preceding and including a given event in the graph. Where events describe the same state, a merge conflict algorithm is applied."
> My impression too is that they underestimate how hard this will be. Also, their security practises are questionable - they do have a list of threats in the spec, but there are no suggestions or even hints on techniques on how they will defend against these threats. I guess they will add this stuff in later, but as I'm sure everyone on this list knows, you can't just "add security later". It sounds like they just did the equivalent of adding partial ordering to a federated protocol similar to XMPP, using TLS "for security", and hoping that everything will be naturally straightforward to conceptually lock down as they write the code.
> GPG: 4096R/1318EFAC5FBBDBCE
I've been working on a lot of the federation side of Matrix since its
inception, and I can assure everyone that we really do know how hard
these sorts of generic merge algorithms are. Hence why we've promptly
not tried to solve them.
What we have done is attempted to solve a much simpler subset of the
general problem. The basic requirement is along the lines of: "All
servers in the group chat (that can talk to each other) will
/eventually/ agree what the state of the room is." Note the careful
omission of the word "correct".
(The state of the room, or more accurately the state of the room at any
given event according to a given server, is simply a mapping between
keys and nodes/events in the directed acyclic graph. The key is included
in the event.)
The main points to note are:
- We differentiate "auth events", i.e. events that affect
authorization of "new" events. There are very few types of auth events,
and Matrix doesn't support custom auth events.
- The merge algorithm of non auth events, as Ximin points out, is
- The merge algorithm for auth events is designed so that maliciously
branching and re-merging does not allow circumvention of auth changes.
This is the hardest part of merging, and is heavily special cased.
- Each event points to the auth events that allow that event to
happen, separately to the full event DAG.
- All events are signed by the originating server, and all event
relationships are protected by hashes (à la git).
- Events can be "redacted" by servers, where all non protocol relevant
keys are stripped, e.g. message contents. This allows servers to
effectively "delete" nasty messages without leaving a whole in the
graph. This is possible since we include a hash of the full event, but
servers only sign the hash and protocol relevant keys rather than the
- Every server is required to store the full auth chain of the events
it sends (for at least as long as it stores the original event). The
auth chain is simply all the auth events for that event, and the auth
events for those, etc.
- The auth chain is enough for a server to accept that an event could
- Servers can tell each other if they think another server shouldn't
have accepted an event and why. (For example the sender of the event had
been kicked between, in topological terms, the join event and the
- When a server joins the room, it gets given (by the server it is
joining via) the other servers view of the current state and associated
- Servers can throw away old history/events. They still have to keep
all the auth events.
Thus, different servers can have different views of what the current
state is if buggy or malicious servers are involved; for example, when
joining a room your server can be given an incorrect view of the current
state. However, the auth chain ensures that these views can only be
incomplete or stale, servers cannot simply include arbitrary synthesized
If two servers discover that they disagree on the state (which they
quickly will), then they will exchange their views and attempt to prove
the other side wrong; by the end of this process two correctly
functioning servers will come to an agreement of the current state. In
effect providing a "healing" mechanism.
(This does mean that servers may initially accept an event, only to be
subsequently shown that the event should have been rejected.)
The outcome of all this is (hopefully) that the subgroup of correctly
functioning servers *will* eventually agree on the current state, and so
which events should accepted and rejected.
Hopefully this clarifies a little what Matrix is trying to do and what
the federation protocol does and doesn't aim to provide and guarantee.
More information about the Messaging