[messaging] Mixmaster Protocol Design

Wed Jul 16 18:38:34 PDT 2014

On Wed, Jul 16, 2014 at 4:08 PM, Tom Ritter <tom at ritter.vg> wrote:
> On 16 July 2014 16:18, Trevor Perrin <trevp at trevp.net> wrote:
>> On Wed, Jul 16, 2014 at 12:20 PM, Tom Ritter <tom at ritter.vg> wrote:
>>> On 15 July 2014 21:14, Trevor Perrin <trevp at trevp.net> wrote:
>>>> The rest of the changes seem like a failed attempt to prevent tagging
>>>> attacks via integrity protection.
>>>
>>> Why do you say it fails?  If each Mix Header authenticates the next
>>> (as opposed to each header authenticating every single header), when a
>>> message transits an attacker-uncontrolled node, it will be discarded
>>> as the next header was corrupted. (Each header also needs to
>>> authenticate the body.)
>>
>> The message won't be discarded if a later header is tagged.
>
> If I tag Header 5, I can't recognize the tag unless I'm the recipient
> for Header 5.  If I'm not the recipient for Header 4, the recipient at
> Header 4 will discard it because it was modified.

I.e. the recipient for Header 4 can recognize that the message was
tagged.  Which is a tagging attack.
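
To make the exchange above concrete, here is a minimal sketch of the
"each header authenticates only the next header" scheme. It is not the
Mixmaster or Mixminion format: the header layout, the HMAC-SHA256
choice, and the names build_chain, hop_accepts and first_rejecting_hop
are all assumptions made for illustration, and body authentication is
left out.

```python
import hashlib
import hmac

MAC_LEN = 32  # HMAC-SHA256 output size


def build_chain(hop_keys, payloads):
    """Hypothetical layout: header i = payload_i || MAC_{k_i}(header_{i+1}).

    Built back to front so each MAC covers the finished next header.
    The last header MACs an empty string.
    """
    headers = [b""] * len(payloads)
    nxt = b""
    for i in reversed(range(len(payloads))):
        tag = hmac.new(hop_keys[i], nxt, hashlib.sha256).digest()
        headers[i] = payloads[i] + tag
        nxt = headers[i]
    return headers


def hop_accepts(hop_key, headers, i):
    """Hop i checks only the header that follows its own."""
    nxt = headers[i + 1] if i + 1 < len(headers) else b""
    expected = hmac.new(hop_key, nxt, hashlib.sha256).digest()
    return hmac.compare_digest(headers[i][-MAC_LEN:], expected)


def first_rejecting_hop(hop_keys, headers):
    """Index of the first hop whose check fails, or None if all pass."""
    for i, key in enumerate(hop_keys):
        if not hop_accepts(key, headers, i):
            return i
    return None


# Six hops; flip one bit in Header 5 (index 4) before the message is sent on.
keys = [bytes([i + 1]) * 16 for i in range(6)]
headers = build_chain(keys, [f"hop{i}".encode() for i in range(6)])
tagged = list(headers)
tagged[4] = bytes([tagged[4][0] ^ 1]) + tagged[4][1:]
rejecting = first_rejecting_hop(keys, tagged)
```

In this sketch the hops for Headers 1-3 accept the tagged message and
only the hop for Header 4 sees the MAC failure. If that hop is run by
the same party that flipped the bit, the failure itself is the
recognizable tag.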

>>> What's more, I think if you authenticate every header in every header,
>>> you disclose the path length. You can't authenticate a random header
>>> added at the end in the next hop, so when you receive a message that
>>> only authenticates 17 of the headers, you know where you are in the
>>> chain.
>>
>> The "random header added at the end" should be
>> deterministic-but-random-looking so it can be MAC'd yet can't be
>> distinguished from a real header.  That's what I was trying to
>> describe in my last mail.
>
> We may be talking past each other.  This is what I'm seeing the problem is:
>
> Client constructs twenty headers (let's say 5 of them are real, 15
> fake), computes a MAC for each of them, and encloses these MACs in
> the first 5 headers.

I understand what you're saying, but it's solvable.

> Remailer 1 verifies the MACs on Headers 1-20, then tacks on a random
> header at the end, and sends it off.

Instead of doing that, Remailer 1 should tack on a
deterministic-but-random-looking header at the end, with contents that
are predictable by the initial client so that it matches the MACs for
later remailers, but later remailers can't tell whether it's a real or
fake header.

E.g. in Mixminion, I think Remailer 1 pretends the new header has a
ciphertext of zeros and "decrypts" it to come up with the new last
header which is sent to Remailer 2.  Remailer 2 does the same thing,
etc.

See Mixminion paper, section 4.1:
http://mixminion.net/minion-design.pdf
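
A rough sketch of the deterministic-but-random-looking idea, under
loud assumptions: SHAKE-256 stands in for the stream cipher, each hop
key is assumed to be already shared with the client, the remaining
headers are not re-encrypted per hop, and the names keystream,
filler_header, remailer_step, client_predict and mac_over_rest are
hypothetical. It is not the construction from the Mixminion paper,
only an illustration of why the client can MAC headers that do not
exist yet.

```python
import hashlib
import hmac

HEADER_LEN = 32  # assumed fixed header slot size for this sketch


def keystream(key: bytes, label: bytes, n: int) -> bytes:
    # SHAKE-256 used as a stand-in PRF / stream cipher (assumption).
    return hashlib.shake_256(key + label).digest(n)


def filler_header(hop_key: bytes) -> bytes:
    # "Decrypt" an all-zero ciphertext: zeros XOR keystream.
    zeros = bytes(HEADER_LEN)
    ks = keystream(hop_key, b"filler", HEADER_LEN)
    return bytes(z ^ k for z, k in zip(zeros, ks))


def remailer_step(hop_key: bytes, headers: list) -> list:
    # Strip this hop's header and append the deterministic filler,
    # so the total number of headers never changes.
    return headers[1:] + [filler_header(hop_key)]


def client_predict(hop_keys: list, headers: list) -> list:
    # The client knows every hop key, so it can replay each step and
    # learn exactly which header list every hop will receive.
    views = []
    current = list(headers)
    for key in hop_keys:
        views.append(list(current))
        current = remailer_step(key, current)
    return views


def mac_over_rest(hop_key: bytes, headers: list) -> bytes:
    # MAC over everything after this hop's own header, fillers included.
    return hmac.new(hop_key, b"".join(headers[1:]), hashlib.sha256).digest()
```

Because the filler depends only on the hop key, the client can compute
mac_over_rest for every hop in advance from client_predict's output,
and a later hop without the earlier keys has no way to tell a filler
slot from a real header.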

>> Was bookkeeping complexity and brittleness-in-case-of-network-changes
>> the main problem with reply-block nymservers, or was it security
>> issues like:
>>
>> http://archives.seul.org/mixminion/dev/Oct-2007/msg00010.html
>
> Well, AIUI, issues with people actually using it were the brittleness
> of remailers going up and down. And I know that the complexity of the
> system led to lots of user errors on AAM.  The security issues like
> intersection attacks are great reasons it's not a good idea - not
> reasons it failed in deployment and use.

Good to know, thanks.  I'd love to hear more overview / history of the
remailer networks and what the obstacles are to wider deployment, if
you or Zax have more thoughts.

>>> I'm not too familiar with Bitcoin mining, but as I understand it, you
>>> can mine blocks on multiple blockchains at once.  Imagine two Tor
>>> networks, one run by Tor Project, and the second run by CCC.de.  A
>>> node could run on both networks, and it'd not be apparent which
>>> network you were using if you talked to it.  Similarly, the
>>> distributors in Pynchon Gate could be distributors for multiple
>>> nymservers.
>>
>> I guess, but this still means everyone has to agree to use the same
>> distributor nodes, or else user choice of their distributors will
>> partition users into potentially-small anonymity sets.  So I think
>> this implies Pynchon Gate would need to be a centralized
>> infrastructure, which brings a bunch of downsides.
>
> I tend to see the Tor network model as a hammer, and I apply it too
> often I think.  Nonetheless, why would a Tor-network-like set of
> distributors, functioning for any number of nymservers, partition
> users?  If a nymserver was not a member of the 'Distributor
> Collective', then sure, but otherwise a user connecting to any
> distributor in the Collective could be accessing a nym for any
> nymserver.  The main problem I see with that is scaling for disk size
> and bandwidth.

If everyone's mail is stored on the same set of distributors (i.e. PIR
servers), and all users download their messages by contacting all the
distributors, then sure - all users are in the same anonymity set.

But that's the centralization problem I mentioned.  If you want to
choose your own PIR servers, different from the global set, then you
are only anonymous within the (probably smaller) set of users of those
servers.
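
A toy illustration of that partitioning, assuming an observer can see
which distributor set each user contacts. The user and distributor
names and the anonymity_set_sizes helper are made up for the example.

```python
from collections import Counter


def anonymity_set_sizes(choices):
    """Map each user to the number of users contacting the same server set."""
    groups = Counter(frozenset(servers) for servers in choices.values())
    return {user: groups[frozenset(servers)] for user, servers in choices.items()}


# Hypothetical users: three use the global set, one picks her own servers.
choices = {
    "alice": {"d1", "d2", "d3"},
    "bob": {"d1", "d2", "d3"},
    "carol": {"d1", "d2", "d3"},
    "dave": {"d7", "d8"},
}
sizes = anonymity_set_sizes(choices)
```

Here the three users of the global set hide among each other, while
the user with a private choice of servers is alone in her set.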

Trevor