[messaging] Panoramix decryption mixnet messaging spec and design documents
dawuud at riseup.net
Mon Nov 27 18:25:37 PST 2017
In the Loopix paper at the end of section 4.1.2 it says:
If the number of messages in client’s inbox
is smaller than a constant threshold, the provider gen-
erates additional dummy messages. Thus, the adversary
observing the client-provider connection, as presented on
Figure 3, cannot learn how many messages were in the
user’s inbox. Note that, as long as the providers are hon-
est, the protection and receiver unobservability is perfect
and the adversary cannot learn any information about the
inbox and outbox of any client.
If the provider is dishonest, then they are still uncer-
tain whether a received message is genuine or the result
of a client loop – something that cannot be determined
from their bit pattern alone. However, further statistical
attacks may be possible, and we leave quantifying exact
security against this threat model as future work.
On Fri, Nov 24, 2017 at 04:09:50PM +0000, Michael Rogers wrote:
> On 21/11/17 02:12, dawuud wrote:
> > Katzenpost is delivery by pull as well. Clients retreive messages from their Provider.
> > This paper:
> > [DUMMYINTERSECTION] Berthold, O., Langos, H.,
> > "Dummy Traffic Against Long Term Intersection Attacks",
> > In the Proceedings of the PETS 2002,
> > <https://www.freehaven.net/anonbib/cache/langos02.pdf>.
> > mentions that there are some constraints needed for the intersection attack to work:
> > a. the adversary is able to watch all user lines
> > b. the adversary is able to watch all Provider/server lines (see figure 1 in paper)
> > c. the adversary knows how long it takes for a message to traverse the mix network
> > d. messages belonging to a certain session have to have the same property
> > that allows the adversary to link them each together
> > So in Katzenpost I don't think constraint "c" is met since we aren't using
> > synchronized batch mixing... but rather the Poisson mix strategy where
> > clients select the mix delay for each hop in the route which should be different
> > for every sent message, even retransmits... although client route hop delays are sampled
> > from an exponential distribution. The attacker can of course compute the
> > probability of a certain route delay having been choosen by a client.
> Yes, I think the attack would still be possible given a probability
> distribution over delays instead of a fixed delay.
> My starting point here is the statistical disclosure attack
> which says that if I'm watching the inputs and outputs of a round-based
> anonymity system, and I see Alice sending messages into the system, and
> for each round where Alice sends a message I see all the messages
> exiting the system, then starting from a uniform probability
> distribution over all possible recipients I can refine that distribution
> with every observation, to eventually discover Alice's communication
> partners. The longer I watch the system, the closer my distribution gets
> to Alice's real distribution.
> I think we can generalise that by replacing the rounds with a
> probability distribution over delays. So for each message Alice sends
> into the system, instead of having a set of output messages that
> definitely belong to the same round, I assign every future output
> message a probability of being Alice's message based on the delay. But
> the rest of the attack remains the same: I use the probabilities
> assigned to output messages to update the probability distribution over
> possible recipients, and over time that distribution approaches Alice's
> real distribution.
> Of course the Loopix authors know this, hence all the dummy traffic to
> prevent an attacker from knowing which users are actually sending and
> receiving messages. I think the big question is whether that provides a
> useful amount of protection at a reasonable level of overhead.
> > I also cannot think of a way that "d" is met... unless of course there
> > is collusion with the destination Provider.
> If I understand right, condition d just means there must be some set of
> messages you're interested in, and you must know which messages they are.
> In the statistical disclosure paper, the session is all messages sent by
> Alice and the goal is to identify the set of recipients. In the Berthold
> and Langos paper it's the other way round - the session is all messages
> sent by some unknown sender and the goal is to identify the sender. But
> either way, there's some set of interesting messages and the goal is to
> separate them from the uninteresting messages.
> Just as we can generalise from rounds to a probability distribution over
> delays, I wonder if we can generalise from a fixed set of interesting
> messages to a probability of being interesting.
> So in a system that uses dummy traffic, when I see Alice sending a
> message into the system, I assign it a probability of being a real
> message. Then I assign every future output message a probability of
> being the same message based on the delay. And I update the recipient's
> probability of being one of Alice's communication partners based on the
> product of those two probabilities.
> In theory the attack still works, but my distribution converges more
> slowly. On the other hand, the system is also more expensive to use. So
> it comes down to a practical question: how long can the system conceal
> sender-recipient relationships, at what cost in dummy traffic?
> I'd love to see some back-of-an-envelope calculations of this for
> Katzenpost. If Alice sends and receives n dummy bytes for each real
> byte, what does that do to the attacker's probability distribution over
> possible communication partners? How much longer does it take the
> attacker's distribution to converge?
> pub rsa2048 2012-04-09 [SCA] [expires: 2018-02-06]
> uid Michael Rogers <michael at briarproject.org>
> sub rsa2048 2012-04-09 [E] [expires: 2018-02-06]
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 833 bytes
Desc: not available
More information about the Messaging