[messaging] Presence by DP5-PIR, GNS or multicast?

Thu Jan 15 04:30:44 PST 2015

Just saw Prof Danezis & Goldberg's 31c3 presentation
on using DP5 PIR for privacy-preserving look-ups of
presence. If you're not familiar with this new technology,
fetch the video from
	http://cdn.media.ccc.de/congress/2014/webm-sd/31c3-6140-en-de-DP5_PIR_for_Privacy-preserving_Presence_webm-sd.webm.torrent

This is obviously very interesting for developers like
us, so I have a big question: How does this differ from
using GNS, the GNU Name System, for the same purpose?

I'll elaborate. I don't see the approach of PIR being
interesting for systems using the federation paradigm
since remote users would have to maintain their
presence on each server providing PIR presence to
their users. That neither scales very well, nor is
it likely to be effective at maintaining privacy.
This matches what Prof Goldberg says about "sharding"
in the Q&A part, if I understood him correctly.

From what Prof Goldberg says about IT-PIR, it looks
like several (possibly cloud) servers are needed, not for
distribution, but for the purpose of improving privacy.
What disturbs me in this case is that the primary purpose 
of our tools - the messaging - isn't ideally solved using 
the same architecture as DP5, so the presenters are 
proposing a complex multi-protocol architecture.

George imagines a hybrid approach: maintaining the XMPP
federation for messaging, but integrating PIR as a central
service via a local proxy that intercepts presence traffic
and redirects it to the PIR database. Yet a few minutes
earlier he mentions that these services need to be run by
trustworthy people like us. I haven't quite understood
why, but it rules out the cloud model again.

Possibly the most privacy-respecting and scalable way of 
deploying such a service could have been to use a 
distributed data structure like one of the modern 
sybil-resistant DHTs. But if I understand Prof Goldberg's
Q&A statements correctly, PIR cannot actually be operated
in such a distributed manner as it needs to be able to do 
computational operations on the entire database. If that 
is correct, that would be quite a difference in design 
compared to GNS.
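
To make that last point concrete for myself, here is a toy
two-server PIR in Python. It is the textbook XOR construction,
not the scheme DP5 actually uses, and the records and function
names are made up for illustration. What it shows is why each
server has to walk the whole database for every query: if it
only touched part of it, it would learn where the wanted
record is.

```python
import secrets

def make_query(db_size, index):
    """Client: build two bit vectors that differ only at `index`.
    Each vector on its own is uniformly random."""
    q1 = [secrets.randbelow(2) for _ in range(db_size)]
    q2 = list(q1)
    q2[index] ^= 1
    return q1, q2

def answer(database, query):
    """Server: XOR together all records selected by the query.
    The loop runs over the entire database, every time."""
    acc = bytes(len(database[0]))
    for record, bit in zip(database, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, record))
    return acc

def reconstruct(answer1, answer2):
    """Client: everything cancels out except the wanted record."""
    return bytes(x ^ y for x, y in zip(answer1, answer2))

# Hypothetical fixed-size presence records, same on both servers.
database = [b"alice:on", b"bob:away", b"carol:on", b"dave:off"]

q1, q2 = make_query(len(database), 2)
result = reconstruct(answer(database, q1), answer(database, q2))
print(result)
```

The privacy only holds as long as the two servers do not
compare notes, which may be part of why the operators need to
be trustworthy.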

GNS however, being implemented on the most sybil-resistant 
DHT I know of, with the way it offers look-up privacy by 
combining the identity of the person being looked-up with 
a shared secret (for individuals or groups, that doesn't
matter), would arrive at similar results as PIR: A secure 
and likely scalable way to offer presence information to 
humanity. So I wonder if I missed something.
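
Roughly what I have in mind, as a Python sketch. This is only
an analogy: the HMAC derivation, the dict standing in for the
DHT and all the names are my assumptions, and GNS's real key
derivation works on zone keys and labels and is not reproduced
here. The point is merely that the storage key reveals nothing
to someone who lacks the secret.

```python
import hashlib
import hmac

dht = {}  # stand-in for a distributed hash table

def lookup_key(shared_secret: bytes, identity: bytes) -> str:
    """Combine identity and shared secret into an opaque key."""
    return hmac.new(shared_secret, identity, hashlib.sha256).hexdigest()

def publish(shared_secret: bytes, identity: bytes, status: str) -> None:
    # A real system would also encrypt the stored value;
    # that part is left out of this sketch.
    dht[lookup_key(shared_secret, identity)] = status

def fetch(shared_secret: bytes, identity: bytes):
    return dht.get(lookup_key(shared_secret, identity))

publish(b"secret shared with friends", b"alice", "online")

print(fetch(b"secret shared with friends", b"alice"))
print(fetch(b"a wrong guess", b"alice"))
```

Someone who knows "alice" but not the secret cannot compute
the key, so they can neither look her up nor recognise her
entry when they see it pass by.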

Also I gather that PIR becomes increasingly expensive
computationally the more participants use it. That is not the
case with GNS. GNS should also work with a billion users,
whereas George's graphics suggest that running DP5 for
a billion users would become very expensive.
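
A back-of-the-envelope comparison in Python, resting on two
assumptions of mine that may well be too crude: that a PIR
server scans every record once per query, and that a DHT
lookup takes about log2(n) hops (Kademlia-style, which need
not match the DHT underneath GNS). No measured numbers here,
just the shape of the growth.

```python
import math

def pir_records_touched(db_size: int, queries: int) -> int:
    # Assumption: every query makes the server scan all records.
    return db_size * queries

def dht_hops(network_size: int, lookups: int) -> int:
    # Assumption: about log2(n) hops per lookup.
    return math.ceil(math.log2(network_size)) * lookups

# One query per user in each case.
for users in (10**3, 10**6, 10**9):
    print(users,
          pir_records_touched(users, users),
          dht_hops(users, users))
```

Under these assumptions the PIR work grows with the square of
the user count, while the DHT work grows only slightly faster
than linearly.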

So now I'm looking for further insights to be able to
judge which architecture would be best for a scalable
privacy-preserving presence protocol for humanity.
Is it DP5, GNS or anonymized multicasting?

I include multicast because so far we at secushare have been
considering using pubsubs of anonymous multicasting over distributed
relays also for presence, rather than using database 
look-ups, because pubsubs push the information to 
the interested recipients in near real-time rather than 
having to poll the database. I don't know if polling the
database is a viable model from your point of view - I
just assume that to achieve acceptable realtimeness
you end up with a lot of overhead traffic. Reminds me
of ICQ in its times of scalability crisis, when people
would show up on the buddy list with fifteen minutes
of delay. That was in the late 1990s, I think.
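
The trade-off I am assuming, as a few lines of Python. The
intervals, the 50 contacts and the 10 status changes a day are
invented numbers, and I count one request per poll, which may
not be how a real PIR client fetches its buddy list.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def polls_per_day(poll_interval_s: int) -> int:
    """Requests one client sends, whether or not anything changed."""
    return SECONDS_PER_DAY // poll_interval_s

def pushes_per_day(contacts: int, changes_per_contact: int) -> int:
    """Messages one client receives when changes are pushed."""
    return contacts * changes_per_contact

for interval in (900, 60, 10):
    print(f"poll every {interval:>3}s: "
          f"{polls_per_day(interval):>5} requests/day, "
          f"up to {interval}s stale")

print(f"push: {pushes_per_day(50, 10)} messages/day, "
      f"delivered as they happen")
```

With polling, freshness is bought with requests that mostly
return nothing new. With push, traffic follows the actual
number of changes.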

So the ultimate question would be: is anonymous multicast
anonymous enough, and does it scale well enough, that its
higher efficiency in terms of bandwidth and latency
comes to fruition and beats both DP5 and GNS? I am not sure
the many research papers on this topic are detailed
enough yet to tell.

Btw, I love the "Why not just use Tor?" part of the 
talk. It nicely explains how the current mainstream model
of private messaging, XMPP+OTR+Tor, is pretty insufficient.
I assume the scientific background to that assertion is the
2009 paper "De-anonymizing Social Networks" by Arvind
Narayanan and Vitaly Shmatikov, which demonstrates how
Twitter and Flickr users can be matched up by the similarity
of their social graphs. An attacker who has access to both
the jabber.ccc.de database and, say, the Facebook social
graph, would be able to de-anonymize a significant number of
jabber.ccc.de users. So, dear CCC, please ensure that
server is kept in a safe place until we have a better 
messaging standard than XMPP!  :)

-- 
	    http://youbroketheinternet.org
 ircs://psyced.org/youbroketheinternet