[messaging] Using Elasticsearch on the SKS/GPG keyserver pool

zaki at manian.org zaki at manian.org
Wed Apr 8 09:21:38 PDT 2015


I'm starting to believe that Elasticsearch aggregations will be able to
produce results equivalent to a graph search.

http://blog.qbox.io/elasticsearch-aggregations

http://www.elastic.co/guide/en/elasticsearch/reference/1.x/search-aggregations.html
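
A minimal sketch of the aggregation idea, assuming the `keyserver` index
and the `algo_name` field that appear in the query URLs quoted below. The
`by_algo` label, the bucket limit, and both helper names are hypothetical,
and none of this has been run against the live index. One terms aggregation
could return all the per-algorithm counts in a single request:

```python
import json


def algo_terms_aggregation(field="algo_name", max_buckets=20):
    """Build a search body that counts documents per distinct value of `field`.

    `field` defaults to the name seen in the quoted URLs; `max_buckets`
    is an arbitrary illustrative limit.
    """
    return {
        "size": 0,  # ask for no hits, only the aggregation result
        "aggs": {
            "by_algo": {  # hypothetical label for this aggregation
                "terms": {"field": field, "size": max_buckets}
            }
        },
    }


def bucket_counts(response, agg_name="by_algo"):
    """Flatten a terms-aggregation response into a {key: doc_count} dict."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


if __name__ == "__main__":
    # Print the JSON body that would be sent to /keyserver/_search.
    print(json.dumps(algo_terms_aggregation(), indent=2))
```

The body would be sent to `/keyserver/_search`. If `algo_name` is an
analyzed string field, the bucket keys may come back as lowercased tokens
rather than the original values.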

But I'm also familiar with cayley.

Definitely an interesting project for the weekend.
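
For comparison, the graph-style question in the quoted message below (which
signatures name an issuer that is not in the dump) can be sketched in plain
Python before reaching for a graph database. The input shape here, a
`key_id` plus a flat list of `issuers`, is a simplified assumption and not
the actual layout of the JSON dump, and every function name is hypothetical:

```python
from collections import Counter


def signature_edges(keys):
    """Yield (issuer_key_id, signed_key_id) pairs, skipping self-signatures."""
    for key in keys:
        for issuer in key["issuers"]:
            if issuer != key["key_id"]:
                yield issuer, key["key_id"]


def partition_edges(keys):
    """Split edges by whether the issuer's key is present in `keys`.

    Returns (resolvable, dangling). A resolvable edge only means the issuer
    key is available to check against; it is not a cryptographic check.
    """
    known = {key["key_id"] for key in keys}
    resolvable, dangling = [], []
    for issuer, signed in signature_edges(keys):
        target = resolvable if issuer in known else dangling
        target.append((issuer, signed))
    return resolvable, dangling


def most_signed(edges, n=3):
    """Return the n key IDs that receive the most signatures in `edges`."""
    return Counter(signed for _, signed in edges).most_common(n)


if __name__ == "__main__":
    sample = [
        {"key_id": "A", "issuers": ["A", "B", "X"]},
        {"key_id": "B", "issuers": ["A"]},
    ]
    resolvable, dangling = partition_edges(sample)
    print(resolvable, dangling)
```

Each edge is a directed link from issuer to signed key, which is the same
shape a graph database would store. Actual verification would still need
the signature payload data and an OpenPGP implementation.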


On Wed, Apr 8, 2015 at 8:19 AM, David Leon Gil <coruus at gmail.com> wrote:

> One suggestion: Post to the IETF OpenPGP mailing list about this, if you
> haven't. (I think I've posted some similar statistics there a while back.)
>
> --
>
> I, personally, found that for simple questions, it was fastest to just
> scan through a keydump with a C parser. The great virtue of your approach
> is that it allows asking much more complicated questions!
>
> (I tried to modify python-pgpdump, fork at
> https://github.com/coruus/python-pgpdump, but I've only tested it on a
> single keydump file.)
>
> --
>
> If you are at all familiar with Go, it has both
>
> - a generally good OpenPGP implementation that can verify signatures,
> golang.org/x/crypto/openpgp
> - and a good graph database, github.com/google/cayley
>
> which might make it easier to answer the questions you raise.
>
> (Sadly, I don't have the time to do anything about this but thank you for
> making this information easily searchable.)
>
> - David
>
>
> On Tue, Apr 7, 2015 at 6:06 PM Daniel Roesler <diafygi at gmail.com> wrote:
>
>> Howdy all,
>>
>> I've been running a keyserver in the SKS keyserver pool[1][2] for a
>> few months (which is what GnuPG uses as its default keyserver), and I
>> recently began to wonder what cool stats I could find out about all
>> those keys. Unfortunately, the sks keyserver system is really only
>> set up for simple searching, which meant I needed to dump the database
>> to another repo.
>>
>> So I wrote a python OpenPGP parser[3] (mostly to learn more about the
>> OpenPGP format) that converted PGP keys to json, then dumped the keys
>> into an elasticsearch index[4]. Using json and elasticsearch makes it
>> pretty easy to make some interesting queries on the keys. I'd love
>> feedback, and for people to see what they can find. You can also download
>> the raw json dump files[5].
>>
>> Here are some of my cool stats so far:
>>
>> Total keys[6]: 3.9 million
>> Total keys with pictures[7]: 59k
>> Total RSA keys[8]: 1.3 million
>> Total DSA keys[9]: 2.7 million
>> Total ECDSA keys[10]: 408
>> Total EdDSA keys[11]: 112
>>
>> Here are some questions I'd like to explore (mostly around signature
>> verification):
>>
>> * How many key signatures cannot be verified (i.e. their issuer is not
>> in the keyserver)?
>> * How many key signatures are verified?
>> * How many key signatures are invalid?
>> * Who has the most faked signatures?
>> * A visualization of the verified signature web clusters?
>> * A visualization of the invalid signature web clusters?
>>
>> I'm brand new to elasticsearch, so I encourage people with actual data
>> science skills to explore on their own (either with the raw json dump
>> or the elasticsearch instance). I've provided instructions for how to
>> recreate the json dump and the elasticsearch repo in a github
>> repo[12]. The json includes the signature payload data so you can
>> actually verify signatures (I just haven't done that yet).
>>
>> Thoughts? Feedback?
>>
>> Thanks!
>> Daniel Roesler
>> diafygi at gmail.com
>>
>> [1]: https://sks-keyservers.net/status/
>> [2]: https://sks.daylightpirates.org/
>> [3]: https://github.com/diafygi/openpgp-python
>> [4]: https://keyserver-elasticsearch.daylightpirates.org/
>> [5]: https://keyserver-elasticsearch.daylightpirates.org/dump/
>> [6]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_count?pretty=1
>> [7]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=JPEG&fields=packets.subpackets.encoding&_source_include=key_id,packets.user_id,packets.subpackets.image&pretty=1
>> [8]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=RSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
>> [9]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=DSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
>> [10]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=ECDSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
>> [11]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=EdDSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
>> [12]: https://github.com/diafygi/keyserver-elasticsearch
>> _______________________________________________
>> Messaging mailing list
>> Messaging at moderncrypto.org
>> https://moderncrypto.org/mailman/listinfo/messaging
>>