[messaging] Using Elasticsearch on the SKS/GPG keyserver pool

David Leon Gil coruus at gmail.com
Wed Apr 8 08:19:32 PDT 2015

One suggestion: Post to the IETF OpenPGP mailing list about this, if you
haven't. (I think I've posted some similar statistics there a while back.)


I, personally, found that for simple questions, it was fastest to just scan
through a keydump with a C parser. The great virtue of your approach is
that it allows asking much more complicated questions!
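
For a sense of what such a scan involves, here is a minimal sketch in Python (not the C parser mentioned above). It assumes the keydump is a binary, non-armored stream of concatenated OpenPGP packets. It walks the RFC 4880 packet headers and tallies the public-key algorithm of each primary key packet (tag 6). The function names and the algorithm-name table are illustrative only, and partial or indeterminate body lengths are rejected rather than handled.

```python
from collections import Counter

# Public-key algorithm IDs from the OpenPGP registry (subset, for illustration).
ALGO_NAMES = {1: "RSA", 2: "RSA", 3: "RSA", 16: "Elgamal",
              17: "DSA", 18: "ECDH", 19: "ECDSA", 22: "EdDSA"}


def iter_packets(data):
    """Yield (tag, body) for each packet in a binary OpenPGP stream."""
    pos = 0
    while pos < len(data):
        first = data[pos]
        if not first & 0x80:
            raise ValueError("no packet header at offset %d" % pos)
        pos += 1
        if first & 0x40:
            # New-format header: 6-bit tag, variable-length length field.
            tag = first & 0x3F
            o1 = data[pos]
            if o1 < 192:
                length = o1
                pos += 1
            elif o1 < 224:
                length = ((o1 - 192) << 8) + data[pos + 1] + 192
                pos += 2
            elif o1 == 255:
                length = int.from_bytes(data[pos + 1:pos + 5], "big")
                pos += 5
            else:
                raise ValueError("partial body lengths are not handled here")
        else:
            # Old-format header: 4-bit tag, 2-bit length type.
            tag = (first >> 2) & 0x0F
            length_type = first & 0x03
            if length_type == 3:
                raise ValueError("indeterminate lengths are not handled here")
            size = 1 << length_type  # 1, 2 or 4 length octets
            length = int.from_bytes(data[pos:pos + size], "big")
            pos += size
        yield tag, data[pos:pos + length]
        pos += length


def count_primary_key_algorithms(data):
    """Count primary keys (tag 6 packets) by public-key algorithm."""
    counts = Counter()
    for tag, body in iter_packets(data):
        if tag != 6:
            continue
        version = body[0] if body else None
        if version == 4 and len(body) > 5:
            algo = body[5]   # version(1) + creation time(4), then algorithm
        elif version in (2, 3) and len(body) > 7:
            algo = body[7]   # version(1) + time(4) + validity(2), then algorithm
        else:
            counts["unparsed"] += 1
            continue
        counts[ALGO_NAMES.get(algo, "other (%d)" % algo)] += 1
    return counts
```

On a real dump one would pass in the bytes of each dump file, e.g. `count_primary_key_algorithms(open(path, "rb").read())`, where `path` is whatever file the dump produced. For very large files, memory-mapping or chunked reading would probably be a better fit than reading everything at once.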

(I tried to modify python-pgpdump, fork at
https://github.com/coruus/python-pgpdump, but I've only tested it on a
single keydump file.)


If you are at all familiar with Go, it has both

- a generally good OpenPGP implementation that can verify signatures,
- and a good graph database, github.com/google/cayley

which might make it easier to answer the questions you raise.
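
As a rough illustration of the graph side, the sketch below uses Python with plain sets and dicts instead of Go and cayley. It assumes the signatures have already been extracted as (issuer key ID, signed key ID) pairs, which is a hypothetical input shape and not the layout of any particular dump. It separates signatures whose issuer is missing from the key set from those that could at least be checked, then groups the latter into clusters. It performs no cryptographic verification; "checkable" only means the issuer key is present.

```python
from collections import defaultdict


def split_by_issuer_presence(known_key_ids, signatures):
    """Split (issuer, signed) pairs by whether the issuer key is available."""
    known = set(known_key_ids)
    checkable, unverifiable = [], []
    for issuer, signed in signatures:
        if issuer in known:
            checkable.append((issuer, signed))
        else:
            unverifiable.append((issuer, signed))
    return checkable, unverifiable


def clusters(edges):
    """Return connected components, treating each edge as undirected."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, result = set(), []
    for start in neighbours:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result
```

A dedicated graph database would earn its keep once the edge list no longer fits comfortably in memory, or once the queries get more involved than connected components.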

(Sadly, I don't have the time to do anything about this, but thank you for
making this information easily searchable.)

- David

On Tue, Apr 7, 2015 at 6:06 PM Daniel Roesler <diafygi at gmail.com> wrote:

> Howdy all,
> I've been running a keyserver in the SKS keyserver pool[1][2] for a
> few months (which is what GnuPG uses as its default keyserver), and I
> recently began to wonder what cool stats I could find out about all
> those keys. Unfortunately, the sks keyserver system is really only
> set up for simple searching, which meant I needed to dump the database
> to another repo.
> So I wrote a python OpenPGP parser[3] (mostly to learn more about the
> OpenPGP format) that converted PGP keys to json, then dumped the keys
> into an elasticsearch index[4]. Using json and elasticsearch makes it
> pretty easy to run some interesting queries on the keys. I'd love
> feedback, and for people to see what they can find. You can also download
> the raw json dump files[5].
> Here are some of my cool stats so far:
> Total keys[6]: 3.9 million
> Total keys with pictures[7]: 59k
> Total RSA keys[8]: 1.3 million
> Total DSA keys[9]: 2.7 million
> Total ECDSA keys[10]: 408
> Total EdDSA keys[11]: 112
> Here are some questions I'd like to explore (mostly around signature
> verification):
> * How many key signatures cannot be verified (i.e. their issuer is not
> in the keyserver)?
> * How many key signatures are verified?
> * How many key signatures are invalid?
> * Who has the most faked signatures?
> * A visualization of the verified signature web clusters?
> * A visualization of the invalid signature web clusters?
> I'm brand new to elasticsearch, so I encourage people with actual data
> science skills to explore on their own (either with the raw json dump
> or the elasticsearch instance). I've provided instructions for how to
> recreate the json dump and the elasticsearch repo in a github
> repo[12]. The json includes the signature payload data so you can
> actually verify signatures (I just haven't done that yet).
> Thoughts? Feedback?
> Thanks!
> Daniel Roesler
> diafygi at gmail.com
> [1]: https://sks-keyservers.net/status/
> [2]: https://http://sks.daylightpirates.org/
> [3]: https://github.com/diafygi/openpgp-python
> [4]: https://keyserver-elasticsearch.daylightpirates.org/
> [5]: https://keyserver-elasticsearch.daylightpirates.org/dump/
> [6]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_count?pretty=1
> [7]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=JPEG&fields=packets.subpackets.encoding&_source_include=key_id,packets.user_id,packets.subpackets.image&pretty=1
> [8]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=RSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
> [9]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=DSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
> [10]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=ECDSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
> [11]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=EdDSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
> [12]: https://github.com/diafygi/keyserver-elasticsearch
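
The query URLs in footnotes [6] and [8] through [11] all follow one pattern, so they can be generated instead of typed by hand. Below is a small Python sketch that uses only the parameters appearing in those footnotes. The helper names are made up for illustration, and nothing here contacts the server.

```python
from urllib.parse import urlencode

# Base path taken from the footnote URLs in the quoted message.
BASE = "https://keyserver-elasticsearch.daylightpirates.org/keyserver"


def count_url():
    """URL for the total document count, as in footnote [6]."""
    return BASE + "/_count?" + urlencode({"pretty": 1})


def algo_search_url(algo):
    """URL for a free-text search on an algorithm name, as in [8]-[11]."""
    params = [
        ("q", algo),
        ("fields", "algo_name"),
        ("_source_include", "key_id,algo_name"),
        ("pretty", 1),
    ]
    # safe="," keeps the comma-separated field list readable in the URL.
    return BASE + "/_search?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(count_url())
    for name in ("RSA", "DSA", "ECDSA", "EdDSA"):
        print(algo_search_url(name))
```

If the instance is reachable, each URL can be fetched with `urllib.request.urlopen` and the response body passed to `json.loads`. The shape of the JSON that comes back depends on the Elasticsearch version the server runs.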
