[messaging] Using Elasticsearch on the SKS/GPG keyserver pool

Daniel Roesler diafygi at gmail.com
Tue Apr 7 18:05:38 PDT 2015


Howdy all,

I've been running a keyserver in the SKS keyserver pool[1][2] (the pool
GnuPG uses as its default keyserver) for a few months, and I recently
began to wonder what cool stats I could find out about all those keys.
Unfortunately, the SKS keyserver software is really only set up for
simple searching, which meant I needed to dump the database into
another data store.

So I wrote a Python OpenPGP parser[3] (mostly to learn more about the
OpenPGP format) that converts PGP keys to JSON, then loaded the keys
into an Elasticsearch index[4]. Using JSON and Elasticsearch makes it
pretty easy to run some interesting queries on the keys. I'd love
feedback, and I'd love for people to see what they can find. You can
also download the raw JSON dump files[5].
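
For anyone who wants to load keys into their own instance, here is a
minimal sketch of turning parsed keys into a request body for
Elasticsearch's bulk API. The "keyserver" index name is taken from the
URLs below; the sample documents and the use of key_id as the document
id are assumptions for illustration, not necessarily what the scripts
in [12] do. Depending on the Elasticsearch version, the action line may
also need a _type.

```python
import json


def bulk_body(keys, index="keyserver"):
    """Build a newline-delimited bulk request body from parsed key dicts."""
    lines = []
    for key in keys:
        # Action line: index this document, using key_id as the id (assumption).
        lines.append(json.dumps({"index": {"_index": index, "_id": key["key_id"]}}))
        # Source line: the key document itself.
        lines.append(json.dumps(key))
    # The bulk API expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample documents; real keys carry many more fields.
sample = [
    {"key_id": "0123456789abcdef", "algo_name": "RSA"},
    {"key_id": "fedcba9876543210", "algo_name": "DSA"},
]
body = bulk_body(sample)
```

The resulting string can be sent to the index's _bulk endpoint with any
HTTP client.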

Here are some of my cool stats so far:

Total keys[6]: 3.9 million
Total keys with pictures[7]: 59k
Total RSA keys[8]: 1.3 million
Total DSA keys[9]: 2.7 million
Total ECDSA keys[10]: 408
Total EdDSA keys[11]: 112
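
The numbers above come from URI searches like the ones linked in
[7]-[11]. As a rough sketch, those URLs can be built like this (the
search_url helper is just a made-up name for illustration; the
parameters are the ones that appear in the links):

```python
from urllib.parse import urlencode

BASE = "https://keyserver-elasticsearch.daylightpirates.org/keyserver"


def search_url(term, field, source_fields):
    """Build a URI search like the ones in the footnotes."""
    params = [
        ("q", term),                                    # text to search for
        ("fields", field),                              # field list, as in the links
        ("_source_include", ",".join(source_fields)),   # parts of each doc to return
        ("pretty", "1"),                                # human-readable JSON
    ]
    # safe="," keeps the commas in _source_include unescaped.
    return BASE + "/_search?" + urlencode(params, safe=",")


rsa_url = search_url("RSA", "algo_name", ["key_id", "algo_name"])
jpeg_url = search_url(
    "JPEG",
    "packets.subpackets.encoding",
    ["key_id", "packets.user_id", "packets.subpackets.image"],
)
```

Swapping the search term and field is enough to try other algorithms or
packet types.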

Here are some questions I'd like to explore (mostly around signature
verification):

* How many key signatures cannot be verified (i.e. their issuer is not
in the keyserver)?
* How many key signatures are verified?
* How many key signatures are invalid?
* Who has the most faked signatures?
* A visualization of the verified signature web clusters?
* A visualization of the invalid signature web clusters?
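
The first question boils down to a set membership check. Here is a
minimal sketch, assuming a simplified layout where each key has a
key_id and a flat "signatures" list with an "issuer" key id. Those two
signature field names are hypothetical; the real JSON nests signatures
inside packets, so the lookup would need adapting.

```python
def count_unverifiable(keys):
    """Count signatures whose issuer key is not present in the data set."""
    # Every key id that exists on the keyserver.
    known = {key["key_id"] for key in keys}
    missing = 0
    total = 0
    for key in keys:
        # "signatures" and "issuer" are hypothetical field names.
        for sig in key.get("signatures", []):
            total += 1
            if sig["issuer"] not in known:
                missing += 1
    return missing, total


# Made-up sample: key A is signed by itself and by B, key B by an unknown key C.
sample = [
    {"key_id": "A", "signatures": [{"issuer": "A"}, {"issuer": "B"}]},
    {"key_id": "B", "signatures": [{"issuer": "C"}]},
]
missing, total = count_unverifiable(sample)
```

Signatures whose issuer is present would still need a real
cryptographic check before they count as verified.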

I'm brand new to Elasticsearch, so I encourage people with actual data
science skills to explore on their own (either with the raw JSON dump
or the Elasticsearch instance). I've provided instructions for how to
recreate the JSON dump and the Elasticsearch index in a GitHub
repo[12]. The JSON includes the signature payload data so you can
actually verify signatures (I just haven't done that yet).

Thoughts? Feedback?

Thanks!
Daniel Roesler
diafygi at gmail.com

[1]: https://sks-keyservers.net/status/
[2]: https://sks.daylightpirates.org/
[3]: https://github.com/diafygi/openpgp-python
[4]: https://keyserver-elasticsearch.daylightpirates.org/
[5]: https://keyserver-elasticsearch.daylightpirates.org/dump/
[6]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_count?pretty=1
[7]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=JPEG&fields=packets.subpackets.encoding&_source_include=key_id,packets.user_id,packets.subpackets.image&pretty=1
[8]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=RSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
[9]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=DSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
[10]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=ECDSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
[11]: https://keyserver-elasticsearch.daylightpirates.org/keyserver/_search?q=EdDSA&fields=algo_name&_source_include=key_id,algo_name&pretty=1
[12]: https://github.com/diafygi/keyserver-elasticsearch

