[messaging] "Pseudoword" base32 fingerprints

Trevor Perrin trevp at trevp.net
Sun Feb 9 23:00:11 PST 2014

On Fri, Feb 7, 2014 at 4:54 PM, Joseph Bonneau <jbonneau at gmail.com> wrote:
> An attempt to summarize two problem areas being discussed in this thread:
> A) Long-term key fingerprints probably need to be a bijection from 128-bit
> long bitstrings to human-friendly form. We can lower this space (as Trevor
> mentioned) by key stretching: searching for a random nonce (or random public
> key) which happens to produce a 128-bit hash starting with N zeros, which
> are truncated.

That's similar to what I proposed, but not identical.

You're proposing the fingerprint generator could search for a hash
that starts with (say) 28 bits of zeros, then encode the following 100
bits (e.g. via word lists or syllable lists).  The verifier would
decode the 100 bits, then add the implied 28 zero bits to recover the
128-bit hash.
The downside is that your verifier needs to be aware of the
user-friendly encoding (e.g. the word or syllable list), and the
number of implied zero bits, so these have to be standardized or
encoded somehow.  Also, your generator *must* search to get a 128-bit
security level, since you're only encoding 100 bits.
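
A minimal sketch of that zero-prefix scheme, to make the verifier's
extra knowledge concrete.  Assumptions that are mine and not from the
thread: SHA-256 truncated to 128 bits as the hash, an 8-byte random
nonce appended to the public key, and the names hash128, generate and
verify.  The word or syllable encoding of the kept bits is left out.

```python
import hashlib
import os

HASH_BITS = 128  # hash length discussed in the thread


def hash128(public_key: bytes, nonce: bytes) -> int:
    """First 128 bits of SHA-256(public_key || nonce), as an integer (assumed hash)."""
    digest = hashlib.sha256(public_key + nonce).digest()
    return int.from_bytes(digest[:16], "big")


def generate(public_key: bytes, zero_bits: int) -> tuple[bytes, int]:
    """Search for a nonce whose hash starts with zero_bits zero bits.

    Returns the nonce and the remaining (HASH_BITS - zero_bits) bits,
    which are what would be encoded as words or syllables.
    """
    kept_bits = HASH_BITS - zero_bits
    while True:
        nonce = os.urandom(8)
        h = hash128(public_key, nonce)
        if h >> kept_bits == 0:  # all leading zero_bits are zero
            return nonce, h & ((1 << kept_bits) - 1)


def verify(public_key: bytes, nonce: bytes, short_value: int, zero_bits: int) -> bool:
    """Re-add the implied zero bits and compare against the recomputed hash."""
    kept_bits = HASH_BITS - zero_bits
    if short_value >> kept_bits:
        return False  # decoded value does not fit in the truncated width
    full_hash = short_value  # zero-extending an integer leaves its value unchanged
    return hash128(public_key, nonce) == full_hash
```

Note that verify needs zero_bits as an input, which is the
standardization burden described above.  With zero_bits=28 the search
takes about 2^28 hash evaluations on average, so use a small value
such as 8 when trying this out.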

In my proposal the verifier just handles a base32 fingerprint without
knowing word or syllable lists or how much searching the generator
did.  Also, a generator who doesn't want to search still gets 125 bits
of security as 125 bits of the hash are encoded.

(A downside of my proposal is that scoring base32 is more expensive
than checking for zero bytes.  In my code the difference is ~2.2
million/sec vs ~3.3 million/sec, but I think most of the difference could
be removed with optimization.)
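
A rough sketch of the base32 variant, to show that the verifier only
compares strings.  The scoring rule here is a stand-in I made up for
illustration (it counts adjacent letter pairs that alternate between
vowel and consonant); it is not the scoring used in my code, and its
numbers are not comparable to the "score=18" below.  SHA-256, the
counter-style nonce, and the function names are likewise assumptions.

```python
import base64
import hashlib

VOWELS = set("aeiouy")  # assumption: treat 'y' as a vowel


def base32_fingerprint(public_key: bytes, nonce: bytes) -> str:
    """First 25 base32 characters (125 bits) of SHA-256(public_key || nonce)."""
    digest = hashlib.sha256(public_key + nonce).digest()
    # 16 bytes encode to 26 base32 characters plus padding; keep the first 25.
    return base64.b32encode(digest[:16]).decode("ascii").lower()[:25]


def score(fingerprint: str) -> int:
    """Hypothetical pronounceability score: count alternating vowel/consonant pairs."""
    total = 0
    for a, b in zip(fingerprint, fingerprint[1:]):
        if a.isalpha() and b.isalpha() and (a in VOWELS) != (b in VOWELS):
            total += 1
    return total


def search(public_key: bytes, tries: int) -> tuple[bytes, str]:
    """Try `tries` nonces and keep the best-scoring fingerprint."""
    best_nonce, best_fp = None, None
    for counter in range(tries):
        nonce = counter.to_bytes(8, "big")
        fp = base32_fingerprint(public_key, nonce)
        if best_fp is None or score(fp) > score(best_fp):
            best_nonce, best_fp = nonce, fp
    return best_nonce, best_fp


def verify(public_key: bytes, nonce: bytes, fingerprint: str) -> bool:
    """The verifier needs no word list, score, or search parameter."""
    return base32_fingerprint(public_key, nonce) == fingerprint
```

A generator that skips the search (tries=1) still produces a valid
25-character fingerprint, and verify is the same either way.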

> B) The requirements for ephemeral authentication secrets vary by protocol,
> but in the simplest case (e.g. Socialist millionaire) they be anything and
> only need to be about 30-40 bits. In that case all we really need is an
> invertible function from 30-40 random bits to a value that is easy to
> recognize and (as a bonus) pronounce.

I wouldn't include OTR's Socialist Millionaire Protocol here, as it's
based on questions ("where did we go for your birthday?"), not random
values.  Short Auth Strings would fit here.  Though I'd argue SAS work
best in audio/video, so verbal usability is most important.  Not sure
I agree that 30-40 bits are necessary; ZRTP allows 16-bit Short Auth
Strings (i.e. two PGPfone words).

> There is probably quite a bit of overlap between them. I like the word list
> approach, especially for (B), though I'd sound a note of caution on
> Diceware: the list contains a lot of words that aren't easy to recognize or
> type.

Yeah, I'd like to see a better 13-bit wordlist.

> I'd advocate for words of consistent length (4-6 characters) with edit
> distance at least 2 between any pair, which would be nice so that smartphone
> typing works. I whipped up one of these a while back, and ended up with
> about 750 words. I'd guess around 1k is the limit

For around a 10-bit wordlist, Michael Rogers suggested Basic English.
See comparison below (Basic English vs Diceware vs base32 pseudowords).

Obviously, smaller wordlists mean longer fingerprints but better
words.  Not sure what the best point on that tradeoff is.
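
The length side of that tradeoff is simple arithmetic: each word from
a list of size n carries log2(n) bits.  A small sketch, assuming a
125-bit target (the amount encoded by 25 base32 characters); the
function names are just for illustration.

```python
import math


def bits_per_symbol(list_size: int) -> float:
    """Bits carried by one uniformly chosen symbol from a list of this size."""
    return math.log2(list_size)


def words_needed(target_bits: int, list_size: int) -> int:
    """Smallest number of symbols that encodes at least target_bits."""
    return math.ceil(target_bits / bits_per_symbol(list_size))


if __name__ == "__main__":
    for name, size in [("Basic English", 850), ("Diceware", 7776), ("base32", 32)]:
        count = words_needed(125, size)
        total = count * bits_per_symbol(size)
        print(f"{name}: {count} symbols, {total:.1f} bits")
```

For 125 bits this gives 13 Basic English words, 10 Diceware words, or
25 base32 characters, which are the lengths used in the comparison.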

I think I still like pseudowords more than wordlists, particularly due
to the simplicity of specification (no need to standardize on
wordlists) and the localization advantage (works anywhere people
understand Latin characters and Arabic numerals, without needing new
wordlists for every language).

Basic English (13 words from ~10 bit list (850 words)) vs
Diceware (10 words from ~13 bit list (7776 words)) vs
Base32 pseudowords (score=18)

cat - cut - snow - white - laugh - harbor - pin - pump - make - week -
farm - left - coat

possible - peace - note - screw - quite - equal - knee - root - slope
- rate - brass - rail - great

cut - please - thunder - insurance - smooth - sister - thread -
purpose - get - bit - please - tray - military

late - representative - value - against - answer - exchange - join -
brake - against - bone - gold - man - harmony

rapid - dab - duma - above - vise - uo - nbs - huff - pj - code

tend - yo - israel - curb - hz - colza - lace - waltz - timex - knox

trash - wack - cancer - rummy - pivot - clove - debby - pat - gown - remit

farad - yule - crest - zeus - keats - flaw - kahn - libya - wj - edify

qufok3 - wigi - yiluz - wims - tiqiwi

ofoyde - zcok - xelol - teya - uxirij

qupiyo - pnax - fayal - noyt - opabiq

saefap - gowo - uziny - dofi - usejey
