<div dir="ltr"><div>An attempt to summarize the two problem areas being discussed in this thread:</div><div><br></div><div>A) Long-term key fingerprints probably need to be a bijection from 128-bit bitstrings to a human-friendly form. We can shrink this space (as Trevor mentioned) by key stretching: searching for a random nonce (or random public key) that happens to produce a 128-bit hash starting with N zero bits, which are then truncated. One can trade off generation cost against verification cost here: if you can do 28 bits of work at generation time, you effectively only need to transmit and check a 100-bit value. Long-term key fingerprints should be as easy as possible to display (e.g. on business cards), type, write, and check, and (as a bonus) pronounceable.</div>
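<div><br></div><div>As a rough sketch of the stretching idea in (A): the code below assumes SHA-256 truncated to 128 bits as the hash and an 8-byte random nonce. Both are placeholder choices, and the names hash128, generate, and verify are made up for illustration. The nonce has to travel with the key so that the verifier can recompute the hash.</div>

```python
import hashlib
import os


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def hash128(public_key, nonce):
    # Assumption: SHA-256 truncated to 128 bits stands in for the real hash.
    return hashlib.sha256(public_key + nonce).digest()[:16]


def generate(public_key, zero_bits):
    """Try random nonces until the hash starts with `zero_bits` zero bits."""
    while True:
        nonce = os.urandom(8)
        digest = hash128(public_key, nonce)
        if leading_zero_bits(digest) >= zero_bits:
            # The zero prefix carries no information, so only the remaining
            # (128 - zero_bits) bits need to be displayed and compared.
            return nonce, int.from_bytes(digest, "big")


def verify(public_key, nonce, fingerprint, zero_bits):
    """Recompute the hash once; check both the zero prefix and the value."""
    digest = hash128(public_key, nonce)
    return (leading_zero_bits(digest) >= zero_bits
            and int.from_bytes(digest, "big") == fingerprint)
```

<div>Generation costs about 2^N hash evaluations on average, while verification is a single hash plus a comparison of the shorter value.</div>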
<div><br></div><div>B) The requirements for ephemeral authentication secrets vary by protocol, but in the simplest case (e.g. the Socialist Millionaire protocol) they can be anything and only need to be about 30-40 bits. In that case all we really need is an invertible function from 30-40 random bits to a value that is easy to recognize and (as a bonus) pronounce. In other cases they'll have to be generated as a hash of a longer value. The best setup is probably an invertible function E applied to a truncated hash.</div>
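<div><br></div><div>A minimal sketch of what E could look like for (B), assuming a 1024-entry (10-bit) list so that 40 bits map to 4 words. The list here is a synthetic consonant-vowel-consonant placeholder rather than a real wordlist, SHA-256 is an arbitrary stand-in for the hash, and all function names are hypothetical.</div>

```python
import hashlib
from itertools import product

# Placeholder list: 16 * 4 * 16 = 1024 distinct CVC strings (10 bits each).
CONSONANTS = "bdfghjklmnprstvz"
VOWELS = "aeio"
WORDS = ["".join(p) for p in product(CONSONANTS, VOWELS, CONSONANTS)]
INDEX = {word: i for i, word in enumerate(WORDS)}
BITS_PER_WORD = 10


def truncated_hash(data, bits):
    """Keep the top `bits` bits of SHA-256(data) as an integer."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def encode(value, n_words):
    """E: map an integer of n_words * 10 bits to a list of words."""
    if not 0 <= value < 1 << (BITS_PER_WORD * n_words):
        raise ValueError("value does not fit in the requested word count")
    mask = (1 << BITS_PER_WORD) - 1
    shifts = range(BITS_PER_WORD * (n_words - 1), -1, -BITS_PER_WORD)
    return [WORDS[(value >> shift) & mask] for shift in shifts]


def decode(words):
    """Inverse of E: rebuild the integer from the word sequence."""
    value = 0
    for word in words:
        value = (value << BITS_PER_WORD) | INDEX[word]
    return value
```

<div>Because E is invertible, a typed-in word sequence can be decoded back to bits and compared exactly, rather than compared as text.</div>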
<div><br></div><div>There are possibly other use cases (including keys that can actually be committed to memory), but these two seem to be the main areas of interest.</div><div><br></div><div>There is probably quite a bit of overlap between them. I like the word list approach, especially for (B), though I'd sound a note of caution on Diceware: the list contains a lot of words that aren't easy to recognize or type. I'd advocate for words of consistent length (4-6 characters) with an edit distance of at least 2 between any pair, so that smartphone typing works: a single typo can never turn one valid word into another. I whipped up one of these a while back and ended up with about 750 words. I'd guess around 1k is the limit.</div>
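<div><br></div><div>For illustration, a greedy filter along those lines might look like the sketch below. This is not the script I used back then; the function names and the choice of plain Levenshtein distance are assumptions.</div>

```python
def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def filter_wordlist(candidates, min_len=4, max_len=6, min_distance=2):
    """Greedily keep words that are far enough from every word kept so far."""
    kept = []
    for word in candidates:
        if not (word.isalpha() and word.islower()):
            continue  # skip numbers, punctuation, mixed case
        if not min_len <= len(word) <= max_len:
            continue
        if all(edit_distance(word, other) >= min_distance for other in kept):
            kept.append(word)
    return kept
```

<div>The greedy pass depends on input order and is quadratic in the list size, so ordering the candidates by familiarity first is probably worthwhile.</div>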
<div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Feb 6, 2014 at 10:48 PM, Trevor Perrin <span dir="ltr">&lt;<a href="mailto:trevp@trevp.net" target="_blank">trevp@trevp.net</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On Thu, Feb 6, 2014 at 1:40 AM, Daniel Thomas &lt;<a href="mailto:drt24%2Bmessaging@cam.ac.uk">drt24+messaging@cam.ac.uk</a>&gt; wrote:<br>
> On 06/02/14 01:35, Trevor Perrin wrote:<br>
>> I like the smaller size of the pseudowords, particularly for<br>
>> transcribing these things, spelling out the characters over the phone,<br>
>> or viewing on a small screen. And a lot of the words are unusual so<br>
>> are going to need to be spelled out.<br>
>><br>
>> But it would be interesting to see what a better wordlist looks like.<br>
><br>
> Diceware[0] has a (fairly short, 7776-entry) word list in multiple languages<br>
> for the purpose of generating easy to remember passphrases.<br>
<br>
<br>
</div>Hi Daniel,<br>
<br>
That's about a 13-bit list, which seems like a good middle ground<br>
between an 8-bit list (like PGPfone) and a 16-bit list. An 8-bit list<br>
necessitates 16-word fingerprints for 128-bit security, which feels<br>
like too many words. A 16-bit list contains 65K words, which is more<br>
than most people's vocabulary, meaning a lot of unusual words that<br>
would have to be spelled out.<br>
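<br>
As a quick check of those counts: a fingerprint needs ceil(128 / log2(list size)) words. A small sketch (the helper name words_needed is just for illustration):<br>

```python
import math


def words_needed(security_bits, list_size):
    """Smallest number of words whose combined entropy covers security_bits."""
    bits_per_word = math.log2(list_size)
    return math.ceil(security_bits / bits_per_word)


# 8-bit list, Diceware (7776 entries, about 12.9 bits), 16-bit list
for size in (256, 7776, 65536):
    print(size, words_needed(128, size))
```

This gives 16, 10, and 8 words respectively for 128-bit security.<br>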
<br>
The Diceware dictionary is designed around short words and word<br>
fragments (it includes numbers, punctuation, and non-words, which is a<br>
bit weird IMO). I wrote a script to generate 10 random Diceware words<br>
to see what fingerprints might look like:<br>
<br>
<a href="https://github.com/trevp/keyname" target="_blank">https://github.com/trevp/keyname</a><br>
<br>
<br>
hop - flu - urn - belie - gogo - gravy - mayor - avow - plush - enter<br>
<br>
bump - seem - soft - lm - plane - exit - plus - stilt - behind - malta<br>
<br>
tract - rude - rhine - ready - climb - fell - fell - reek - cody - kudzu<br>
<br>
bunch - sound - adler - galt - signor - glom - soup - on - lund - juju<br>
<br>
essay - eave - ef - pro - stung - gn - smash - josef - vetch - busy<br>
<br>
dawson - tic - vy - cake - rock - sr - store - ice - plunk - gp<br>
<br>
old - swept - win - mike - xy - chill - seethe - allow - alva - jh<br>
<br>
grace - curia - coke - rebut - 15 - foray - jaw - weco - anvil - buenos<br>
<br>
pn - adair - swelt - faith - slash - berlin - watch - blood - start - santa<br>
<br>
grow - del - bon - 99th - kepler - cam - fun - 37th - dryad - prone<br>
<br>
<br>
Below, 5 Diceware fingerprints are compared side by side with 5 pseudoword<br>
fingerprints of score=18. The pseudoword fingerprints took an average<br>
of ~30 seconds apiece to generate on a single core of my MacBook Air.<br>
(The max possible score is 20; a score of 18 means 2 deviations from<br>
vowel/consonant alternation):<br>
<br>
<br>
oman - swath - haze - elmer - gouda - admix - feat - afar - reel - for<br>
<br>
ukigex - 3kiw - jejod - yvak - rewupa<br>
<br>
<br>
blitz - teal - emma - bambi - queen - 92 - mecum - om - derek - twa<br>
<br>
lijuv7 - woxm - pokoj - cixa - ehajen<br>
<br>
<br>
op - zomba - 84th - soy - oval - evolve - spook - fk - ghi - magog<br>
<br>
syivoh - upim - leewo - hoda - madeso<br>
<br>
<br>
piotr - vain - david - mk - gasp - buoy - malt - az - hang - rena<br>
<br>
bewora - zutm - hirub - ugux - tlezeb<br>
<br>
<br>
perk - fate - cinch - gulf - jb - marks - wag - canoe - sprig - maw<br>
<br>
ripoyu - ime2 - fenef - aqos - lehnof<br>
<br>
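The exact scoring rule is not spelled out here. One reading that reproduces score=18 for all five pseudoword fingerprints above is to count the adjacent character pairs within each word that alternate between vowel and consonant, treating "y" as a consonant and digits as neither. That is a guess at the rule, not necessarily what the keyname script does, and the function names are hypothetical:<br>

```python
VOWELS = set("aeiou")
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")  # assumption: "y" is a consonant


def kind(ch):
    """Classify a character; digits and anything else get no class."""
    if ch in VOWELS:
        return "V"
    if ch in CONSONANTS:
        return "C"
    return None


def alternation_score(fingerprint):
    """Count adjacent in-word pairs that switch between vowel and consonant."""
    score = 0
    for word in fingerprint.split(" - "):
        for a, b in zip(word, word[1:]):
            ka, kb = kind(a), kind(b)
            if ka is not None and kb is not None and ka != kb:
                score += 1
    return score


print(alternation_score("ukigex - 3kiw - jejod - yvak - rewupa"))
```

With word lengths 6-4-5-4-6 there are 20 adjacent pairs in total, which matches the stated maximum of 20.<br>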
<br>
Both approaches seem pretty decent, not sure which is best. Choosing<br>
13-bit wordlists for different languages and dealing with<br>
cross-language compatibility seems a hassle, but so is computing tens<br>
of millions of hashes for a fingerprint.<br>
<br>
There's a lot more that could be done here: e.g. make a better<br>
wordlist than Diceware, or optimize the pseudoword search and do<br>
better scoring.<br>
<br>
If anyone wants to do UX research, these would be great projects...<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
Trevor<br>
_______________________________________________<br>
Messaging mailing list<br>
<a href="mailto:Messaging@moderncrypto.org">Messaging@moderncrypto.org</a><br>
<a href="https://moderncrypto.org/mailman/listinfo/messaging" target="_blank">https://moderncrypto.org/mailman/listinfo/messaging</a><br>
</div></div></blockquote></div><br></div>