[messaging] Test Data for the Usability Study

Tue Jun 24 21:01:19 PDT 2014

On Mon, Jun 23, 2014 at 2:34 PM, Tom Ritter <tom at ritter.vg> wrote:
> I implemented this on a branch
> (https://github.com/tomrittervg/crypto-usability-study/commit/9df0e72f15391128b6b067e891323363780cb451
> ), and ran into three issues:
>
> 1) I also am not sure if, when we flip the bits, they should be
> flipped at random, or just negated.  My gut says negated...
> 2) The 850 word corpus does not translate directly into an even number
> of bits.  I wound up making it 14 words, each representing 9 bits (using
> 512 of the words)
> 3) The more I thought about it, and then verified, the fingerprints
> barely match at all.
>
> Negation:
> wood - be - jump - though - punishment - for - company - animal - far
> - you - unit - snow - cover - father
> disease - society - wool - punishment - to - even - edge - again -
> hour - base - wood - as - amusement - daughter
>
> Random:
> attention - smell - behavior - smile - rain - the - wood - food -
> stage - get - almost - competition - increase - earth
> birth - cough - apparatus - soap - knowledge - of - band - friend -
> snow - get - then - stretch - belief - earth

Yeah, to point out the obvious - if those are supposed to be fuzzy
matches from a 2^80 attacker, they're not very good.

If each word encodes 9 bits, and you're trying to simulate an attacker
who can do ~2^80 work, why don't you just set 9 of the words equal?
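
A minimal sketch of that idea, assuming the 14-word, 9-bits-per-word
encoding from the quoted message. Nine matching words cover 9 * 9 = 81
bits, which is roughly what a ~2^80 attacker could force. The function
names are hypothetical and not taken from the study's code. Picking the
matched positions at random and forcing the other five to differ are my
simplifications: a real attacker might pick which positions to match, and
the remaining words could collide by chance.

```python
import secrets

WORDS_PER_FINGERPRINT = 14  # from the thread: 14 words per fingerprint
BITS_PER_WORD = 9           # from the thread: 9 bits per word (512-word list)
MATCHED_WORDS = 9           # 9 words * 9 bits = 81 bits, about 2^80 work


def random_indices(n=WORDS_PER_FINGERPRINT, bits=BITS_PER_WORD):
    """Return n random word indices, each in range(2**bits)."""
    return [secrets.randbelow(1 << bits) for _ in range(n)]


def partial_match(target, matched=MATCHED_WORDS, bits=BITS_PER_WORD):
    """Copy `matched` randomly chosen positions from `target`.

    Every other position gets a uniformly random index that is
    guaranteed to differ from the target's index (an assumption made
    here so the number of matching words is exact).
    """
    rng = secrets.SystemRandom()
    keep = set(rng.sample(range(len(target)), matched))
    result = []
    for position, index in enumerate(target):
        if position in keep:
            result.append(index)
        else:
            # Draw from the 2**bits - 1 other values, skipping `index`.
            other = secrets.randbelow((1 << bits) - 1)
            result.append(other if other < index else other + 1)
    return result
```

Mapping the indices back to words is then a lookup into whatever
512-word list the study uses, e.g. `[wordlist[i] for i in indices]`.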

The poem generator uses some bits to determine grammatical structure,
and most of the bits to choose words.  So maybe set the structure the
same, and then use the rest of the bits to set some number of words
equal?
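
A rough sketch of how that could look at the bit level. The layout here
is an assumption, not the poem generator's actual format: I treat the
fingerprint as an integer whose top `structure_bits` select the
grammatical structure, followed by fixed-width word fields. All
parameter names are hypothetical.

```python
import secrets


def match_structure_and_words(target, total_bits, structure_bits,
                              word_bits, matched_words):
    """Keep the structure bits and the first `matched_words` word fields
    of `target`; replace all remaining low bits with random bits.

    Assumes (hypothetically) that structure bits come first, followed by
    word fields of `word_bits` bits each.
    """
    kept_bits = structure_bits + matched_words * word_bits
    if kept_bits > total_bits:
        raise ValueError("cannot keep more bits than the fingerprint has")
    free_bits = total_bits - kept_bits
    # Zero out the low `free_bits`, keeping the shared prefix.
    prefix = (target >> free_bits) << free_bits
    # Fill the freed bits with fresh randomness (nothing to fill if 0).
    filler = secrets.randbits(free_bits) if free_bits else 0
    return prefix | filler
```

To simulate a ~2^80 attacker, the structure bits plus the matched word
bits would be chosen to add up to about 80.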

This is rough; obviously soundalike / lookalike metrics would be
better, but for a first cut maybe it's good enough?

Trevor