[messaging] Let's run a usability study (was Useability of public-key fingerprints)

Thu Mar 27 18:24:43 PDT 2014

On Sun, Mar 23, 2014 at 10:21 AM, Tom Ritter <tom at ritter.vg> wrote:
> About a week late, but updated:
> https://github.com/tomrittervg/crypto-usability-study
>
> Some of the larger Open Questions:
>  - Are we settled on unicorns? (This is more about how it's generated:
> http://unicornify.appspot.com/making-of)

The "vash" Daniel Thomas mentioned seemed interesting, though I
couldn't manage to build it.  I would consider that, the OpenSSH
visual art, or "Random Art" from [1] since they're explicitly designed
for fingerprints, unlike the unicorn generator.  [1] is the "seminal"
academic reference, so it might be best, if you could get the code.

>  - We have two participants speaking fingerprints aloud to each other.
> Do we want them to do it over a cell phone to add difficulty, or just
> omit that bit?

I think speaking over a phone is a good case to test, because in
person you could also use QR codes, or just look at each other's
screens.  A landline might provide more consistent voice quality than
a cell phone.

>  - We're settled on not trying to do a head-fake?

Seems right to me.

Other:

 * I think a printed business card may not work well with
high-resolution visual fingerprints, so maybe doing it on a small
phone screen is better?

 * When comparing aloud, I would suggest also having a time limit
designed to provoke a fairly high error rate, so the same methodology
is applied to all modes of use.  Otherwise, there are two variables in
the read-aloud case (time, error rate), so not as easy to compare
different formats.  Having the tester record "how many times the
participants asks the other to repeat the last token, slow down, or
otherwise change how they're reciting it" seems subjective and
unnecessary.

 * Should there be a handwritten test, i.e. one user handwrites the
fingerprint for the other?

 * Regarding the "computationally chosen flaws" - I think you should
randomize where the error goes, otherwise users will figure out it's
always in the middle/inner chars, and game the test.

 * Might be interesting to discuss all this at the EFF CUPS workshop
[2], or one of the other SOUPS workshops in July.  Submission deadlines
are May 15.
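
To make the point about randomizing flaw placement concrete, here is a
rough sketch of one way to do it.  It assumes a hex-encoded fingerprint
grouped with spaces and a single-character substitution; the function
name inject_flaw and that flaw model are illustrative assumptions, not
what the study's generator actually does.

```python
import secrets
from typing import Tuple

# Assumed alphabet: lowercase hex. Separators such as spaces are never altered.
HEX = "0123456789abcdef"


def inject_flaw(fingerprint: str, alphabet: str = HEX) -> Tuple[str, int]:
    """Change exactly one character at a uniformly random position.

    Hypothetical helper: returns the flawed fingerprint and the index
    that was changed, so the tester can log where the flaw landed.
    """
    # Only characters from the alphabet are candidates for substitution.
    positions = [i for i, c in enumerate(fingerprint) if c in alphabet]
    if not positions:
        raise ValueError("fingerprint has no characters from the alphabet")

    # Every candidate position is equally likely, including the first
    # and last characters, so there is no "safe" region to skip.
    pos = secrets.choice(positions)
    original = fingerprint[pos]

    # Pick a replacement that is guaranteed to differ from the original.
    replacement = secrets.choice([c for c in alphabet if c != original])
    return fingerprint[:pos] + replacement + fingerprint[pos + 1:], pos


if __name__ == "__main__":
    flawed, where = inject_flaw("a1b2 c3d4 e5f6 0718 293a 4b5c 6d7e 8f90")
    print(flawed, where)
```

Logging the returned index would also let you check afterwards whether
detection rates differ by position.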

Trevor

[1] https://sparrow.ece.cmu.edu/group/pub/old-pubs/validation.pdf
[2] https://cups.cs.cmu.edu/soups/2014/workshops/effcup.html