[messaging] Let's run a usability study (was Useability of public-key fingerprints)

Daniel Kahn Gillmor dkg at fifthhorseman.net
Mon Mar 10 09:36:34 PDT 2014


On 03/10/2014 04:14 AM, Tom Ritter wrote:
> As promised, here's a first-pass at a proposal:
> https://github.com/tomrittervg/crypto-usability-study

Thanks for the starting push, Tom.  It's good to have something concrete
to consider.

I'm concerned that as outlined, there are a few problems:

 0) "reading aloud" comes from a video to keep a constant rate

This does not match my experience of how fingerprints are compared when
read aloud.  In practice, even if there are multiple listeners, there is
often pushback from the listeners, asking for a pair of octets to be
repeated, or for the reader to slow down.  A video played at constant
speed (which the user cannot control) doesn't feel at all like the
same interaction, let alone representative of the same cognitive
challenge.  I'm not sure how to solve this.  Maybe we just focus on the
business card approach?

 1) error rate for computationally-chosen flaw

The spec currently suggests we'd run the ssh fingerprint look-alike
tool.  Assuming this tool works, it seems clear that this would bias the
results against the hex fingerprint, and toward the pseudoword or
English word outcome, since the look-alike tool currently embeds some
knowledge about sensitive cognitive comparisons in the hexadecimal
output space (e.g. that it is a "better match" to flip two bits in the
first or second nybble of an octet than to flip one bit in each nybble).
Maybe we could craft a look-alike tool for each of the different
mechanisms, encoding whatever domain-specific knowledge we have?
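
To make the nybble example concrete, here is a rough sketch.  It is not
the look-alike tool itself: the function names and the sample octets are
made up for illustration, and it assumes that what a reader notices in a
hex fingerprint is roughly the number of hex characters that differ.
Under that assumption, two bit flips confined to one nybble change one
character, while one flip in each nybble changes two.

```python
def bit_flips(a: str, b: str) -> int:
    """Count differing bits between two equal-length plain hex strings."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def visible_diffs(a: str, b: str) -> int:
    """Count differing hex characters (assumed proxy for what a reader sees)."""
    if len(a) != len(b):
        raise ValueError("hex strings must have the same length")
    return sum(x != y for x, y in zip(a.lower(), b.lower()))


# Hypothetical octets, chosen only to illustrate the two cases.
original = "a7"       # 1010 0111
same_nybble = "a4"    # 1010 0100: two bits flipped in the second nybble
split_nybbles = "b6"  # 1011 0110: one bit flipped in each nybble

for candidate in (same_nybble, split_nybbles):
    print(candidate, bit_flips(original, candidate),
          visible_diffs(original, candidate))
```

Both candidates are two bit flips away from the original, but the first
changes one hex character and the second changes two.  A pseudoword or
English word encoding would need its own notion of "visible difference".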

 2) prevalence

Are we planning on showing the users three fingerprints: one which
matches exactly, one with a subtle flaw, and one with the
computationally chosen flaw?  This would result in a mismatch prevalence
of 67% -- far, far higher than the prevalence of actual fingerprint
mismatches in most daily use.  A known baseline prevalence rate seems
likely to affect the way most users think about these matches.
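
The arithmetic behind the 67% figure, and how many extra matching trials
it would take to bring the prevalence down, can be sketched as follows.
The helper names are hypothetical, and the 5% target is an arbitrary
example, not a measured real-world mismatch rate.

```python
import math
from fractions import Fraction


def mismatch_prevalence(matching: int, mismatching: int) -> Fraction:
    """Fraction of trials in which the fingerprints do not match."""
    return Fraction(mismatching, matching + mismatching)


def matches_needed(mismatching: int, target: Fraction) -> int:
    """Smallest number of matching trials that keeps prevalence <= target."""
    # mismatching / (matching + mismatching) <= target
    # rearranges to matching >= mismatching * (1 - target) / target
    return math.ceil(mismatching * (1 - target) / target)


# One exact match plus two flawed fingerprints: 2 of 3 trials mismatch.
print(float(mismatch_prevalence(1, 2)))

# Matching trials needed alongside the two flawed ones for a 5% prevalence.
print(matches_needed(2, Fraction(1, 20)))
```

With two mismatching trials per participant, a 5% prevalence needs 38
matching trials, which suggests the study length and the prevalence
would have to be traded off against each other.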

I'm trying to think about what the user experience of the study would be
-- do they see the fingerprints to match in rapid succession?  Are they
embedded in some other task?  (I think the other task would be the "head
fake" you describe; I'm not sure what that task would be, though.)

	--dkg
