[messaging] Let's run a usability study (was Useability of public-key fingerprints)
Tom Ritter
tom at ritter.vg
Mon Apr 7 08:48:09 PDT 2014
On 27 March 2014 21:24, Trevor Perrin <trevp at trevp.net> wrote:
>> - We have two participants speaking fingerprints aloud to each other.
>> Do we want them to do it over a cell phone to add difficulty, or just
>> omit that bit?
>
> I think speaking over a phone is a good case to test, because in
> person you could also use QR codes, or just look at each other's
> screens. A landline might provide more consistent voice quality than
> cellphone.
It's true that two tests over cellphones may get different voice
quality. I tend to think that variation between tests would be okay
in this case, because it's a real-world factor people have to deal
with... but I could go either way.
> Other:
>
> * I think a printed biz card may not work well w/high-resolution
> visual fingerprints, so maybe doing it on a small phone screen is
> better?
I actually wasn't planning on doing a printed visual fingerprint... I
suppose I could though. I don't think the phone screen would work,
because it's not terribly common to have someone's fingerprint on your
phone and then try to verify it on your desktop.
> * When comparing aloud, I would suggest also having a time limit
> designed to provoke a fairly high error rate, so the same methodology
> is applied to all modes of use. Otherwise, there's two variables in
> the read-aloud case (time, error rate), so not as easy to compare
> different formats. Having the tester record "how many times the
> participants asks the other to repeat the last token, slow down, or
> otherwise change how they're reciting it" seems subjective and
> unnecessary.
I added in the time limit. I don't think it would be subjective (it
seems pretty clear to me that if someone asks for a repeat we record
it, if someone asks for them to slow down, we record it, etc). As for
necessity, we could of course not capture that information, but I
feel like it's relevant. For example: if we get successful results
for English words with no repetitions, but successful results on
pseudowords with tons of repetitions - are English words not better?
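
To illustrate why the tally need not be subjective, here is a minimal
sketch of how a tester might log those events per trial. Everything in
it (the ReadAloudTrial class, the event names, the field names) is a
hypothetical illustration, not part of the study design.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical event categories; the study may define others.
EVENTS = {"repeat", "slow_down", "other"}


@dataclass
class ReadAloudTrial:
    """One read-aloud comparison of a fingerprint in a given format."""
    fmt: str                      # e.g. "english_words", "pseudowords"
    success: bool = False         # did the pair reach the right verdict?
    events: Counter = field(default_factory=Counter)

    def record(self, event: str) -> None:
        """Count one interruption; reject anything outside EVENTS."""
        if event not in EVENTS:
            raise ValueError(f"unknown event: {event!r}")
        self.events[event] += 1

    @property
    def interruptions(self) -> int:
        """Total number of recorded interruptions in this trial."""
        return sum(self.events.values())


trial = ReadAloudTrial("pseudowords", success=True)
trial.record("repeat")
trial.record("slow_down")
print(trial.fmt, trial.success, trial.interruptions)
```

With a log like this, two formats that both end in success can still
be compared on how many interruptions each one needed.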
> * Should there be a handwritten test, i.e. one user handwrites the
> fingerprint for the other?
I don't think it's worth the additional testing...
> * Regarding the "computationally chosen flaws" - I think you should
> randomize where the error goes, otherwise users will figure out it's
> always in the middle/inner chars, and game the test.
I want to make sure we test cases of just the middle N being
different, but I agree. I put in a 25/7% mix.
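
A rough sketch of how such a mix could be generated, assuming
hex-encoded fingerprints. The make_flawed function, its parameters,
and the 0.25 default share of middle-only cases are illustrative
assumptions, not the actual test generator or its settings.

```python
import random

HEX = "0123456789abcdef"


def make_flawed(fingerprint, n_flips=2, middle_fraction=0.25, rng=None):
    """Return (flawed_fingerprint, changed_positions).

    With probability middle_fraction the changes form a contiguous run
    in the middle; otherwise the positions are chosen at random.
    The input is lowercased so a "change" is never just a case flip.
    """
    rng = rng or random.Random()
    chars = list(fingerprint.lower())
    length = len(chars)
    if not 0 < n_flips <= length:
        raise ValueError("n_flips must be between 1 and len(fingerprint)")

    if rng.random() < middle_fraction:
        # Middle-N case: a contiguous run centred in the string.
        start = (length - n_flips) // 2
        positions = list(range(start, start + n_flips))
    else:
        # Randomized case: distinct positions anywhere in the string.
        positions = sorted(rng.sample(range(length), n_flips))

    for i in positions:
        # Pick a replacement that is guaranteed to differ.
        chars[i] = rng.choice([c for c in HEX if c != chars[i]])
    return "".join(chars), positions


original = "0123456789abcdef0123456789abcdef01234567"
flawed, where = make_flawed(original, n_flips=3)
print(flawed, where)
```

Keeping the positions alongside the flawed string makes it possible
to check afterwards whether misses cluster in the middle-only cases.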
-tom