<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>2 persons should be enough for a pilot to work out kinks, and 30<br> people would work in an academic setting for the statistics we want.<br></div></blockquote><div><br></div><div>(a) Could you present a sketch of the experimental design? Exactly what would each participant be asked to do, step by step?</div> <div> (b) Given that design, could you give a statistical power calculation showing why N=30 is sufficient for the effect size you're hypothesizing?</div><div> </div><div>Even putting external validity concerns aside, I'm very skeptical that N=30 is anywhere close to enough.</div> <div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div> A Mech Turk study could be complementary but it doesn't really<br> replace. The idea for a scientific study would be to vary one variable<br> at a time (the fingerprint mechanism) and a mechanical turk study<br> varies potentially thousands (lighting, time of day, tiredness,<br> screen, instructions, etc). </div></blockquote><div><br></div><div>I disagree strongly. Several of the factors you mentioned (tiredness, time of day) are going to vary regardless of whether the study is run in person or online. These factors are addressed by randomly assigning participants to treatments and using statistical tests that account for the resulting variance. A larger sample size is the only real cure. 
Since these factors will vary in real life (people don't only check fingerprints at 11 AM on a 17-inch screen), you lose external validity by fixing them, even if doing so slightly reduces experimental noise.</div> <div><br></div><div>The instructions given are strictly easier to control in an online study, and you largely eliminate the potential for observer-expectancy effects.</div><div><br></div><div>Most importantly, with an online study you'll get a much more diverse participant pool, and participants will do your task in the middle of lots of other computer work in their own homes, which is far closer to reality than driving to a lab and sitting down to take part in research.</div> <div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>It could be good to do the scientific<br> study to inform a larger mechanical turk study varying more variables.<br> <br> As for so many trials, the "learning effect" is a thing, and you can<br> account for it statistically by varying the order the trials are<br> presented in per subject.<br></div></blockquote><div><br></div><div>Yes, but with N=30 you don't have nearly enough people to present every possible ordering of 10 treatments, so you will be introducing potential error. Also, this approach further weakens validity: in real life, people never "learn" how to compare fingerprints by trying 10 different ways to do it in a row. </div> </div></div></div>
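<div>For a rough sense of scale on the ordering point, here is a minimal Python sketch. The variable names are mine and purely illustrative; the only inputs are the 10 treatments and N=30 discussed above. It counts the possible presentation orders and the largest share of them that 30 participants could cover if each one saw a different order:</div>

```python
import math

N_TREATMENTS = 10    # number of treatments discussed in the thread
N_PARTICIPANTS = 30  # proposed sample size

# Number of distinct orders in which 10 treatments can be presented (10!).
orderings = math.factorial(N_TREATMENTS)

# Upper bound on coverage: assumes every participant gets a different order.
coverage = N_PARTICIPANTS / orderings

print(f"{orderings:,} possible orderings")
print(f"at most {coverage:.6%} of them covered by {N_PARTICIPANTS} participants")
```

<div>This only counts orderings; it says nothing about how much order-related error a particular counterbalancing scheme would actually leave in the data.</div>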