<div dir="ltr"><div style="font-family:arial,sans-serif;font-size:12.727272033691406px">It's a bit hackish, but a simple first pass would be to use NLTK.</div><div style="font-family:arial,sans-serif;font-size:12.727272033691406px">Here's an example gist on getting pronunciations:</div><div style="font-family:arial,sans-serif;font-size:12.727272033691406px"><a href="https://gist.github.com/ConstantineLignos/1219749" target="_blank">https://gist.github.com/ConstantineLignos/1219749</a><br> </div><div style="font-family:arial,sans-serif;font-size:12.727272033691406px"><br></div><div style="font-family:arial,sans-serif;font-size:12.727272033691406px">Two words "sound alike" if the edit distance between their pronunciations is within some specified threshold, e.g. one phone apart, or some more complicated measure.<br> <div><br></div><div>C</div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, May 26, 2014 at 11:55 AM, Michael Rogers <span dir="ltr">&lt;<a href="mailto:michael@briarproject.org" target="_blank">michael@briarproject.org</a>&gt;</span> wrote:<br> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">-----BEGIN PGP SIGNED MESSAGE-----<br> Hash: SHA256<br> <div class=""><br> On 26/05/14 01:15, Tom Ritter wrote:<br> > Third: Figure out how to approximate an attacker who can perform<br> > 2^80 calculations in the 'weird' cases. For a 32-character hex<br> > fingerprint, a 2^80 attacker can match 20 characters.<br> ><br> > Weird Case 1: An attacker matches the beginning and end parts of<br> > the fingerprint to try and trick someone doing a visual compare.<br> > Clearly, matching the beginning and ending 10 characters exactly is<br> > harder than matching any 20. but how much harder? 
Would a match of<br> > the beginning and ending 8 characters correctly characterize a 2^80<br> > attacker?<br> <br> </div>As I've mentioned before, I don't think we can make a fair comparison<br> of 'weird' attacks across fingerprint representations.<br> <br> Having said that... a 2^80 attacker can match 20 characters at chosen<br> positions. I don't know how to calculate how many characters a 2^80<br> attacker could match at unchosen positions, but it seems to me that it<br> would depend on the number of positions, i.e. the length of the<br> fingerprint.<br> <div class=""><br> > Weird Case 2: An attacker tries the match the fingerprint by<br> > pronunciation to try and trick someone doing a vocal compare.<br> > Again, matching 20 characters exactly and making the remaining 12<br> > 'sound alike' is harder than just matching 20. Would an attacker<br> > getting 28 characters to 'sound alike' and have the rest match<br> > exactly approximate a 2^80 attack?<br> <br> </div>We don't even have a metric for 'sound alike', so this question isn't<br> well-founded.<br> <br> Cheers,<br> Michael<br> -----BEGIN PGP SIGNATURE-----<br> Version: GnuPG v1.4.12 (GNU/Linux)<br> <br> iQEcBAEBCAAGBQJTgw+IAAoJEBEET9GfxSfMF08H+wWrntqdVbKp34QbtcQoGe4W<br> uCKggnCp1rJvWqcJ8V/FaOpOqvneXPL1ttl4TWn+hA1p+7tObz8R9gQDrtdqrdrH<br> 9E4tOSLrCtGpGL9p8kAGfEHIzoXi4lTZO6dLiolI6VR7KgiKjHsBA61wWpYtfVyK<br> i7vL/k7H+vi1HqnfwptRNet9gzC5bFZauSnMp+/Zc/pYd5ucQpbABBA+8vETaC7R<br> IeX1fQChREgxVD2UURclr2EqLHBSVbSxtGeKtHuENkyI8VljwKYJe3mMmnkMhsLS<br> hdnOjjKN8lYSCSh7maxWfIPSqfchC9FmOUDq+6qhhVOxaSC/QvIhTidsGRpq074=<br> =UIW+<br> -----END PGP SIGNATURE-----<br> <div class="HOEnZb"><div class="h5">_______________________________________________<br> Messaging mailing list<br> <a href="mailto:Messaging@moderncrypto.org">Messaging@moderncrypto.org</a><br> <a href="https://moderncrypto.org/mailman/listinfo/messaging" target="_blank">https://moderncrypto.org/mailman/listinfo/messaging</a><br> 
</div></div></blockquote></div><br></div>
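<div>A rough sketch of the "sound alike" idea from the reply at the top: treat each pronunciation as a sequence of phones and call two words alike when the Levenshtein distance between those sequences is within a threshold. The names <code>phone_edit_distance</code>, <code>sounds_alike</code> and <code>max_distance</code> are made up for illustration, and the small <code>PRONUNCIATIONS</code> table is hand-typed in CMUdict style so the snippet runs without any downloads. In practice the table could presumably come from <code>nltk.corpus.cmudict.dict()</code>, which maps each word to a list of phone lists.</div>

```python
def strip_stress(phones):
    """Drop CMUdict-style stress digits, e.g. 'AE1' -> 'AE'."""
    return [p.rstrip("012") for p in phones]


def phone_edit_distance(a, b):
    """Levenshtein distance between two phone sequences (lists of strings)."""
    prev = list(range(len(b) + 1))
    for i, phone_a in enumerate(a, start=1):
        cur = [i]
        for j, phone_b in enumerate(b, start=1):
            cost = 0 if phone_a == phone_b else 1
            cur.append(min(prev[j] + 1,          # delete a phone from a
                           cur[j - 1] + 1,       # insert a phone from b
                           prev[j - 1] + cost))  # substitute (or match)
        prev = cur
    return prev[-1]


# Illustrative, hand-typed entries (an assumption, not loaded from NLTK).
# Each word maps to a list of alternative pronunciations.
PRONUNCIATIONS = {
    "bat": [["B", "AE1", "T"]],
    "pat": [["P", "AE1", "T"]],
    "dog": [["D", "AO1", "G"]],
}


def sounds_alike(word1, word2, prons=PRONUNCIATIONS, max_distance=1):
    """True if any pair of pronunciations is within max_distance phones."""
    for p1 in prons[word1.lower()]:
        for p2 in prons[word2.lower()]:
            d = phone_edit_distance(strip_stress(p1), strip_stress(p2))
            if d <= max_distance:
                return True
    return False


if __name__ == "__main__":
    print(sounds_alike("bat", "pat"))
    print(sounds_alike("bat", "dog"))
```

<div>This is only the "one phone apart" baseline. A more complicated measure could weight substitutions by how confusable two phones are, which this sketch does not attempt.</div>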