<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Jan 30, 2014 at 12:04 PM, Robert Ransom <span dir="ltr">&lt;<a href="mailto:rransom.8774@gmail.com" target="_blank">rransom.8774@gmail.com</a>&gt;</span> wrote:<br> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">On 1/30/14, Paulo S. L. M. Barreto &lt;<a href="mailto:pbarreto@larc.usp.br">pbarreto@larc.usp.br</a>&gt; wrote:<br> > Hello to all,<br> ><br> > On Wed Jan 29 18:07:34 PST 2014, Robert Ransom &lt;<a href="mailto:rransom.8774@gmail.com">rransom.8774@gmail.com</a>&gt;<br> > wrote:<br> ><br> >> My main problem with the ‘Brazil’ curves is that all of them except<br> >> M-221 (even the E-* curves) have really *ugly* coordinate fields.<br> >> They make the NSA fields look nice by comparison (and at least those<br> >> would have the advantage of requiring less extra hardware within a<br> >> TPM, as someone mentioned on one of the IETF lists<br> ><br> > I'm just now coming back from vacations so this message is very brief, but<br> > you<br> > got my attention. Could you please state your 'beauty' metric<br> > quantitatively?<br> <br> </div>Curve operations modulo a number of the form 2^b - k are easy to<br> implement efficiently if there is a regular arrangement of limb sizes<br> small enough that an operation of the form (a+b)*(c+d) can be<br> performed without carrying the inputs to the multiplication, and with<br> reduction modulo x^l - k performed during the multiplication, for both<br> signed 64-bit and unsigned 128-bit multiplication-result lengths.<br></blockquote><div><br></div><div>Dear Robert,</div><div><br></div><div>Your comments are very relevant, but let me justify some of our design choices. We picked field sizes similar to those of the NIST curves, trying to provide something closer to drop-in replacements.
Additionally, we considered not only vector or hardware implementations, but also the fast integer multipliers already available to software implementations on many platforms. Of course, exploiting these could require specialized assembly-language multiplication routines for optimal performance. You can find some brief notes below.</div> <div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">* Curve3617 coordinates (mod 2^414 - 17) can be represented in four<br> reasonable ways: a uniform sequence of 9 46-bit or 18 23-bit limbs<br> (trivial to implement), two repetitions of (52,52,52,51)-bit limbs<br> (faster than the 9-limb representation), or two repetitions of<br> (26,26,26,26,26,26,26,25)-bit limbs (faster than the 18-limb<br> representation, and still (just barely) safe). Carries are<br> vectorizable for all of these representations.<br> <br> * E-382 coordinates (mod 2^382 - 105) have a vectorizable 16-limb<br> representation for 32-bit processors (same shape as that for<br> Curve3617, with two fewer bits per limb), but that's as long as for<br> Curve3617, and to obtain a lower security level. Because 105 is so<br> large, even a 15-limb representation requires that reduction modulo<br> x^15 - 105 be separated from multiplication, and carries do not<br> appear to be easily vectorizable.<br></blockquote><div><br></div><div>The difference in security levels is rather small, only ~16 bits, which is less significant at the 192-bit level. However, conventional non-vectorized implementations would be penalized by the extra limb in Curve3617.
If reduction gets too expensive, we have more room for lazy reduction.</div> <div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"> * M-383 coordinates (mod 2^383 - 187) are worse than E-382: no obvious<br> way to vectorize carries for the 16-limb representation, and 187 is<br> even bigger than 105.<br></blockquote><div><br></div><div>Similar to the above, but with less room for lazy reduction.</div><div><br></div><div>Best,</div><div>--<br>Diego de Freitas Aranha<br>Department of Computer Science - University of Brasília<br> <a href="http://www.cic.unb.br/~dfaranha" target="_blank">http://www.cic.unb.br/~dfaranha</a></div></div></div></div>
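<div><br></div><div>For readers following the limb discussion, here is a minimal Python sketch of the uniform 9 x 46-bit representation quoted above for 2^414 - 17, with the reduction modulo x^9 - 17 folded into the multiplication. It is illustrative only: the names (to_limbs, from_limbs, carry, mul) are hypothetical, Python integers stand in for the 128-bit accumulators a C implementation would use, and no attempt is made at constant-time or optimized code.</div>
<pre>
```python
# Sketch only: uniform 9 x 46-bit limbs for p = 2^414 - 17.
# All names here are hypothetical and not taken from any library.
LIMBS = 9
BITS = 46
RADIX = 2**BITS
K = 17
P = 2**(LIMBS * BITS) - K


def to_limbs(n):
    """Split n (reduced mod P) into 9 limbs of 46 bits, least significant first."""
    n %= P
    return [(n // RADIX**i) % RADIX for i in range(LIMBS)]


def from_limbs(limbs):
    """Recombine limbs (possibly unnormalized) into an integer mod P."""
    return sum(v * RADIX**i for i, v in enumerate(limbs)) % P


def carry(limbs):
    """Propagate carries until every limb fits in 46 bits again.

    The carry out of the top limb has weight 2^414, which is congruent
    to 17 mod P, so it wraps around to the bottom limb multiplied by 17.
    The result is normalized but not necessarily the canonical residue.
    """
    out = list(limbs)
    while any(v // RADIX for v in out):
        for i in range(LIMBS):
            c = out[i] // RADIX
            out[i] %= RADIX
            if i + 1 == LIMBS:
                out[0] += K * c
            else:
                out[i + 1] += c
    return out


def mul(a, b):
    """Schoolbook product with reduction modulo x^9 - 17 done on the fly.

    With 46-bit inputs each accumulator stays below 9 * 17 * 2^92,
    which is under 2^100 and would fit an unsigned 128-bit result.
    """
    acc = [0] * LIMBS
    for i in range(LIMBS):
        for j in range(LIMBS):
            term = a[i] * b[j]
            if i + j >= LIMBS:
                # x^(i+j) = x^9 * x^(i+j-9), and x^9 is congruent to 17
                acc[i + j - LIMBS] += K * term
            else:
                acc[i + j] += term
    return carry(acc)
```
</pre>
<div>For example, from_limbs(mul(to_limbs(x), to_limbs(y))) should agree with (x * y) % P for any integers x and y. A larger constant such as 105 or 187 inflates the same accumulator bound, which is the concern raised for E-382 and M-383.</div>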