[curves] Balancing reduced-radix and full-radix performance for extra-strength primes
trevp at trevp.net
Tue Jan 20 12:42:04 PST 2015
On Mon, Jan 19, 2015 at 8:09 PM, Mike Hamburg <mike at shiftleft.org> wrote:
> The point of the Goldilocks prime is that Karatsuba is a win at that size,
> and Karatsuba is a pain on full-radix. So even though Ed448 is suitably
> aligned for a full-radix implementation on 32 bits, that would take 14^2 =
> 196 muls vs 16^2 * 3/4 = 192 for Karatsuba. The extra adds for Karatsuba
> cost less than the carry handling for full-radix unless you have some sort
> of accelerator, and I don't think UMAAL is enough of an accelerator to make
> up that difference. 64-bit is similar except the coefficients aren't
> aligned. So it's almost always a win to use Karatsuba and reduced-radix.
Thanks for the analysis. It seems the crux of your argument is that
Karatsuba (and other fancy multiplications) favors reduced-radix on
many of the platforms where Longa was arguing for full-radix?
If you have time it might be educational to explain why Karatsuba
works better for reduced-radix.
> This analysis doesn't support Longa's point, since it shows cases where
> reduced radix is very fast, even faster than full-radix on a
> comparable-sized prime on a machine where that's fast.
But if a 448-ish prime was chosen to take into account full-radix
(maybe not as close to a power of 2 as Goldilocks?), that might change
Even if Goldilocks has similar performance to other 448-bit primes on
platforms that favor full-radix, that's a weakening of the Goldilocks
argument that it's getting "extra" security margin over 384 very
cheaply: it would be getting this "extra" margin, on these platforms,
at full price.
More information about the Curves