[curves] Decaf: simple ECC for groups of prime order p
mike at shiftleft.org
Thu Jan 29 15:28:41 PST 2015
I’ve made a rough cut of an API and simple library for groups of order p with cofactor removal, called Decaf. Please tell me if this looks like a useful API, what changes you would make to it, and whether you think the project is generally worthwhile. Also, I’d be curious how you think it compares to other EC APIs.
The header is at http://sourceforge.net/p/ed448goldilocks/code/ci/decaf/tree/include/decaf.h <http://sourceforge.net/p/ed448goldilocks/code/ci/decaf/tree/include/decaf.h>
The implementation is at http://sourceforge.net/p/ed448goldilocks/code/ci/decaf/tree/src/decaf.c <http://sourceforge.net/p/ed448goldilocks/code/ci/decaf/tree/src/decaf.c>
The goal is a compromise between something like TweetNaCl and a full implementation. The code should be simple enough to be easily audited. But it makes a few compromises in favor of speed over simplicity, such as w=2 instead of w=1 on scalarmul. Furthermore, the API is designed so that you can make it fast.
The API and library are not quite done yet. They need more testing, 32-bit cleanliness, and most importantly your feedback. The Decaf library implements only scalar and group operations. Another section is coming later which will include keygen, signing, verification, etc. I plan to use SHAKE as the hash function for these.
The curve in the library is currently Ed448-Goldilocks. I’m thinking that I should modify the prefixes (decaf_448_* instead of decaf_*) to allow other curves, such E-521, a hypothetical 379- or 389-bit curve, and/or the twist of Curve25519.
Overall, large operations (scalarmul) are about 1/3 to 1/2 as fast as with the Goldilocks codebase on Haswell. This is entirely due to optimizations in the Goldilocks codebase which are not in Decaf. Decaf is written entirely in C with no intrinsics or vector operations, and isn’t using a dedicated field square, dedicated point double, lazy reduction or (currently) even Karatsuba. It doesn’t use a readd form (extensible+pniels) in its scalarmul, instead using extended points everywhere. Its scalarmul uses a window of 2 instead of 5 for simplicity. However, the API is designed to allow fast implementations.
Ironically, the Decaf Montgomery scalar code is much simpler than Goldilocks’ half-assed Barrett scalar code, and faster too, so I’m planning to replace the Goldilocks implementation with the Decaf implementation.
Possible missing functions:
Decodes a scalar which may be longer (or shorter) than the default length, for RNGs or hashes.
decaf_scalar_invert: returns (0,failure) on 0.
Undo Elligator; needs extra arg to choose which possible output; may fail.
decaf_point_precompute // decaf_point_precomputed_scalarmul:
Combs. Maybe the API is allowed to not precompute if it doesn’t support it?
— Mike Hamburg
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Curves