[curves] Second day NIST workshop notes

Michael Hamburg mike at shiftleft.org
Fri Jun 12 12:08:08 PDT 2015

Elliptic Curves: a Hardware Perspective

EC widely used in hardware.  Lots of smart cards, passports, banking cards, DTCP.  Software folks think hardware is the same as software, but it isn’t.  Size and maintainability are king.  Many sorts of side-channel and fault attacks.

Would be nice if new curves support a=-3.  Would be even nicer if prime order.  Would be nice if sqrt(b) doesn’t exist.  Unfortunately with curve25519, sqrt(b) does exist in short Weierstrass form and a=-3 not possible.

Special primes not ideal.  Doesn’t gain anything in a typical hardware impl because they just use a generic multiplier.  Also need more blinding for scalars.

DJB flaks talk because some CHES papers recommend Montgomery curves for hardware.

Efficient side-channel attacks on scalar blinding on elliptic curves with special structure

Manfred Lochter presenting in place of authors.  Main topic is side-channel attacks on special prime fields.  Basically blinding scalars by adding r*q requires large r, because the top bits of q have a special form (by Hasse-Weil).  This paper demonstrates how to attack small r with a side channel that reveals (with high-ish probability, eg 75%) the bits of the blinded scalar.

The attack is amplified if the implementation doesn't perform twist checks and will give the point output for a twisted input.  Running the algorithm several times on the same twisted point gives (x+rq)P for several r, but P doesn’t have order q.  This allows the difference of the r’s to be recovered which amplifies the attack, allowing error rates of 40-something%.

DJB claims in a question that sufficient blinding is the number of consecutive 0’s or 1’s in r.  (Or presumably some minimum value for entropy reasons.)

The fastest Curve25519 implementation ever

Turns out that on Sandy Bridge, curve25519 using the vectorized multiplier is something like 20% faster than the scalar one for curve25519.  The reason is that the MUL instruction takes 2 uops, but VPMULUDQ takes only one and thus parallelizes with VPADDQ.  In principle, serial multiplication could still be faster, but in practice serial multiplication doesn’t pipeline very well.  So doing two multiplies at the same time in the vector unit can be faster than in the scalar unit.

Efficient and Secure Elliptic Curve Cryptography Implementation of curve P-256 

Goal: secure BGP.  Use ECDSA for the signing.  Throw lots of cores at it.  Have one core computing k,k^-1 (in constant time) with FLT.  Another core computes R=kG.  Use precomputed signature coupons for low latency.  Some description of low-level impl, suggesting Broadwell ADCX/ADOX.  Mont reduction after multiply.  Claims significantly faster perf (on Haswell) than Gueron+Krasnov.  Not entirely clear why this is… maybe bigger precomputed tables? (40kB/60kB).

An Analysis of High-Performance Primes at High-Security Levels

Another defense of the NUMS curves.  Main point is that performance is platform-dependent.  Eg, the SUPERCOP submitted Goldilocks does not perform well on AMD.  Anyway the difference is small enough (a couple 10s of %) that MS prefers their “rigid” generation method.


My talk.  Overview of Goldilocks, why the funky prime, why it’s fast, implementation stressing usability and cofactor removal.

Curve41417: fast, highly secure and implementation-friendly curve

Best speed at acceptable security: Curve25519 or maybe a Kummer curve (or similar).  What about best security at an acceptable speed?  Claim: curve41417.  Went over how it’s implemented.

FourQ: Four-dimensional decompositions on a Q-curve

Want to apply all existing tricks to get a fast curve without compromising security.  Uses GF((p^127-1)^2).  Cofactor is 392 = 2^3 * 7^2.  Complete formulas.  Discriminant -40.  Two endomorphisms: CM and Frobenius (I think?) gives 4-dimensional decomposition.  Roughly 122-bit generic security.  Claim is that unlike GLV and GLS, the endomorphisms don’t help rho, because they’re slightly slower.  Extension field is only quadratic so no Weil descent.  Recoding in constant time.

CFRG report

CFRG discussion was long and heated.  Many desirable options guarantee that someone will go home sad.  Decided on a curve generation technique mod a prime.  Decided on Curve25519, Ed448-Goldilocks.  Haven’t decided on a signature scheme yet.

Comments: Ben Black says that CFRG failed at its charge.  Another commenter said that CFRG’s timeline to a decision was too fast.

Panel: Standardization efforts

Didn’t take complete notes on this.  Farrell: don’t want to make IPR worse.  Also if NIST standardizes new curves quickly, CFRG will probably drop their recommendations.

Farrell and Struik: curves should support both signing and DH.

Won’t throw away old curves, at least not ones anyone uses.

Farrell: maybe if a competition is to be run, it’s better for cryptographers to contribute but not cryptographers (eg, NIST would be OK) to run it.

NIST discussion:

DJB: Rumor has it that the original NIST curve seeds are hashes of English sentences.  Perhaps they could run a preimage search?

Could regenerate NIST curves with trustworthy seeds.  Wouldn’t solve all the other issues with the old curves, where Edwards curves are better.

Could use new curves with old algorithms.

Should have an accountable body making critical crypto/business decisions and not CFRG.

Rene: Perhaps curves doc should be separated from sigs / other mode of operations doc.

— Mike

More information about the Curves mailing list