[curves] X25519 and zero outputs
trevp at trevp.net
Mon May 1 08:57:22 PDT 2017
The X25519 DH function (aka "Curve25519" from 2006) handles all input
public keys without returning errors, regardless of whether the input
is fully reduced mod the field prime, on the correct curve, or
generates the correct subgroup.
In 2016 the IETF published a spec for X25519 (RFC 7748). This spec
allows protocols to abort if X25519 returns an all-zeros output (but
doesn't require this). The all-zeros output occurs when X25519
processes one of several inputs that generate a group of small order
on either the correct curve or its twist.
An effect of this RFC has been public uncertainty about how to use
X25519. Some people have claimed this check is important in all
protocols, and it's a "vulnerability" if not present.
I disagree. I also think the best practice is to use X25519 the way
it was originally designed and widely deployed over the last 10+
Below I'll discuss what this check accomplishes, and why I think it
should not be used.
But first I'll consider a recent essay by J.P. Aumasson which argues
the other side [JP]. I encourage people to read that, then the
"Should Curve25519 keys be validated?" - Critique
The essay's main argument is its title. Input validation is like
motherhood and apple pie, who doesn't support it?
However, this misrepresents the check the essay is actually proposing,
the complexity of this topic, and the painful history of DH
validation. Some context:
(1) The proposed check has the goal of blacklisting a few input
values. It's nowhere near full validation, does not match existing
standards for ECDH input validation, and is not even applied to the
(2) Confusion about DH input validation and DH semantics has been a
major problem for decades, and is responsible for many broken
protocols and implementations.
(3) X25519 was designed to avoid confusion by providing an extremely
simple API that could not be misused by omitting a check or ignoring a
return value. This simplicity is a large part of its success.
(4) The proposed check was a CFRG compromise whose main goal (AFAICT)
was to placate advocates of previous DH standards. The check received
little discussion or support in [CFRG], and remains controversial.
I'll expand on the first two points:
(1) With all the talk of "validation", the reader of JP's essay is
likely to think this check is equivalent to "full validation" (e.g.
[SP80056A]), where only valid public keys are accepted (i.e. public
keys which uniquely encode a generator of the correct subgroup). In
that case we'd have the simple model of prime-order DH from many
academic papers, where interaction between invalid public keys and
legitimate private keys is strictly prevented.
But that's a wrong and dangerous assumption. Even with the zeros
check, X25519 accepts most invalid inputs:
* The input and output points might not be on the correct curve (they
might be on the "twist").
* Valid points have equivalent "invalid" representations, due to the
cofactor, masking of the high bit, and (in a few cases) unreduced
Additionally, this check performs "input validation" only AFTER
combining the input with the private key. Resistance to
small-subgroup, invalid-curve, and side-channel attacks against the
private key is NOT provided by this "input validation", and must be
provided in the DH function itself.
A case could be made for full validation of public keys. However,
it's misleading and dangerous to confuse this zeros check with such
(2) In case anyone thinks DH/ECDH key validation is a simple and
obvious "best practice", consider the complexity:
NIST's SP800-56A (revision 2) is a 2013 spec for DH and ECDH. It
defines "full validation" as 4 checks for EC public keys, and 2 checks
for non-EC "finite-field" public keys. It defines "partial
validation" that omits the most expensive check for ECDH public keys.
It also provides complicated guidance (which has changed between
revisions) for choosing between full validation, partial validation,
and no validation in different situations (184.108.40.206.1 and 220.127.116.11.2).
Despite all this, the NIST spec doesn't discuss how to encode or
decode elliptic curve public keys (e.g. as compressed points), which
in a specification like [SEC1] adds another layer of checks.
Partial validation is allowed for ephemeral EC public keys even when
used with static private keys, since the only ECDH scheme in the NIST
spec uses "Cofactor Diffie-Hellman". But if one were to combine
partial validation with (non-cofactor) "Elliptic Curve Diffie-Hellman"
as defined in SEC1 and other standards, it could leak bits of the
Partial validation isn't specified by NIST for finite-field DH, for
some reason. Instead, "full validation" always checks that a public
key generates the proper subgroup. That's less efficient in some
cases (e.g. "safe primes" p=2q+1, i.e. with cofactor=2), but seems
conservative. What could go wrong?
Well, by relying on this check to prevent small subgroup attacks,
SP800-56A recommends finite-field DH using primes with *large*
cofactors. Many protocols use safe primes with cofactor=2, in which
case simple bounds checking can rule out zero and small subgroups, so
the more expensive full validation check isn't necessary, and isn't
RFC 5114 referenced the NIST document and proposed some large-cofactor
primes for protocols which traditionally used cofactor=2 primes. But
the RFC forgot to mention that different validation is needed! The
"Measuring small subgroup attacks" paper from February explains the
widespread security flaws this caused [SUBGROUP].
You might look at this as an example of missing input validation, and
argue that maximal input validation is always the best choice. But JP
isn't proposing that, the check he *is* proposing doesn't address this
specific issue, and the check that does address it has a performance
cost for cases like X25519 or safe primes where small-subgroup attacks
on the private key are ruled out by design.
Confusion like this has been happening for years, and is a big reason
the simple X25519 function, with no validation needed, is so popular.
Besides "input validation", the essay makes a few other points:
* It argues that "The point of Diffie-Hellman is that both key shares
should equally contribute to the shared secret". The more common view
is that the point of DH is to perform key agreement based on the
* The essay argues that "key control" is "a desirable attribute of
any authenticated key agreement protocol", citing a brief mention in
an old paper. But the modern focus is on key indistinguishability;
"key control" and "contributoriness" are rarely mentioned. (An
exception might be group key agreement, but even there these
requirements are artificial, as George Danezis has explained
* The essay argues that an added check is "costless", and "ten lines
of code tops". But by the time you write a const-time check, plumb
the return values and abort logic through layers of code, and add
testing, it is of course more. In libsodium this change touched ~80
lines [LIBSODIUM]. If your application involves multiple libraries
and platforms, that effort is multiplied. (More on other costs,
* The essay claims a "non-obvious attack" against Signal where Alice
chooses insecure keys for herself so that messages encrypted to her
use insecure keys. Of course, it's absurd to consider Alice attacking
The essay tries to dress this up by saying that Alice could "deny
being malicious, arguing that her PRNG failed". This is still absurd,
and also wrong: Curve25519 key generation uses scalar multiplication
with a private key "clamped" so that it will always produce a valid
public key, regardless of RNG behavior.
Safe DH protocols
Let's step back and consider what a DH function like X25519 should
actually do, and how protocols should use it.
I think the goal of a DH function is simple:
* A DH between honest parties who choose their key pairs at random (a
"good DH") should output a shared secret. Security is based on the
hardness of the DH problem.
* If a DH involves one honest party and one dishonest party (a "bad
DH"), then no assumptions about the output should be made, except that
it doesn't trivially break security by producing a low-entropy
function of the honest party's private key (or any other secret), e.g.
via small-subgroup or invalid-curve attack.
X25519 is very close to this ideal, with the exception that public
keys have easily-computed equivalent values. (Preventing equivalent
values would require a different and more costly check. Instead,
protocols should "bind" the exact public keys by MAC'ing them or
hashing them into the session key.)
A protocol that is secure given a DH function that meets the above
criteria, and can tolerate equivalent public keys, I'll call a "safe"
A safe DH protocol is easy to instantiate with a wide range of DH
algorithms, and in many cases non-DH key agreements (e.g. post-quantum
algorithms, and encryption like RSA).
We can review a safe DH protocol by verifying that its security only
relies on good DHs. For example, a key agreement with signed
ephemeral DH public keys is a safe protocol, as is ephemeral-static DH
for encryption to a static public key. As another example, in X3DH:
* Authentication of an honest remote party is based on a good DH
involving a local key and the remote party's identity key.
* Forward secrecy for passively-attacked sessions relies on good DHs
* Forward secrecy for actively-attacked sessions relies on a good DH
involving the signed prekey.
Unsafe DH protocols
An "unsafe" DH protocol relies on "bad DHs" (as defined above) in some
way. In other words, it performs a DH with an attacker yet still
expects the result to have some property. Asking a DH function to
have properties unrelated to the hardness of DH is asking for trouble.
I've heard of two unsafe cases where it's been suggested this check might help:
Unsafe Idea #1: Using DH outputs for channel binding
Suppose two parties execute an unauthenticated DH, then hash the DH
shared secret to get a session key. Then one of the parties tries to
authenticate by sending an authentication message which signs (or
otherwise binds) the session key, hoping to prevent the authentication
message from being used in a different context.
However, the DH is unauthenticated! There is no guarantee that it is
a "good DH". This protocol is making unjustified and unsafe
assumptions about the session key (that it is unique to this session;
and binds all relevant protocol context).
This is related to the "Triple Handshake" attack on earlier versions
of TLS, and also to Thai Duong's example [THAI]. It's well-understood
nowadays that channel binding must cover the session transcript.
Adding a zeros check instead *might* help, by preventing a MITM
attacker from synchronizing two different session keys. But it would
not prevent an attacker from tampering with other protocol context,
such as transmitted identifiers, protocol negotiation, etc.
Adding a check to a single DH function also wouldn't protect a
protocol like TLS that's used with different DH options, RSA, PSK,
SRP, postquantum, etc. Thus it's an incomplete fix, and the correct
solution is binding the transcript.
Unsafe Idea #2: Using DH public keys as secrets
Imagine a server-authenticated key agreement, as follows:
(a) The client encrypts a DH public key X=g^x to a server's long-term
RSA public key.
(b) If X decrypts successfully the server responds with a DH public
key Y=g^y, and hashes g^xy to get a session key.
The client is relying on the server's unauthenticated DH public key Y
to somehow authenticate the server's knowledge of X. Obviously, this
is making an assumption about a DH that could be bad, thus is an
This is Tor's (older) TAP circuit handshake (using regular DH, not
ECDH). The original deployment was easily attacked by a fake server
sending a public key Y = 0, 1, or -1, thus allowing the fake server to
calculate Y^x without seeing X [TAP]. This attack was patched by
rejecting these values, though the advisory's shotgun approach
"The current Tor release refuses all keys with less than 16 "0" bits
set, with less than 16 "1" bits set, with values less than 2**24, and
with values more than p - 2**24. This is a trivial piece of the
overall keyspace, and might help with next
year's weak key discoveries too."
Anyways, I think this partially motivated the idea of an X25519
zeros-check, to similarly reject inputs which would make the DH
However, consider the fragility of this check: It only works because
the DH is using a safe prime with cofactor 2. Using a prime that
results in larger cofactor, as advocated by NIST SP800-56A or RFC
5114, likely breaks it again (despite the shotgun approach).
Unsurprisingly, a protocol this clumsy has other flaws, e.g. 1024-bit
DH keys. Another one: To save space the protocol fits the first 70
bytes of X inside the RSA-OAEP encryption. The next 58 bytes of X are
outside the RSA-OAEP encryption but encrypted with an AES-CTR key
stored inside the RSA-OAEP, with no authentication on the CTR
ciphertext. This raises the possibility of an attacker tampering with
the ciphertext bits of X and exploiting server validation of X as a
"verification oracle" that leaks information about X.
Ian Goldberg managed to prove security here [GOLDBERG], but only by
replacing the shotgun validation with a precise check (and careful
analysis) for 1 < Z < p - 1 (for public key Z encoded in big-endian,
and safe prime p). This check strikes a delicate balance: It checks
Y sufficiently to prevent forgery of a (Y, Y^x) pair without knowledge
of X, but the rejected values for X are unlikely to be hit by an
attacker flipping ciphertext bits in the least-significant portion of
X. Stricter checking could easily *WEAKEN* security, e.g. the
NIST-mandated subgroup check would provide an oracle on whether a
tampered X was square or nonsquare. Simply using little-endian
encoding and a prime less close to a power of 2 could leak much of X
via the bounds-check.
In sum: If one is determined to use a DH public key as a secret, it
should be used directly (e.g. hashed into the session key), instead of
trying to add checks that turn a DH function into some sort of
quasi-hash while not leaking too much. Even better, a proper key
agreement like Ntor, Noise, or TLS should be used (Tor has moved to
Costs of a zero check
Why *not* do this check? What does it cost? I think quite a bit:
* Many X25519 implementations don't implement this check. If we
encourage protocol designs that require this check, these protocols
will be used with existing implementations, and will be broken.
* Protocols that blacklist a few special values are fragile, and
easily broken by small changes (e.g. a different DH prime, a different
curve, a different encoding, or a different key agreement).
* The bad protocol designs above are likely to be insecure even with
this check. Those protocol designs have resulted in major real-world
security flaws. The users of those designs (Tor, TLS) have abandoned
them and moved to safer techniques.
* Protocols that depend on return values are often broken if return
values are ignored.
* Protocol designers need a clear understanding of the contract with
their primitives. The requirement to "only rely on DH between honest
parties" is simple, is mandatory for reasoning about key secrecy in
any case, and produces safe protocols when it is the main assumption
used. Properties like "resistance to key control",
"contributoriness", and "DH input validation" are easy to
misunderstand, thus will lead to mistakes and broken protocols when
designers try to rely on them.
* There's significant confusion over where this check should be
implemented. According to RFC 7748 it's a caller responsibility, and
not part of the X25519 function. According to OpenSSL it's part of
X25519. In libsodium it's an undocumented feature in
"crypto_scalarmult", but it's not part of NaCl's "crypto_scalarmult",
which libsodium claims API compatibility with. This confusion makes
it likely there will be mismatches between expected and implemented
* Rigid definition of crypto primitives is important for protocol
design, testing, formal verification, modularity, and anonymity uses.
If X25519 implementations MAY do whatever input validation they feel
like, then we can expect a wide range of implementation choices (e.g.
rejecting points off the curve; or off the main subgroup; or unreduced
coordinates). This will cause more mismatches between expected and
* Additional code and logic means more opportunities for flaws (e.g.
bugs in new code; non-constant-time branching on secret bytes; program
crashes or logic bugs due to an untested error return; information
leakage due to different behavior on public keys or during validation;
So what should we do?
It's great that people care about DH protocol quality and robustness.
The main problem in this area is confusion around DH validation and DH
semantics. To improve this we should focus on clear and simple
advice, safe protocols and frameworks, and education about safe
protocol design. X25519's simple interface is a major step in this
Muddying the waters by trying to redefine such a widely-used function,
and encouraging confusing and inconsistently-implemented DH semantics,
and fragile protocol designs, would be a step backwards.
(thanks to DJB, Moxie, and Peter Schwabe for comments)
More information about the Curves