[curves] Finalizing XEdDSA

Wed Nov 2 13:28:20 PDT 2016

Thanks for the close read -

On Tue, Nov 1, 2016 at 11:52 PM, Brian Smith <brian at briansmith.org> wrote:
> Trevor Perrin <trevp at trevp.net> wrote:
>
>>
>> (2) Replace hash_i(a || ... || Z) with hash_i(a || Z || pad || ...)
>> for reasons here [2] - mainly a bit more sidechannel resistance, and
>> slightly cleaner use of the hash.
>
>
> This seems good to me. I am not convinced that "pad" is needed, and also if
> it is needed then should it be random too?

64 bytes of randomness ought to be enough for anyone.

With Curve25519, Curve448, and SHA-512, you're right that padding
isn't totally necessary:
 * The padding is zero-length for 25519, since hash_i(a || Z) is 128
bytes (one SHA-512 block)
 * For 448, hash_i(a || Z) spills into the second SHA-512 block, so
the private key and message will be in separate blocks regardless of
padding.

But it's still a little cleaner to pad the full block in 448, and for
other hash functions or curves things might work out differently, so I
think specifying padding instead of relying on things to line up is
the safest design.

> The motivation for VXEdDSA is unclear. Is the purpose simply to give the
> verifier a proof that the EdDSA keypair really was derived from the
> X25519/X448 key, that is indistinguishable from random to a third-party
> observer? Is it also serving to provide proof-of-possession of the private
> key, or any other purpose?

Other purposes - namely to be a VRF, per references (Micali, Dodis).

You might think:  What are VRFs for?  People have proposed various
uses, hopefully I'll have more to suggest later.

> hash_0(x) is reserved for other specifications. But, if hash_0(x) is never
> used, then it is clear that Curve25519 and Curve448 are fully domain
> separated, right?

I was thinking about domain separation with respect to the same key.

With respect to separating hash functions for different types of keys,
I'm not sure that's important, but it's worth mulling over a bit more.

> (The 32nd input byte to the hash function will always be
> 0xff for Curve448 and never 0xff for Curve25519, for i > 0). Conversely, if
> something does use hash_0(x) then will the domain separation actually work?
> It seems like, instead of reserving hash_0(x) for other specifications, it
> might be better for it to be forbidden from being used?

Letting other specs use it means they have an easy way to get domain
separation, if they're also doing something with the same key.

> In this paper, should we also consider the EdDSA public key to be a secret
> worth protecting?

Where does that make a difference?  You could do const-time
verification, I suppose?

> In VXEdDSA, will the call to hash_to_point change from hash_to_point(A || M)
> to something similar to the proposed change for the first hashing in XEdDSA?
> (I guess this may depend on whether the EdDSA public key is supposed to be
> secret.)

This can't be randomized, it needs to be deterministic.

There's not a very strong argument for padding, either, since we're
not trying to make this a PRF based on A (hashing A just prevents the
possibility of different keys that are equivalent in their VRF
outputs, which isn't technically a VRF attack, but seems confusing
enough that it's worth preventing).

> In VXEdDSA, every use of a hash function is a hash_i(x). But, in XEdDSA, the
> first use of the hash function is hash_1(x) but the second use of the hash
> function is just hash(x). Why isn't it hash_2(x) instead for the second
> hashing?

Compatibility with EdDSA.

> In XEdDSA and EdDSA, near the end we calculate:
>
>     h = hash(R || A || M).
>
> But in VXEdDSA we calculate:
>
>     h = hash_4(A || V || R || Rv || M).
>
> Why doesn't VXEdDSA use this instead?:
>
>     h = hash_4(V || Rv || R || A || M)
>
> Note that that is equivalent to:
>
>     h = hash(P || R || A || M) where P = 2**b - 1 - i || V || Rv

The order doesn't matter.  I see your point about aligning with EdDSA,
but the current order somewhat matches the "order of appearance" of
the variables.  Since this is arbitrary, I'm inclined to leave it
as-is.

> Section 2.4 explains how elliptic curve points are encoded, but it isn't as
> precise as the IETF draft. In particular, it isn't clear whether the
> coordinates of P must or must not be reduced (mod p) before encoding, but
> this is important because the results of point multiplications are hashed.
> In the IETF draft for EdDSA, this is clearer because it has a precondition
> that 0 <= x, y < p. If the goal is to be aligned with EdDSA then I suggest
> defining the encoding rules by reference to the IETF draft for EdDSA.

Coordinates are defined as being mod p, but I can add a (mod p) in
there to be clear.

> It would be good to have some test vectors, where RNG outputs are given as
> inputs.

For sure, may not get to that in next round though.

Trevor