[curves] Ed448-Goldilocks (sponges and collision resistance)

Wed Jul 16 17:46:16 PDT 2014

> Certainly not if you hash the message first.  That drops security to 128
> bits vs collision attacks.  Even if those attacks aren’t realistic, that’s
> pretty far below the design security of the system.
The use of sponge functions in keyed modes

So, as I understand it, this isn't true for sponge functions: a collision
in message hashs does not imply a collision in nonce-postfixed message
hashs.

Here's why (very informally):

(Apologies for the detail, but I figure that it might be useful to users on
the list less familiar with this area.)
Background: Merkle-Damgard

*Merkle-Damgard.* Each update step consumes a message block and outputs an
IV for the next message block.

Thus, if H(m) == H(n), then H(m || g) == H(n || g). (So a message collision
implies a collision in nonce-postfixed messages if len(m) == len(n).)

So, this is why, if I did this reordering with SHA2-512, the resulting
signature scheme would not be collision-resistant.

*Sponge functions.* Just to recall the definitions, in Python-like
pseudocode for clarity (and omitting the padding rule),

class Sponge(object):
  def __init__(permutation, rate):
    state = zeros(permutation.blocksize)
    capacity = permutation.blocksize - rate
    position = 0
  def absorb(bytes):
    i = 0
    while i < len(bytes):
      if position == (rate - 1):
         permutation(state)
         position = 0
      state[position] ^= bytes[i]
      i++; position++
  def squeeze(length):
    i = 0
    out = ''
    while i < length:
      if position == (rate - 1):
        permutation(state)
        position = 0
      i++; out += state[position]
    return out

So, note what's happening here: Each block of the message is absorbed into
at most *rate* bytes of the sponge. Every time rate bytes is filled, the
permutation is applied. When the sponge is squeezed, at least *capacity* bytes
of the sponge's state is hidden.
Sponge functionsHash collisions

Let's define a hash collision in the usual way; two messages for which the
hash output is the same value. So, here's how a sponge derives a hash:

msponge = Sponge(keccak, 200 - 64)           # shake128
msponge.absorb(message)
mhash = sponge.squeeze(64)

(And so, in this case, the resulting hash has 256-bits of collision
resistance.)

So, if what I did with the sponge was this,

csponge = challenge_sponge = Sponge(keccak, 200 - 64)
csponge.absorb(mhash)
csponge.absorb(gnonce)
challenge = sponge.squeeze(64)

a message collision *would*, just as in the MD-case, imply a challenge
collision.

(Call the colliding message mess.)
Collision-resistant signatures

But here's what's happening in the code (simplified to omit the pubkey and
DS):

sponge = Sponge(keccak, 200 - 64)  # shake256
sponge.absorb(message)
sponge.absorb(gnonce)
challenge = sponge.squeeze(64)

(The sponge retains its full 200 byte state between absorb calls.)

If

(sponge.absorb(message).absorb(gnonce)
 == sponge.absorb(mess).absorb(gnonce))

then the states must collide, i.e.,

sponge.absorb(message).state == sponge.absorb(mess).state

But the state is 200 bytes; the probability of a message that produces a
*hash* collision also producing a *state* collision is extremely small.

(There are obviously generic attacks with cost 2^512 in this case that find
a colliding squeezed output.)

So this is essentially the argument for Shake256 providing 2^512 security
strength in this mode of operation; and for Shake128 providing 2^256
security strength.

I, too, am somewhat conservative: Shake256 is fast enough for most
applications. (It is, I believe, still somewhat faster than SHA2-512.) So,
happy with that.

And I'll follow up with some citations to the security proofs that are
less-handwavy, but more mathemetical in the next couple of days.

So...does this make sense?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://moderncrypto.org/mail-archive/curves/attachments/20140716/d7b01b27/attachment.html>