[noise] Stateful Hash Object Proposal

Fri Nov 23 23:25:41 PST 2018

Hey Peter,

Glad you're here! I wanted your feedback; see below:

On Thu, Nov 22, 2018 at 11:03 AM Peter Schwabe <peter at cryptojedi.org> wrote:
>
> Some post-quantum KEMs need multiple hash functions, i.e., they would
> need such a StatefulHashObject to be domain-separated. Is the idea that
> the caller needs to first absorb the domain-separation string then?

Yes, I was thinking the PQ KEM could create multiple new SHO objects,
and Absorb() a different domain-separation string into each of them,
e.g. FrodoKEM could absorb a 16-bit customization label at the start
of each SHO.
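
A minimal sketch of that idea, assuming a toy SHO built on SHAKE128.
The SHO class, new_sho(), the label values, and the label byte order
are all hypothetical, not taken from any spec:

```python
import hashlib


class SHO:
    """Toy stateful hash object over SHAKE128 (hypothetical API)."""

    def __init__(self) -> None:
        self._h = hashlib.shake_128()

    def absorb(self, data: bytes) -> None:
        self._h.update(data)

    def squeeze(self, n: int) -> bytes:
        return self._h.digest(n)


def new_sho(label: int) -> SHO:
    """Create a SHO that has already absorbed a 16-bit customization label."""
    sho = SHO()
    # Little-endian encoding is an assumption made for this sketch.
    sho.absorb(label.to_bytes(2, "little"))
    return sho


seed = b"example seed"

# Two "different hash functions" derived from one primitive.
g = new_sho(0x0001)
g.absorb(seed)
h = new_sho(0x0002)
h.absorb(seed)

out_g = g.squeeze(32)
out_h = h.squeeze(32)
```

Both objects absorb the same seed, but the outputs differ because
each one started from a different label.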

> When using SHA-2 you'd probably want to put the domain-separation string
> into a separate block, so the callee needs to know that some input is
> not a part of the message, but a domain separation. That would require
> making domain-separators a separate argument of the first call to Absorb
> or alternatively an argument of an Init function.

If you want a longer domain-separation string (e.g. including the name
"FrodoKEM" and other parameters) then you might choose to Ratchet()
after absorbing the domain-separation string(s).  This would enable
implementers to store the precalculated chaining variable, so they
could skip re-calculating the Absorb(domain_separator)+Ratchet
operations for every PQ operation.

If that's the only reason to put the domain-separation string in a
separate hash block, then I think Ratchet() supports it adequately?
(and for SHA2, Ratchet would just zero-pad the block, like you want).
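
As a rough sketch of the precalculation, assuming Ratchet() for
SHA-256 does nothing more than zero-pad to the next 64-byte block
boundary (a real Ratchet might do more, and zero-padding alone does
not encode the separator's length).  The separator string and
function names are made up for illustration:

```python
import hashlib

BLOCK = 64  # SHA-256 block size in bytes


def precompute(domain_separator: bytes):
    """Absorb the separator, then zero-pad to a block boundary.

    The returned hash object holds the chaining variable after the
    separator block(s), so it can be copied instead of recomputed.
    """
    h = hashlib.sha256()
    h.update(domain_separator)
    h.update(b"\x00" * (-len(domain_separator) % BLOCK))
    return h


# Computed once, e.g. at startup (hypothetical separator string).
BASE = precompute(b"FrodoKEM-640-example")


def hash_message(msg: bytes) -> bytes:
    """Hash msg under the precomputed domain-separation prefix."""
    h = BASE.copy()  # reuse the stored state; the prefix is not re-hashed
    h.update(msg)
    return h.digest()
```

Each call to hash_message() starts from the copied state, so the
separator block is only compressed once.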

> I'm not sure how you want to do domain separation in the Squeeze'
> function. This wouldn't really be (input) domain separation but some
> sort of output separation?

Yes, you're right.  There would have to be some output-separator
field that gets encoded at the end of the hash input to distinguish
the normal Squeeze from the Squeeze' that happens as part of
Encrypt().

STROBE does this, using cSHAKE and adding an extra operation byte at
the end to distinguish STROBE operations like PRF from ENCRYPT.
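
To illustrate only the general idea (this is NOT STROBE's actual
encoding), here is a sketch that appends a one-byte output separator
to the end of the hash input.  The separator values and the function
name are hypothetical:

```python
import hashlib

# Hypothetical separator values, chosen only for this sketch.
SEP_SQUEEZE = b"\x00"        # the normal Squeeze
SEP_SQUEEZE_PRIME = b"\x01"  # the Squeeze' used inside Encrypt()


def squeeze_with_separator(absorbed: bytes, separator: bytes, n: int) -> bytes:
    """Produce n output bytes, with the separator as the last input byte."""
    return hashlib.shake_128(absorbed + separator).digest(n)


absorbed = b"everything absorbed so far"
normal = squeeze_with_separator(absorbed, SEP_SQUEEZE, 32)
for_encrypt = squeeze_with_separator(absorbed, SEP_SQUEEZE_PRIME, 32)
```

The same absorbed input gives unrelated outputs for the two
operations, which is the separation Squeeze' needs.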

You could also imagine a non-STROBE use of Keccak with a
non-SHA3/cSHAKE/SHAKE padding suffix, but I think the options are
dwindling.  IIRC the current NIST-allocated suffixes are:

00 = cSHAKE
01 = SHA3
11 = RawSHAKE (SHAKE itself appends 1111)

> About instantiating Squeeze for SHA-2: Wouldn't the "standard" way to
> build a XOF from SHA-2 be to use MGF1 [1]?

I think MGF1 is just:
 HASH(input || uint32(0)) ||
 HASH(input || uint32(1)) ||
 HASH(input || uint32(2)) ||
 ...
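
For reference, a short sketch of MGF1 along those lines, with the
counter encoded as a 4-byte big-endian integer.  The function name
and the default hash are my choices for the example:

```python
import hashlib


def mgf1(seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Concatenate HASH(seed || uint32(counter)) blocks, then truncate."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = seed + counter.to_bytes(4, "big")
        out += hashlib.new(hash_name, block_input).digest()
        counter += 1
    return out[:length]
```

Note that every output block re-hashes the whole seed.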

That doesn't fix SHA2's length-extension problem, and isn't very
efficient if the input is long, so for a SHA2 SHO I was suggesting
that Absorb(input) followed by Squeeze would result in:

 HASH(HASH(input) || varint(0)) ||
 HASH(HASH(input) || varint(1)) ||
 HASH(HASH(input) || varint(2)) ||
 ...

https://moderncrypto.org/mail-archive/noise/2018/001876.html

The space savings of the varint don't matter, since the HASH(input)
and counter will fit into a single hash block however the counter is
encoded, so a uint64 might be a simpler choice?
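
A sketch of that Squeeze for SHA-256, assuming the uint64 counter
variant with big-endian encoding (the byte order and the function
name are assumptions, nothing is settled here):

```python
import hashlib


def sha2_squeeze(absorbed: bytes, length: int) -> bytes:
    """Output HASH(HASH(absorbed) || uint64(i)) for i = 0, 1, 2, ..."""
    inner = hashlib.sha256(absorbed).digest()  # long input is hashed once
    out = b""
    counter = 0
    while len(out) < length:
        # 32-byte inner hash + 8-byte counter = 40 bytes, which fits in
        # one 64-byte SHA-256 block together with the padding.
        out += hashlib.sha256(inner + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]
```

Unlike MGF1, each additional output block costs a single compression
call regardless of how long the absorbed input was.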

---

Anyways, one dream here is to get PQ algorithms to adopt a SHO-type
API, so that you could use Noise with SHO/SHA256 or SHO/SHAKE128 and
the PQ algorithm would use the same hashing construct.

It would be a great exercise to work through some PQ algorithms and
see whether they could be adjusted to this API.  Are there any you'd
recommend trying this with (e.g. some algorithms that take different
or unusual strategies for domain-separation)?

Trevor