[noise] New branch: hkdf

Trevor Perrin trevp at trevp.net
Sat Oct 10 12:21:09 PDT 2015


More on the GETKEY-based key derivation (n0 branch) versus the HKDF
version (hkdf branch):

 * Simplicity:
   * HKDF adds a new variable (ck).
   * The n0 branch conditionalizes the n=0 case, and requires
implementing MixKey() using a pair of functions (GETKEY and HMAC)
instead of a single function (HKDF), so complexity of all this versus
the extra variable is probably a wash.
   * The descriptions of how to optimize GETKEY() by calling the
cipher directly are ugly; it's also unintuitive to call an encryption
function as part of the key derivation.

 * Efficiency:  Looking at the XX handshake, n0 would take 3 HMAC and
4 GETKEY, HKDF would take 12 HMAC.  So n0 is potentially faster, but
some (most?) AES-GCM implementations will build lookup tables for AES
and GCM for each new key.  In that case n0 is probably slower.

But assuming well-optimized code:  If the HKDF version is, say, 8-9
HMACs slower, then it's hashing an extra ~2000 bytes (with SHA256).
To put very rough numbers on it, that might be 20K Haswell cycles [1].
Compared to 3 ECDHs + 1 keygen at ~500K cycles [2], this is < 5% speed
difference for the handshake.

The main reason HKDF is less efficient is that after HMAC'ing the next
DH, it performs another HMAC to derive ck from k, instead of using k
directly.  But this extra derivation has some minor advantages:

 * Flexibility:   Deriving k from ck lets their sizes differ - for
example, the HKDF version could be easily be adapted to HMAC-SHA512
and 64-byte ck, or AES-128 and 16-byte k.

 * Resilience in case k is leaked:  Even if the value of k was somehow
leaked (side-channels? cryptanalysis?) that wouldn't directly reveal
the chaining variable ck, with HKDF.  (But this is a very minor
argument, because if the cipher key is being leaked, then transport
messages encrypted with the cipher probably won't be secure either.)

 * Modularity with AEAD algorithm:  The GETKEY design requires the
first 32 bytes of encrypting zeros to be a PRF from the cipher key -
but there's no guarantee that a general AEAD will satisfy this, so we
have to add extra stipulations (the auth tag must come at the end, the
AEAD has to be deterministic so it doesn't add an IV at the beginning,
and it must not expand the ciphertext beyond the auth tag, e.g. add
zero-padding at the beginning or something).  While these stipulations
are probably sufficient (?), and probably achieved by most
deterministic AEAD modes, this means extra care in choosing symmetric
crypto parameters: n0 can't be used with *any* arbitrary AEAD mode
because it wants to break the abstraction and peek into the
ciphertext.

 * Random Oracle security analysis: If the hash function is modelled
as a random oracle, there's no significant difference, I think.  The
GETKEY version requires an additional assumption that GETKEY is a PRF,
so the analysis is a tiny bit more complicated.

 * Familiarity:  HKDF is familiar from IPsec, QUIC, TLS 1.3, and
Hugo's papers and RFC.  Using chained HKDFs in this way is also
familiar from TextSecure's "Axolotl" ratcheting.

---

In sum: I think the HKDF design is a little easier to explain,
analyze, implement, and justify.  It's a few percent slower when
well-optimized, but likely faster in naive implementations when using
ciphers with low key agility (e.g. AES-GCM).

HKDF also provides a more modular design where the cipher keys and
HKDF keys are distinct, which gives more flexibility to adapt the
design to different key sizes, or different AEAD algorithms.

Based on all this, I favor the HKDF design at the moment.  I'll
probably merge it to master in a few days if there aren't any
compelling counter-arguments.


Trevor


[1]
https://www-ssl.intel.com/content/dam/www/public/us/en/documents/white-papers/haswell-cryptographic-performance-paper.pdf

[2]
https://docs.google.com/spreadsheets/d/1SO3NGX-EgIZ1slw9uExb5FoeFy5TVkuA2lEutP6roYI/edit#gid=0


More information about the Noise mailing list