[noise] ChaCha speed update

Tony Arcieri bascule at gmail.com
Tue Jul 25 22:34:20 PDT 2017

On Tue, Jul 25, 2017 at 3:56 PM, D. J. Bernstein <djb at cr.yp.to> wrote:

> A year ago I wrote:
> > Concretely, several generations of Intel chips have run 12-round
> > ChaCha12-256 at practically the same speed as 12-round AES-192 (with a
> > similar security margin), even though AES-192 has "hardware support", a
> > smaller key, a smaller block size, and smaller data limits. For example:
> >
> >    * Both ciphers are ~1.7 cycles/byte on Westmere (introduced 2010).
> >    * Both ciphers are ~1.5 cycles/byte on Ivy Bridge (introduced 2012).
> >    * Both ciphers are ~0.8 cycles/byte on Skylake (introduced 2015).
> >
> > ChaCha20-256 is slower than ChaCha12-256 but this is entirely because it
> > has a much larger security margin. For reasons explained below, I
> > wouldn't be surprised to see ChaCha20-256 running _faster_ than AES-256
> > on future Intel chips.
>
> Romain Dolbeau has now submitted benchmarks for an Intel Skylake with
> AVX-512:
>
>    * 0.48 cycles/byte: 12-round ChaCha12-256.
>    * 0.48 cycles/byte: 12-round Salsa20/12-256.
>    * 0.69 cycles/byte: 20-round ChaCha20-256.
>    * 0.69 cycles/byte: 20-round Salsa20-256.
>    * 0.87 cycles/byte: 14-round AES-256.
>
> https://bench.cr.yp.to/results-stream.html#amd64-manny1024
>
> Thanks to Intel for giving up on AES and joining the monoculture! :-)
>
> The code is C code from Dolbeau, using _mm512_rol_epi32() etc.

I've been very curious about these numbers, as ChaCha is an algorithm that
seems to fit AVX-512 like a glove.
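
To illustrate why: the ChaCha quarter-round, as specified in RFC 7539,
consists only of 32-bit additions, XORs, and rotations by fixed distances
(16, 12, 8, 7). A vectorised implementation can run the same sequence on
sixteen 32-bit lanes of a 512-bit register, with each rotation mapping to
a single rotate instruction such as the _mm512_rol_epi32() mentioned
above. The following is a minimal scalar sketch in Python, not Dolbeau's
code; the names rotl32 and quarter_round are my own choices for
illustration.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def rotl32(x, n):
    """Rotate the 32-bit word x left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(a, b, c, d):
    """One ChaCha quarter-round on four 32-bit words (RFC 7539, section 2.1).

    Only add, xor, and fixed-distance rotate are used; there are no table
    lookups and no data-dependent branches.
    """
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d
```

A SIMD version would presumably apply these same eight steps to whole
vectors of words at once, which is why a native 32-bit vector rotate
matters so much for this cipher.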

That said, I know a lot of people who have operational problems with
AVX-heavy workloads, particularly those running mixed workloads that
combine AVX-heavy code with other CPU-heavy work.

I've also gotten a lot of reports from people of thermal throttling on
AVX-heavy workloads, although I have not personally experienced it.

While these numbers seem very impressive, I can only assume the power and
heat required to achieve them are substantially higher than for the
equivalent AES-NI workload.
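
One way to frame the concern: cycles/byte says nothing about the clock
speed the core actually sustains. The sketch below takes the two
cycles/byte figures quoted above and computes how far the clock could
drop under AVX-512 before ChaCha20 loses its wall-clock advantage over
AES-256. The clock speeds are hypothetical placeholders, not
measurements, and the sketch assumes the AES-NI code runs at the
unreduced clock. It covers throughput only, not power or heat.

```python
# cycles/byte figures quoted above (Skylake with AVX-512)
CHACHA20_CPB = 0.69
AES256_CPB = 0.87

# Hypothetical clock speeds in GHz, chosen only for illustration.
BASE_GHZ = 3.0     # assumed clock while running AES-NI code
AVX512_GHZ = 2.4   # assumed reduced clock while running AVX-512 code


def throughput_gb_per_s(cycles_per_byte, clock_ghz):
    """Single-core throughput in GB/s: (cycles/second) / (cycles/byte)."""
    return clock_ghz / cycles_per_byte


def break_even_clock_ratio(avx_cpb, other_cpb):
    """Fraction of the other code's clock that the AVX-512 code must
    sustain in order to match its wall-clock throughput."""
    return avx_cpb / other_cpb


if __name__ == "__main__":
    ratio = break_even_clock_ratio(CHACHA20_CPB, AES256_CPB)
    print(f"break-even clock ratio: {ratio:.3f}")
    print(f"ChaCha20 @ {AVX512_GHZ} GHz: "
          f"{throughput_gb_per_s(CHACHA20_CPB, AVX512_GHZ):.2f} GB/s")
    print(f"AES-256  @ {BASE_GHZ} GHz: "
          f"{throughput_gb_per_s(AES256_CPB, BASE_GHZ):.2f} GB/s")
```

With the quoted figures the break-even ratio is about 0.79, so under
these assumptions the ChaCha20 advantage would disappear if the AVX-512
code ran at less than roughly 79% of the clock the AES-NI code gets.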

All that said, I'm very curious how this sort of AVX-heavy workload would
interact with your typical messy "business logic"-heavy program as compared
to an AES-NI workload.

> Meanwhile there's a burst of papers this month from people struggling
> with the security limitations of AES:
>    https://eprint.iacr.org/2017/697
>    https://eprint.iacr.org/2017/702
>    https://eprint.iacr.org/2017/708

Unless I'm missing something, these papers are all about AES-GCM-SIV and
the security limitations that arise from GCM, not AES...

Tony Arcieri