[noise] Lightweight ciphers and Noise

Rhys Weatherley rhys.weatherley at gmail.com
Tue Nov 21 18:56:53 PST 2017


On Wed, Nov 22, 2017 at 12:10 PM, Trevor Perrin <trevp at trevp.net> wrote:

> > Sort of related, I've been doing some research and implementation
> > testing on light-weight ciphers for IoT environments as part of my
> > Arduino crypto library with an eventual goal of supporting Noise on
> > extremely CPU and memory-constrained devices.  In particular I've been
> > looking into the Speck and SKINNY block ciphers.  I will post my
> > findings in a few days after collecting some statistics.
>
> Interesting,  I think these are mostly designed for low security and
> small data volumes, but looks like they do have 256-bit key and
> 128-bit block variants.
>
> Once you've chosen these variants to be comparable to our existing
> symmetric crypto - and added authentication/MAC to create an "AEAD" -
> I wonder how performance compares to our existing crypto?
>

The raw figures for the algorithms on Arduino (AVR and ARM) can be found at
[1] and [2].

For raw performance in software, SPECK-128 rules all.  But that
algorithm was designed by the NSA so standards bodies have been hesitant to
adopt it.  Its companion algorithm SIMON is designed for hardware
implementations so I haven't bothered measuring its performance in software
yet.

SPECK with a 256-bit key and EAX mode for AEAD operation is faster than
ChaChaPoly but uses more RAM for permanent state.

There is an alternative way to implement encrypt-only SPECK that only needs
32 bytes of key schedule state - I call it SpeckTiny.  The full key
schedule is expanded on the fly during the encryption rounds.  Permanent
state for EAX<SpeckTiny> is about half of ChaChaPoly but CPU performance is
slower.
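
A minimal sketch of the on-the-fly key schedule idea, written in Python for
readability rather than as AVR code.  It is not the SpeckTiny implementation
from the library at [1]; the function names are made up for illustration, and
the round function and constants are those of Speck128 with a 256-bit key as
far as I know them (34 rounds, rotations by 8 and 3).  It should be checked
against the published test vectors before any real use.

```python
MASK = (1 << 64) - 1   # Speck128 operates on 64-bit words
ROUNDS = 34            # assumed round count for Speck128 with a 256-bit key


def ror(x, r):
    return ((x >> r) | (x << (64 - r))) & MASK


def rol(x, r):
    return ((x << r) | (x >> (64 - r))) & MASK


def round_fn(x, y, k):
    """One Speck round; the key schedule reuses it with the round index as k."""
    x = ((ror(x, 8) + y) & MASK) ^ k
    y = rol(y, 3) ^ x
    return x, y


def encrypt_expanded(key_words, x, y):
    """Reference version: precompute all 34 round keys (34 * 8 = 272 bytes)."""
    k0, l0, l1, l2 = key_words
    l = [l0, l1, l2]
    round_keys = [k0]
    for i in range(ROUNDS - 1):
        new_l, new_k = round_fn(l[i], round_keys[i], i)
        l.append(new_l)
        round_keys.append(new_k)
    for k in round_keys:
        x, y = round_fn(x, y, k)
    return x, y


def encrypt_tiny(key_words, x, y):
    """On-the-fly version: only the 4 key words (32 bytes) are kept."""
    k, l0, l1, l2 = key_words
    l = [l0, l1, l2]
    for i in range(ROUNDS):
        x, y = round_fn(x, y, k)
        if i < ROUNDS - 1:
            # Advance the schedule in place; the consumed word's slot is reused.
            l[i % 3], k = round_fn(l[i % 3], k, i)
    return x, y


# Hypothetical key and block values, given as (k0, l0, l1, l2) and (x, y).
key = (0x0706050403020100, 0x0F0E0D0C0B0A0908,
       0x1716151413121110, 0x1F1E1D1C1B1A1918)
block = (0x65736F6874206E49, 0x202E72656E6F6F70)
same = encrypt_tiny(key, *block) == encrypt_expanded(key, *block)
```

The trade-off is the one described above: the expanded version stores 272
bytes of round keys per key, while the on-the-fly version redoes the schedule
work for every block it encrypts.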

I've spent the last 2-3 weeks investigating the SKINNY family of block
ciphers [2].  As they were designed for hardware implementation, the
software versions are not as good as I was hoping.  To get comparable
performance to ChaCha20 or Speck requires moving to a 64-bit block size and
a 128-bit key.  Memory requirements are higher, but it should be possible
to implement "SkinnyTiny" with the key schedule expanded on the fly like
for SpeckTiny.  I haven't tested that yet.

SKINNY does have the advantage of being a tweakable block cipher - the
packet number n from CipherState would naturally turn into a tweak when
building an AEAD mode around SKINNY.

If your device is RAM-constrained, EAX<SpeckTiny> would be hard to beat.
If your device is RAM-rich but CPU-poor, then ChaChaPoly would work better.

We may want to have a separate discussion as to when it is acceptable to
use 64-bit block ciphers with Noise.  A lot of the research in lightweight
crypto is focused on that size block.  Since data volumes on small devices
aren't high, maybe 64-bit would be OK?  Converting a 256-bit key from the
Noise handshake into a 128-bit key is easy - XOR the two halves together.
EAX authentication tags would be 8 bytes in size rather than 16.  The rest
of Noise would remain identical.
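
As a rough illustration of the two points above, the sketch below folds a
256-bit key into 128 bits by XORing its halves, and estimates the data volume
at which a block cipher reaches its birthday bound (about 2^(n/2) blocks for
an n-bit block).  The function names are hypothetical, and the birthday
figure is only a heuristic upper limit; practical rekeying limits are
normally set well below it.

```python
def fold_key(key256: bytes) -> bytes:
    """XOR the two 16-byte halves of a 32-byte key into a 16-byte key."""
    if len(key256) != 32:
        raise ValueError("expected a 32-byte key")
    return bytes(a ^ b for a, b in zip(key256[:16], key256[16:]))


def birthday_limit_bytes(block_bits: int) -> int:
    """Bytes processed by the time about 2^(n/2) blocks have been encrypted."""
    blocks = 1 << (block_bits // 2)
    return blocks * (block_bits // 8)


short_key = fold_key(bytes(range(32)))
limit_64 = birthday_limit_bytes(64)
limit_128 = birthday_limit_bytes(128)
```

For a 64-bit block this comes to 2^35 bytes (32 GiB) under a single key,
compared with 2^68 bytes for a 128-bit block, which is the gap the question
about acceptable data volumes turns on.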

> Also - if code size / area is a concern, does something like STROBE /
> Disco start to become the better strategy, to eliminate the separate
> hash function?
>

In the low-end space, flash memory (code size) is usually pretty cheap, but
RAM (runtime state and stack) is not.  So I usually don't care about the
code size in my comparisons - the size of the crypto algorithms will be
small compared to whatever task the application upstairs is performing.

SHA3 is pretty RAM hungry - 400 bytes of permanent state and another 400
bytes of stack when the core block operation is evaluated.  Performance in
software implementations, even my ridiculously optimised assembly version
for AVR, is also not pretty.  I must admit though that I haven't studied
Strobe/Disco enough to make a fair comparison yet.

Stack space isn't that big of a deal - from my experiments Curve25519 needs
about 1k of stack space to evaluate the curve.  Once you have enough RAM to
pay that cost, the stack costs of the other algorithms don't matter much -
they can reuse the 1k that is free when Curve25519 finishes.  The hash
algorithms in Noise operate on the stack - there's no permanent state other
than ck and h, so the hash contexts can be stacked when needed and tossed
afterwards.
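
The stack-reuse argument can be put as simple arithmetic.  The sketch below
uses only the approximate figures quoted in this message (about 1k of stack
for Curve25519, and 400 + 400 bytes for a SHA3 context plus its block
operation), and assumes the two phases run one after the other rather than
nested.  The names are illustrative only.

```python
def peak_stack(phases: dict) -> int:
    """Peak stack use when phases run in sequence and release their stack."""
    return max(phases.values())


def summed_stack(phases: dict) -> int:
    """What the budget would be if every phase needed its own reserved RAM."""
    return sum(phases.values())


# Approximate byte counts taken from the figures quoted above.
phases = {
    "curve25519": 1024,          # "about 1k" of stack
    "sha3_on_stack": 400 + 400,  # hash context plus block-operation stack
}

peak = peak_stack(phases)
total = summed_stack(phases)
```

With these numbers the peak stays at the Curve25519 figure; the hash only
adds to the budget if it has to be live while the curve is being evaluated.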


> (These are also young algorithms, which is a reason to be cautious).
>

Yes.

Cheers,

Rhys.

[1] http://rweather.github.io/arduinolibs/crypto.html
[2] https://rweather.github.io/skinny-c/skinny_arduino.html