[curves] Distribution-ready optimized code

Watson Ladd watsonbladd at gmail.com
Thu Mar 19 10:03:50 PDT 2015


On Mar 18, 2015 10:54 PM, "Samuel Neves" <sneves at dei.uc.pt> wrote:
>
> Suppose you have some amazing new CPU-specific code for your favorite
> field, curve, key exchange, or whatever. How do you distribute it in a
> way that minimizes its users' effort to integrate it into their own
> applications (presumably in C or via some FFI)?
>
> As I see it, there are 4 possible approaches:
>
> 1. Distribute the assembly. This is the obvious reply, and arguably the
> best. Nevertheless, this option leaves something to be desired:
>   - ABIs / calling conventions vary between operating systems and/or
>     languages, e.g., SysV ABI vs Windows ABI. This requires either
>     preprocessor usage or some sort of trampoline (e.g.,
>     https://github.com/floodyberry/asm-opt) to adjust parameters to the
>     implemented convention.
>   - Syntaxes also vary, e.g., Intel vs AT&T x86 syntax, Plan9 assembler
>     syntax, etc. This requires either a single assembler that works with
>     all syntaxes, or distributing multiple versions of the same function.
>
> 2. Heavy preprocessor use / code generator. This is the OpenSSL approach,
> using Perl scripts to output suitable assembly for the relevant platform.
> Crypto++ does something similar, but abuses the C preprocessor for this
> instead. This approach is not too bad, but it easily makes the code
> unreadable when supporting multiple instruction sets, platforms, or other
> options, and it may require fluency in some otherwise unnecessary
> language.
>
> 3. Use compiler intrinsics. This is not always practical, since some
> instructions do not have suitable compiler intrinsics to take advantage
> of. When it is, however, it is still problematic for anything more than
> prototyping: performance depends heavily on the compiler, its version,
> and the switches used. In some cases the compiler does not even support
> the intrinsics. This is OK when the user can control these, but that is
> not always the case.
>
> 4. Use a "smart" assembler. This is an assembler that is slightly higher
> level, and acts as a middle ground between 1-2 and 3. Besides automatic
> register allocation, such tools may also easily accommodate things like
> syntax and ABI differences if necessary. Examples of what I'm thinking
> of here are qhasm (http://cr.yp.to/qhasm.html) or PeachPy
> (https://bitbucket.org/MDukhan/peachpy). I like this approach, but the
> current tools are prototypes at best, and therefore are not exactly
> suitable for distribution in their current state.
>
> So what do you guys think? Are there other options I failed to list
> here? Which do you like best?

What about a mix of 1 and 4? Distribute the asm that a tool generated.
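
A minimal sketch of the preprocessor route mentioned under option 1,
assuming GCC or Clang on x86-64. The names ASM_ABI, fe_add_stub and
add_three are hypothetical and do not come from any of the projects named
above; the C body only stands in for a routine that would really live in
an assembly file written against the SysV convention.

```c
#include <stdint.h>

/* Pin the calling convention of the assembly entry point so that callers
 * built for the Windows x64 ABI still pass arguments the SysV way.
 * sysv_abi is a GCC/Clang x86 function attribute; on other compilers or
 * targets the macro expands to nothing and the platform default applies. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define ASM_ABI __attribute__((sysv_abi))
#else
#define ASM_ABI
#endif

/* Hypothetical stand-in: in a real distribution this would be only a
 * declaration, with the body supplied by the assembly file. */
ASM_ABI uint64_t fe_add_stub(uint64_t a, uint64_t b)
{
    return a + b; /* wraps modulo 2^64 */
}

/* Ordinary C callers need no changes; the compiler adapts each call to
 * the convention named in the declaration. */
uint64_t add_three(uint64_t a, uint64_t b, uint64_t c)
{
    return fe_add_stub(fe_add_stub(a, b), c);
}
```

With something like this, one assembly file can serve both conventions, at
the cost of tying the header to compilers that understand the attribute.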
>
> Best regards,
> Samuel Neves