Lightweight Cryptography Primitives
Performance on 32-bit platforms

Note
The 10 finalists of the NIST lightweight cryptography competition were announced in March 2021. I have forked this repository to create a new repository for the finalists and further improvements. This repository is now archived. New performance figures for the finalists can be found here.

NIST set a cut-off of 18 September 2020 for status updates from the Round 2 candidate submission teams, leading up to the selection of Round 2 finalists in December 2020. All of my implementations prior to that date were in C.

Since that date, some newer implementations have been contributed by others and written by myself. The performance of the newer versions compared with the baseline versions can be found on the Phase 2 Performance Figures page. The tables on this page have been updated to reflect the latest figures.

The figures for the original baseline versions are now found on a separate page.

Introduction

There is a lot of variation in the capabilities of embedded microprocessors. Some are superscalar; others are not. Some have specialised vector instructions; others do not. Clock speeds can also vary considerably. All this means that "cycles per byte" or "megabytes per second" are pretty meaningless when trying to rank the algorithms on relative performance on any given microprocessor.

The approach I take here is "ChaChaPoly Units". The library contains a reasonably efficient 32-bit non-vectorized implementation of the ChaChaPoly AEAD scheme from my Arduino cryptography library. This makes it a known quantity that other algorithms can be compared with side by side.

If an algorithm is measured at 0.8 ChaChaPoly Units on a specific embedded microprocessor at a specific clock speed, then it runs at 0.8 times the speed of ChaChaPoly on that microprocessor, so it is slower. If the algorithm is instead measured at 2 ChaChaPoly Units, then it is twice as fast as ChaChaPoly on the same microprocessor. The higher the number of units, the better the algorithm.
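As a rough illustration of the definition above, the unit figure can be read as the ratio of two timings taken on the same board at the same clock speed. The sketch below is only an assumption about how such a ratio could be computed; the function name chachapoly_units and the sample timings are hypothetical and do not come from the library.

```python
def chachapoly_units(reference_seconds: float, candidate_seconds: float) -> float:
    """Return the candidate's speed relative to the reference (higher is faster).

    reference_seconds: time for ChaChaPoly to process one packet.
    candidate_seconds: time for the candidate to process the same packet.
    """
    if reference_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("timings must be positive")
    return reference_seconds / candidate_seconds


# Hypothetical timings for one packet on the same board (not measured data).
slower = chachapoly_units(100e-6, 125e-6)  # candidate takes longer: below 1 unit
faster = chachapoly_units(100e-6, 50e-6)   # candidate takes half the time: 2 units
print(f"{slower:.2f} {faster:.2f}")
```

Because both timings come from the same device, clock speed and most architectural effects cancel out of the ratio, which is the point of using a reference algorithm.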

The number of ChaChaPoly Units for each algorithm will vary for each microprocessor that is tested and for different choices of optimisation options. The figures below should be used as a rough guide to the relative performance of the algorithms, not an absolute measurement.

For hash algorithms we use BLAKE2s as the basic unit. BLAKE2s is based on ChaCha20 so it is the most logical hashing counterpart to ChaChaPoly.
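The same idea can be tried on a desktop machine using Python's standard hashlib, which includes BLAKE2s. The sketch below is only an illustration of a BLAKE2s-relative measurement: SHA-256 is used as a stand-in candidate because none of the NIST candidates ship with hashlib, the helper names are hypothetical, and host timings in Python say nothing about embedded performance.

```python
import hashlib
import timeit


def blake2s_units(candidate, sizes=(1024, 128, 16), repeats=2000):
    """Time candidate(data) against hashlib.blake2s for each input size.

    Returns a dict mapping input size in bytes to the speed ratio
    (above 1.0 means the candidate was faster than BLAKE2s).
    """
    units = {}
    for size in sizes:
        data = bytes(size)  # all-zero input of the requested length
        ref = timeit.timeit(lambda: hashlib.blake2s(data).digest(), number=repeats)
        cand = timeit.timeit(lambda: candidate(data), number=repeats)
        units[size] = ref / cand
    return units


def sha256_digest(data):
    """Stand-in candidate (an assumption; not one of the algorithms on this page)."""
    return hashlib.sha256(data).digest()


if __name__ == "__main__":
    for size, value in blake2s_units(sha256_digest).items():
        print(f"{size:5d} bytes: {value:.2f} BLAKE2s units")
```

The three input sizes mirror the 1024, 128 and 16 byte columns used in the hash tables on this page.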

This page details the performance results for 32-bit platforms. A separate page that details preliminary results for the 8-bit AVR platform can be found here.

The masking performance page contains comparisons of masked versions of the algorithms with their baseline versions.

Performance on ARM Cortex M3

All tests were run on an Arduino Due which is an ARM Cortex M3 running at 84MHz. The code was optimised for size rather than speed, which is the default optimisation option for the Arduino IDE. I found that "-Os" size optimisation often did better on the Due than "-O2" or "-O3" with the compiler that I had. Your own results may vary.

Each algorithm was tested with two packet sizes: 128 and 16 bytes. Some algorithms can have better performance on small packet sizes. The associated data is always zero-length.

Each value in the table below is the algorithm's speed relative to ChaChaPoly on the same packet size, expressed in ChaChaPoly Units. Higher numbers mean better performance. The table is ordered from best average performance down.

Where a NIST submission contains multiple algorithms in a family, bold italics indicates the primary algorithm in the family.

An asterisk (*) indicates algorithms that have been accelerated with assembly code.

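| Algorithm | Key Bits | Nonce Bits | Tag Bits | Encrypt 128 bytes | Decrypt 128 bytes | Encrypt 16 bytes | Decrypt 16 bytes | Average |
|---|---|---|---|---|---|---|---|---|
| COMET-128_CHAM-128/128 (Note 2) (*) | 128 | 128 | 128 | 1.57 | 1.56 | 2.91 | 2.69 | 2.05 |
| Schwaemm128-128 (SPARKLE) (*) | 128 | 128 | 128 | 1.60 | 1.58 | 2.84 | 2.39 | 2.01 |
| COMET-64_SPECK-64/128 (*) | 128 | 120 | 128 | 1.42 | 1.43 | 2.86 | 2.75 | 1.94 |
| Schwaemm256-128 (SPARKLE) (*) | 128 | 256 | 128 | 1.74 | 1.63 | 1.90 | 1.93 | 1.80 |
| ASCON-128a (*) | 128 | 128 | 128 | 1.86 | 1.70 | 1.80 | 1.78 | 1.78 |
| SATURNIN-Short (Note 1) | 256 | 128 | 256 | | | 1.82 | 1.66 | 1.73 |
| Schwaemm192-192 (SPARKLE) (*) | 192 | 192 | 192 | 1.47 | 1.50 | 1.98 | 1.81 | 1.68 |
| Xoodyak (*) | 128 | 128 | 128 | 1.66 | 1.51 | 1.73 | 1.60 | 1.62 |
| ASCON-128 (*) | 128 | 128 | 128 | 1.54 | 1.44 | 1.78 | 1.68 | 1.61 |
| ASCON-80pq (*) | 160 | 128 | 128 | 1.52 | 1.43 | 1.71 | 1.65 | 1.57 |
| TinyJAMBU-128 (*) | 128 | 96 | 64 | 0.93 | 0.95 | 1.63 | 1.61 | 1.21 |
| GIMLI-24 (*) | 256 | 128 | 128 | 1.08 | 1.09 | 1.29 | 1.28 | 1.18 |
| Schwaemm256-256 (SPARKLE) (*) | 256 | 256 | 256 | 1.18 | 1.16 | 1.15 | 1.09 | 1.14 |
| GIFT-COFB (*) | 128 | 128 | 128 | 1.01 | 1.01 | 1.16 | 1.15 | 1.08 |
| TinyJAMBU-192 (*) | 192 | 96 | 64 | 0.81 | 0.84 | 1.45 | 1.44 | 1.08 |
| COMET-64_CHAM-64/128 (*) | 128 | 120 | 128 | 0.70 | 0.75 | 1.35 | 1.37 | 0.97 |
| TinyJAMBU-256 (*) | 256 | 96 | 64 | 0.70 | 0.73 | 1.28 | 1.29 | 0.94 |
| Spook-128-384-su | 128 | 128 | 128 | 0.78 | 0.79 | 1.11 | 1.09 | 0.93 |
| Spook-128-384-mu | 128 | 128 | 128 | 0.78 | 0.79 | 1.10 | 1.09 | 0.93 |
| Spook-128-512-su | 256 | 128 | 128 | 0.92 | 0.93 | 0.88 | 0.89 | 0.90 |
| Spook-128-512-mu | 256 | 128 | 128 | 0.92 | 0.93 | 0.88 | 0.88 | 0.90 |
| SpoC-128 | 128 | 128 | 128 | 0.59 | 0.62 | 1.14 | 1.14 | 0.82 |
| HYENA (*) | 128 | 96 | 128 | 0.68 | 0.74 | 0.87 | 0.88 | 0.80 |
| DryGASCON128k16 (*) | 128 | 128 | 128 | 0.59 | 0.62 | 1.03 | 1.02 | 0.78 |
| SUNDAE-GIFT-0 (*) | 128 | 0 | 128 | 0.57 | 0.61 | 1.04 | 1.05 | 0.78 |
| Pyjamask-96-AEAD | 128 | 64 | 96 | 0.66 | 0.67 | 0.81 | 0.83 | 0.74 |
| ESTATE_TweGIFT-128 (Note 2) (*) | 128 | 128 | 128 | 0.53 | 0.57 | 1.04 | 1.04 | 0.74 |
| Pyjamask-128-AEAD | 128 | 96 | 128 | 0.67 | 0.63 | 0.80 | 0.79 | 0.72 |
| SUNDAE-GIFT-64 (*) | 128 | 64 | 128 | 0.54 | 0.58 | 0.84 | 0.86 | 0.69 |
| SUNDAE-GIFT-96 (*) | 128 | 96 | 128 | 0.54 | 0.58 | 0.83 | 0.85 | 0.69 |
| SUNDAE-GIFT-128 (*) | 128 | 128 | 128 | 0.54 | 0.58 | 0.81 | 0.83 | 0.68 |
| SATURNIN-CTR-Cascade | 256 | 128 | 256 | 0.39 | 0.42 | 0.42 | 0.44 | 0.42 |
| SPIX | 128 | 128 | 128 | 0.41 | 0.44 | 0.38 | 0.39 | 0.40 |
| LOTUS-AEAD | 128 | 128 | 64 | 0.29 | 0.31 | 0.56 | 0.58 | 0.40 |
| LOCUS-AEAD | 128 | 128 | 64 | 0.28 | 0.29 | 0.56 | 0.57 | 0.39 |
| KNOT-AEAD-128-256 | 128 | 128 | 128 | 0.29 | 0.31 | 0.47 | 0.49 | 0.38 |
| Grain-128AEAD | 128 | 96 | 64 | 0.26 | 0.26 | 0.56 | 0.56 | 0.37 |
| KNOT-AEAD-128-384 | 128 | 128 | 128 | 0.31 | 0.33 | 0.30 | 0.32 | 0.31 |
| SpoC-64 | 128 | 128 | 64 | 0.22 | 0.24 | 0.42 | 0.44 | 0.31 |
| SKINNY-AEAD-M6 | 128 | 96 | 64 | 0.19 | 0.20 | 0.33 | 0.34 | 0.25 |
| SKINNY-AEAD-M5 | 128 | 96 | 128 | 0.19 | 0.20 | 0.33 | 0.34 | 0.25 |
| Romulus-N3 | 128 | 96 | 128 | 0.19 | 0.20 | 0.30 | 0.31 | 0.24 |
| ACE | 128 | 128 | 128 | 0.20 | 0.22 | 0.23 | 0.24 | 0.22 |
| SKINNY-AEAD-M4 | 128 | 96 | 64 | 0.16 | 0.17 | 0.26 | 0.27 | 0.21 |
| SKINNY-AEAD-M3 | 128 | 128 | 64 | 0.16 | 0.17 | 0.26 | 0.27 | 0.21 |
| SKINNY-AEAD-M2 | 128 | 96 | 128 | 0.16 | 0.17 | 0.26 | 0.27 | 0.21 |
| SKINNY-AEAD-M1 | 128 | 128 | 128 | 0.16 | 0.17 | 0.26 | 0.27 | 0.21 |
| Romulus-N2 | 128 | 96 | 128 | 0.15 | 0.17 | 0.23 | 0.24 | 0.19 |
| Romulus-N1 | 128 | 128 | 128 | 0.15 | 0.17 | 0.21 | 0.22 | 0.19 |
| ISAP-A-128A (*) | 128 | 128 | 128 | 0.24 | 0.26 | 0.13 | 0.14 | 0.18 |
| KNOT-AEAD-192-384 | 192 | 192 | 192 | 0.15 | 0.17 | 0.21 | 0.22 | 0.18 |
| DryGASCON256k32 | 256 | 128 | 256 | 0.13 | 0.14 | 0.19 | 0.20 | 0.16 |
| Oribatida-256-64 | 128 | 128 | 128 | 0.12 | 0.13 | 0.22 | 0.23 | 0.16 |
| Romulus-M3 | 128 | 96 | 128 | 0.12 | 0.13 | 0.20 | 0.22 | 0.16 |
| Subterranean | 128 | 128 | 128 | 0.16 | 0.18 | 0.13 | 0.14 | 0.15 |
| PAEF-ForkSkinny-128-256 | 128 | 112 | 128 | 0.10 | 0.09 | 0.32 | 0.28 | 0.15 |
| PAEF-ForkSkinny-128-192 | 128 | 48 | 128 | 0.10 | 0.09 | 0.32 | 0.28 | 0.15 |
| SAEF-ForkSkinny-128-256 | 128 | 120 | 128 | 0.10 | 0.09 | 0.32 | 0.28 | 0.15 |
| SAEF-ForkSkinny-128-192 | 128 | 56 | 128 | 0.10 | 0.09 | 0.32 | 0.28 | 0.15 |
| Romulus-M2 | 128 | 96 | 128 | 0.10 | 0.11 | 0.17 | 0.18 | 0.14 |
| Romulus-M1 | 128 | 128 | 128 | 0.10 | 0.11 | 0.15 | 0.16 | 0.13 |
| KNOT-AEAD-256-512 | 256 | 256 | 256 | 0.10 | 0.12 | 0.12 | 0.13 | 0.12 |
| ORANGE-Zest | 128 | 128 | 128 | 0.11 | 0.12 | 0.11 | 0.12 | 0.11 |
| PAEF-ForkSkinny-128-288 | 128 | 104 | 128 | 0.07 | 0.06 | 0.24 | 0.20 | 0.11 |
| Oribatida-192-96 | 128 | 64 | 96 | 0.07 | 0.08 | 0.12 | 0.13 | 0.10 |
| PHOTON-Beetle-AEAD-ENC-128 | 128 | 128 | 128 | 0.06 | 0.07 | 0.11 | 0.12 | 0.08 |
| PAEF-ForkSkinny-64-192 | 128 | 48 | 64 | 0.05 | 0.05 | 0.18 | 0.14 | 0.08 |
| ISAP-A-128 (*) | 128 | 128 | 128 | 0.08 | 0.08 | 0.03 | 0.04 | 0.05 |
| Delirium (Elephant) | 128 | 96 | 128 | 0.04 | 0.05 | 0.06 | 0.07 | 0.05 |
| WAGE | 128 | 128 | 128 | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 |
| PHOTON-Beetle-AEAD-ENC-32 | 128 | 128 | 128 | 0.02 | 0.02 | 0.05 | 0.05 | 0.03 |
| ISAP-K-128A | 128 | 128 | 128 | 0.02 | 0.02 | 0.01 | 0.01 | 0.02 |
| Dumbo (Elephant) | 128 | 96 | 64 | 0.01 | 0.02 | 0.03 | 0.03 | 0.02 |
| Jumbo (Elephant) | 128 | 96 | 64 | 0.01 | 0.02 | 0.02 | 0.02 | 0.02 |
| ISAP-K-128 | 128 | 128 | 128 | 0.0034 | 0.0038 | 0.0015 | 0.0016 | 0.0021 |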

Note 1. SATURNIN-Short is limited to no more than 15 bytes of payload, so there are no performance figures for 128-byte packets, and the 16-byte columns report the results for 15 bytes of payload instead.

Note 2. COMET-128_CHAM-128/128 and ESTATE_TweGIFT-128 are not the primary members from the algorithm authors. Instead, the authors recommend AES-based versions of COMET and ESTATE, which are not implemented in this library.

The hash algorithms are compared against BLAKE2s instead of ChaChaPoly:

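| Algorithm | Hash Bits | 1024 bytes | 128 bytes | 16 bytes | Average |
|---|---|---|---|---|---|
| Esch256 (SPARKLE) (*) | 256 | 0.89 | 0.78 | 1.50 | 1.06 |
| Xoodyak (*) | 256 | 0.71 | 0.65 | 1.43 | 0.93 |
| GIMLI-24-HASH (*) | 256 | 0.54 | 0.47 | 0.86 | 0.62 |
| ASCON-HASH (*) | 256 | 0.51 | 0.41 | 0.63 | 0.52 |
| DryGASCON128-HASH (*) | 256 | 0.29 | 0.29 | 0.88 | 0.48 |
| Esch384 (SPARKLE) (*) | 384 | 0.45 | 0.37 | 1.50 | 0.47 |
| SATURNIN-Hash | 256 | 0.28 | 0.23 | 0.57 | 0.36 |
| ACE-HASH | 256 | 0.10 | 0.09 | 0.15 | 0.11 |
| DryGASCON256-HASH | 512 | 0.06 | 0.05 | 0.11 | 0.08 |
| KNOT-HASH-256-384 | 256 | 0.05 | 0.04 | 0.07 | 0.05 |
| KNOT-HASH-256-256 | 256 | 0.03 | 0.03 | 0.08 | 0.04 |
| Subterranean-Hash | 256 | 0.02 | 0.02 | 0.05 | 0.03 |
| ORANGISH | 256 | 0.02 | 0.02 | 0.03 | 0.02 |
| KNOT-HASH-384-384 | 384 | 0.01 | 0.01 | 0.04 | 0.02 |
| PHOTON-Beetle-HASH | 256 | 0.01 | 0.01 | 0.05 | 0.02 |
| SKINNY-tk3-HASH | 256 | 0.02 | 0.01 | 0.02 | 0.02 |
| KNOT-HASH-512-512 | 512 | 0.01 | 0.01 | 0.02 | 0.01 |
| SKINNY-tk2-HASH | 256 | 0.01 | 0.01 | 0.02 | 0.01 |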

Performance on ESP32

The tests below were run on an ESP32 Dev Module running at 240MHz. The ordering is mostly the same as on the ARM Cortex M3, with a few reversals where the architectural differences give some algorithms an added advantage.

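| Algorithm | Key Bits | Nonce Bits | Tag Bits | Encrypt 128 bytes | Decrypt 128 bytes | Encrypt 16 bytes | Decrypt 16 bytes | Average |
|---|---|---|---|---|---|---|---|---|
| SATURNIN-Short | 256 | 128 | 256 | | | 1.62 | 1.63 | 1.62 |
| COMET-128_CHAM-128/128 | 128 | 128 | 128 | 1.17 | 1.11 | 1.88 | 1.73 | 1.43 |
| COMET-64_SPECK-64/128 | 128 | 120 | 128 | 1.02 | 1.02 | 2.04 | 1.94 | 1.39 |
| Schwaemm128-128 (SPARKLE) | 128 | 128 | 128 | 1.07 | 1.06 | 1.68 | 1.60 | 1.32 |
| Schwaemm256-128 (SPARKLE) | 128 | 256 | 128 | 1.11 | 1.09 | 1.04 | 1.04 | 1.06 |
| Schwaemm192-192 (SPARKLE) | 192 | 192 | 192 | 0.87 | 0.90 | 1.02 | 1.00 | 0.95 |
| ASCON-128a | 128 | 128 | 128 | 0.86 | 0.88 | 0.92 | 0.93 | 0.90 |
| GIFT-COFB | 128 | 128 | 128 | 0.80 | 0.83 | 0.90 | 0.90 | 0.86 |
| Xoodyak | 128 | 128 | 128 | 0.84 | 0.86 | 0.80 | 0.82 | 0.83 |
| TinyJAMBU-128 | 128 | 96 | 64 | 0.60 | 0.64 | 1.16 | 1.15 | 0.83 |
| TinyJAMBU-192 | 192 | 96 | 64 | 0.54 | 0.58 | 1.06 | 1.06 | 0.75 |
| GIMLI-24 | 256 | 128 | 128 | 0.65 | 0.69 | 0.79 | 0.81 | 0.74 |
| Schwaemm256-256 (SPARKLE) | 256 | 256 | 256 | 0.77 | 0.78 | 0.70 | 0.70 | 0.73 |
| Spook-128-384-su | 128 | 128 | 128 | 0.58 | 0.62 | 0.80 | 0.81 | 0.70 |
| Spook-128-384-mu | 128 | 128 | 128 | 0.58 | 0.62 | 0.80 | 0.80 | 0.70 |
| Spook-128-512-su | 256 | 128 | 128 | 0.71 | 0.74 | 0.63 | 0.65 | 0.68 |
| Spook-128-512-mu | 256 | 128 | 128 | 0.71 | 0.74 | 0.63 | 0.65 | 0.67 |
| TinyJAMBU-256 | 256 | 96 | 64 | 0.47 | 0.51 | 0.94 | 0.95 | 0.67 |
| HYENA | 128 | 96 | 128 | 0.55 | 0.60 | 0.70 | 0.72 | 0.64 |
| SUNDAE-GIFT-0 | 128 | 0 | 128 | 0.47 | 0.52 | 0.85 | 0.87 | 0.64 |
| ASCON-128 | 128 | 128 | 128 | 0.67 | 0.46 | 0.86 | 0.66 | 0.63 |
| ASCON-80pq | 160 | 128 | 128 | 0.67 | 0.44 | 0.84 | 0.61 | 0.61 |
| COMET-64_CHAM-64/128 | 128 | 120 | 128 | 0.42 | 0.46 | 0.82 | 0.84 | 0.59 |
| KNOT-AEAD-128-384 | 128 | 128 | 128 | 0.57 | 0.61 | 0.57 | 0.59 | 0.59 |
| KNOT-AEAD-128-256 | 128 | 128 | 128 | 0.45 | 0.49 | 0.73 | 0.75 | 0.59 |
| SUNDAE-GIFT-64 | 128 | 64 | 128 | 0.44 | 0.49 | 0.69 | 0.71 | 0.57 |
| SUNDAE-GIFT-96 | 128 | 96 | 128 | 0.44 | 0.49 | 0.68 | 0.70 | 0.57 |
| SUNDAE-GIFT-128 | 128 | 128 | 128 | 0.44 | 0.49 | 0.67 | 0.70 | 0.57 |
| SpoC-128 | 128 | 128 | 128 | 0.40 | 0.44 | 0.77 | 0.78 | 0.56 |
| ESTATE_TweGIFT-128 | 128 | 128 | 128 | 0.38 | 0.42 | 0.74 | 0.76 | 0.54 |
| Pyjamask-96-AEAD | 128 | 64 | 96 | 0.46 | 0.49 | 0.54 | 0.56 | 0.51 |
| Pyjamask-128-AEAD | 128 | 96 | 128 | 0.44 | 0.46 | 0.51 | 0.52 | 0.48 |
| Grain-128AEAD | 128 | 96 | 64 | 0.33 | 0.30 | 0.65 | 0.59 | 0.43 |
| SATURNIN-CTR-Cascade | 256 | 128 | 256 | 0.34 | 0.37 | 0.38 | 0.39 | 0.37 |
| DryGASCON128k16 | 128 | 128 | 128 | 0.23 | 0.25 | 0.40 | 0.42 | 0.31 |
| SpoC-64 | 128 | 128 | 64 | 0.22 | 0.25 | 0.41 | 0.43 | 0.31 |
| LOTUS-AEAD | 128 | 128 | 64 | 0.20 | 0.22 | 0.40 | 0.42 | 0.29 |
| LOCUS-AEAD | 128 | 128 | 64 | 0.19 | 0.22 | 0.39 | 0.41 | 0.28 |
| SPIX | 128 | 128 | 128 | 0.27 | 0.30 | 0.24 | 0.26 | 0.26 |
| KNOT-AEAD-192-384 | 192 | 192 | 192 | 0.28 | 0.32 | 0.39 | 0.41 | 0.25 |
| Oribatida-256-64 | 128 | 128 | 128 | 0.17 | 0.19 | 0.34 | 0.35 | 0.25 |
| Oribatida-192-96 | 128 | 64 | 96 | 0.17 | 0.19 | 0.28 | 0.30 | 0.22 |
| KNOT-AEAD-256-512 | 256 | 256 | 256 | 0.18 | 0.21 | 0.21 | 0.23 | 0.21 |
| DryGASCON256k32 | 256 | 128 | 256 | 0.16 | 0.18 | 0.25 | 0.27 | 0.21 |
| ACE | 128 | 128 | 128 | 0.13 | 0.15 | 0.15 | 0.16 | 0.15 |
| SKINNY-AEAD-M6 | 128 | 96 | 64 | 0.11 | 0.12 | 0.19 | 0.20 | 0.15 |
| SKINNY-AEAD-M5 | 128 | 96 | 128 | 0.11 | 0.12 | 0.19 | 0.20 | 0.15 |
| Romulus-N3 | 128 | 96 | 128 | 0.10 | 0.12 | 0.17 | 0.19 | 0.14 |
| SKINNY-AEAD-M4 | 128 | 96 | 64 | 0.09 | 0.10 | 0.15 | 0.16 | 0.12 |
| SKINNY-AEAD-M3 | 128 | 128 | 64 | 0.09 | 0.10 | 0.14 | 0.16 | 0.12 |
| SKINNY-AEAD-M2 | 128 | 96 | 128 | 0.09 | 0.10 | 0.14 | 0.15 | 0.12 |
| SKINNY-AEAD-M1 | 128 | 128 | 128 | 0.09 | 0.10 | 0.14 | 0.15 | 0.12 |
| Subterranean | 128 | 128 | 128 | 0.12 | 0.14 | 0.10 | 0.11 | 0.12 |
| Romulus-N2 | 128 | 96 | 128 | 0.09 | 0.10 | 0.13 | 0.14 | 0.11 |
| Romulus-N1 | 128 | 128 | 128 | 0.09 | 0.10 | 0.12 | 0.13 | 0.11 |
| ISAP-A-128A | 128 | 128 | 128 | 0.13 | 0.15 | 0.08 | 0.09 | 0.10 |
| ORANGE-Zest | 128 | 128 | 128 | 0.09 | 0.11 | 0.10 | 0.11 | 0.10 |
| PAEF-ForkSkinny-128-256 | 128 | 112 | 128 | 0.06 | 0.07 | 0.21 | 0.20 | 0.10 |
| SAEF-ForkSkinny-128-256 | 128 | 120 | 128 | 0.06 | 0.06 | 0.21 | 0.20 | 0.10 |
| PAEF-ForkSkinny-128-192 | 128 | 48 | 128 | 0.06 | 0.07 | 0.21 | 0.20 | 0.10 |
| SAEF-ForkSkinny-128-192 | 128 | 56 | 128 | 0.06 | 0.06 | 0.21 | 0.20 | 0.10 |
| Romulus-M3 | 128 | 96 | 128 | 0.07 | 0.08 | 0.12 | 0.13 | 0.09 |
| Romulus-M2 | 128 | 96 | 128 | 0.06 | 0.07 | 0.09 | 0.10 | 0.08 |
| Romulus-M1 | 128 | 128 | 128 | 0.06 | 0.07 | 0.09 | 0.10 | 0.08 |
| PHOTON-Beetle-AEAD-ENC-128 | 128 | 128 | 128 | 0.05 | 0.06 | 0.10 | 0.11 | 0.08 |
| PAEF-ForkSkinny-128-288 | 128 | 104 | 128 | 0.05 | 0.05 | 0.16 | 0.14 | 0.08 |
| PAEF-ForkSkinny-64-192 | 128 | 48 | 64 | 0.04 | 0.04 | 0.14 | 0.12 | 0.07 |
| Delirium (Elephant) | 128 | 96 | 128 | 0.05 | 0.06 | 0.07 | 0.08 | 0.06 |
| WAGE | 128 | 128 | 128 | 0.03 | 0.04 | 0.04 | 0.04 | 0.04 |
| ISAP-K-128A | 128 | 128 | 128 | 0.03 | 0.03 | 0.02 | 0.02 | 0.02 |
| ISAP-A-128 | 128 | 128 | 128 | 0.03 | 0.03 | 0.01 | 0.02 | 0.02 |
| PHOTON-Beetle-AEAD-ENC-32 | 128 | 128 | 128 | 0.01 | 0.02 | 0.04 | 0.04 | 0.02 |
| Dumbo (Elephant) | 128 | 96 | 64 | 0.01 | 0.01 | 0.02 | 0.02 | 0.02 |
| Jumbo (Elephant) | 128 | 96 | 64 | 0.01 | 0.01 | 0.01 | 0.02 | 0.01 |
| ISAP-K-128 | 128 | 128 | 128 | 0.0040 | 0.0047 | 0.0018 | 0.0020 | 0.0025 |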

Hash algorithms:

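| Algorithm | Hash Bits | 1024 bytes | 128 bytes | 16 bytes | Average |
|---|---|---|---|---|---|
| Xoodyak | 256 | 0.35 | 0.33 | 0.73 | 0.47 |
| Esch256 (SPARKLE) | 256 | 0.38 | 0.34 | 0.64 | 0.45 |
| GIMLI-24-HASH | 256 | 0.35 | 0.29 | 0.50 | 0.38 |
| SATURNIN-Hash | 256 | 0.23 | 0.19 | 0.48 | 0.30 |
| Esch384 (SPARKLE) | 384 | 0.24 | 0.20 | 0.30 | 0.25 |
| ASCON-HASH | 256 | 0.19 | 0.16 | 0.25 | 0.20 |
| DryGASCON128-HASH | 256 | 0.10 | 0.10 | 0.34 | 0.18 |
| KNOT-HASH-256-384 | 256 | 0.09 | 0.07 | 0.13 | 0.10 |
| DryGASCON256-HASH | 512 | 0.08 | 0.07 | 0.15 | 0.10 |
| ACE-HASH | 256 | 0.07 | 0.06 | 0.10 | 0.07 |
| KNOT-HASH-256-256 | 256 | 0.04 | 0.04 | 0.13 | 0.07 |
| KNOT-HASH-384-384 | 384 | 0.03 | 0.03 | 0.07 | 0.04 |
| SKINNY-tk3-HASH | 256 | 0.07 | 0.01 | 0.01 | 0.03 |
| SKINNY-tk2-HASH | 256 | 0.05 | 0.03 | 0.01 | 0.03 |
| ORANGISH | 256 | 0.02 | 0.02 | 0.03 | 0.02 |
| KNOT-HASH-512-512 | 512 | 0.02 | 0.02 | 0.04 | 0.02 |
| PHOTON-Beetle-HASH | 256 | 0.01 | 0.01 | 0.05 | 0.02 |
| Subterranean-Hash | 256 | 0.01 | 0.01 | 0.04 | 0.02 |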

Overall group rankings

Based on the above data, the NIST submissions can be roughly grouped with those of similar performance. Changes in CPU, optimisation options, loop unrolling, or assembly code replacement might modify the rank of an algorithm.

Only the primary algorithm in each family is considered for this ranking. I took the average of the ARM Cortex M3 and ESP32 figures from the above tables to compute an average across different architectures. I then grouped the algorithms into 0.1-wide buckets; for example everything with rank 3 has an average between 0.30 and 0.39 ChaChaPoly units.
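The bucketing described above can be sketched in a few lines. This is my reading of the description rather than the script used to produce the rankings: the function name rank_bucket is hypothetical, and working in integer hundredths is an assumption made here so that binary floating-point rounding cannot push an average across a bucket edge. The sample figures are the "Average" columns of the two AEAD tables on this page.

```python
def rank_bucket(cortex_m3_avg: float, esp32_avg: float) -> int:
    """Average two table figures and return the 0.1-wide bucket number.

    For example, an average anywhere from 0.30 to 0.39 gives rank 3.
    """
    # Convert each two-decimal figure to whole hundredths.
    total_hundredths = round(cortex_m3_avg * 100) + round(esp32_avg * 100)
    # The mean is total/2 hundredths; one bucket is 10 hundredths wide,
    # so the bucket number is floor(total / 20).
    return total_hundredths // 20


# (Cortex M3 average, ESP32 average) taken from the tables on this page.
figures = {
    "COMET": (2.05, 1.43),
    "Xoodyak": (1.62, 0.83),
    "GIFT-COFB": (1.08, 0.86),
    "SPIX": (0.40, 0.26),
}

for name, (m3, esp32) in figures.items():
    print(name, rank_bucket(m3, esp32))
```

For these four entries the sketch gives 17, 12, 9 and 3, which matches the ranks listed for them in the AEAD ranking table.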

AEAD algorithm rankings:

| Rank | Algorithms |
|---|---|
| 17 | COMET |
| 14 | SPARKLE |
| 12 | Xoodyak |
| 11 | ASCON |
| 10 | TinyJAMBU |
| 9 | GIFT-COFB, Gimli |
| 7 | HYENA, Spook |
| 6 | ESTATE, Pyjamask, SUNDAE-GIFT |
| 5 | DryGASCON |
| 4 | Grain128-AEAD, KNOT |
| 3 | LOTUS, Saturnin, SPIX, SpoC |
| 2 | Oribatida |
| 1 | ACE, ORANGE, Romulus, SKINNY-AEAD, Subterranean |
| 0 | Elephant, ForkAE, ISAP, PHOTON-Beetle, WAGE |

Hash algorithm rankings:

| Rank | Algorithms |
|---|---|
| 7 | SPARKLE, Xoodyak |
| 5 | Gimli |
| 3 | ASCON, DryGASCON, Saturnin |
| 0 | ACE, KNOT, ORANGE, PHOTON-Beetle, SKINNY-AEAD, Subterranean |