ASCON Suite
The core of the ASCON Suite of algorithms is the ASCON permutation. The ascon_permute() function operates on a 320-bit (40-byte) input through a number of rounds, producing a 320-bit output. The number of rounds is configurable between 1 and 12.
Standard ASCON modes typically use 6, 8, or 12 rounds so there are convenient macros for those values: ascon_permute6(), ascon_permute8(), and ascon_permute12().
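As a small illustration of how a round count maps onto the permutation, the ASCON specification numbers the 12 rounds from 0 to 11, and a reduced-round call runs only the last ones. The sketch below computes which round a 6-, 8-, or 12-round call starts at and the constant each round uses. The names demo_first_round() and demo_round_constant() are hypothetical helpers for this page, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the first round that is executed when
 * running `rounds` rounds out of the full 12.  Reduced-round calls use
 * the last `rounds` rounds, so 6 rounds means rounds 6..11. */
static unsigned demo_first_round(unsigned rounds)
{
    assert(rounds >= 1 && rounds <= 12);
    return 12 - rounds;
}

/* Round constant that is XORed into the low byte of state word x2
 * in round r (0..11), following the ASCON specification. */
static uint8_t demo_round_constant(unsigned r)
{
    assert(r < 12);
    return (uint8_t)(((0x0F - r) << 4) | r);
}
```

A 12-round call therefore starts with constant 0xF0, an 8-round call with 0xB4, and a 6-round call with 0x96; all three finish with 0x4B.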
The permutation implementations for each platform are located in the src/core directory. Each backend provides a "state and permutation" (SnP) interface to the rest of the library and user applications.
Most user applications have no need for the SnP API because they can use the relevant encryption or hashing mode from the library directly. The public SnP API is provided just in case an application has a specific requirement that isn't met by the standard modes. This page describes how to use the permutation directly should the application need to do so.
The permutation API is declared in the header permutation.h, included into application code as follows:
The 320-bit input (and output) to the permutation is held in a structure called ascon_state_t. The state must be initialized before it can be used by the application:
Initialization sets all of the bits of the ASCON state to zero.
Once the application is finished with the state, it must finalize it with ascon_free():
Finalization will attempt to destroy any sensitive material still in the "state" structure. But the compiler's optimizer may interfere with the execution order of the program and defeat the attempt. Or there may still be fragments of the sensitive material elsewhere on the stack or in registers. All we can do is a "best effort" to clean up the state given the system's constraints.
For buffers in your own code, use ascon_clean() to destroy sensitive material when you no longer need it. The same caveats about the compiler apply.
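To make the "best effort" caveat concrete, here is a minimal sketch of how such a cleanup routine is commonly written in portable C. demo_clean() is a hypothetical stand-in, not the library's implementation of ascon_clean(). It writes through a volatile pointer so that the compiler is discouraged from discarding the stores as dead code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a best-effort cleanup routine.  The volatile
 * qualifier asks the compiler to perform every store, but nothing here
 * can scrub copies that live in registers or elsewhere on the stack. */
static void demo_clean(void *buf, size_t size)
{
    volatile uint8_t *p = (volatile uint8_t *)buf;
    assert(buf != NULL || size == 0);
    while (size-- > 0)
        *p++ = 0;
}
```

In application code you would call the cleanup function on key buffers and temporary blocks just before they go out of scope.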
Modes that use ASCON typically start by absorbing an initialization vector value into the state and then calling the permutation. The following is the initialization sequence for the ASCON-HASHA algorithm:
We can use either ascon_add_bytes() or ascon_overwrite_bytes() to absorb the initialization vector; ascon_overwrite_bytes() is slightly faster for this use case because we know that ascon_init() set the state to all-zeroes.
ASCON-HASHA absorbs the incoming data in blocks of 8 bytes in length, invoking the permutation for 8 rounds between each block:
The last partial input block is padded with a 0x80 byte and then the permutation is run for a full 12 rounds:
PRNG algorithms based on a sponge function protect themselves from being run backwards by extracting some data and XOR'ing it back into the state to zero out those bytes. The state is protected from being run backwards because an attacker would need to guess the bytes before they were zeroed:
A more efficient method that doesn't need a temporary buffer is to use the function ascon_overwrite_with_zeroes() instead:
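The two approaches can be compared with a small model. The sketch below assumes a hypothetical demo_state_t that simply holds 40 bytes in canonical order; the real ascon_state_t is opaque and must only be touched through the SnP functions. It shows that extracting bytes and XOR'ing them back in leaves the same state as overwriting those bytes with zeroes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_STATE_SIZE 40

/* Hypothetical stand-in for the opaque state: 40 bytes, canonical order. */
typedef struct { uint8_t b[DEMO_STATE_SIZE]; } demo_state_t;

/* Method 1: extract the leading bytes, then XOR them back into the state. */
static void demo_rekey_with_xor(demo_state_t *s, unsigned size)
{
    uint8_t tmp[DEMO_STATE_SIZE];
    assert(size <= DEMO_STATE_SIZE);
    memcpy(tmp, s->b, size);            /* "extract" */
    for (unsigned i = 0; i < size; ++i)
        s->b[i] ^= tmp[i];              /* "add": x ^ x == 0 */
    memset(tmp, 0, sizeof(tmp));        /* best-effort cleanup of the copy */
}

/* Method 2: overwrite the same bytes with zeroes; no temporary needed. */
static void demo_rekey_with_zeroes(demo_state_t *s, unsigned size)
{
    assert(size <= DEMO_STATE_SIZE);
    memset(s->b, 0, size);
}
```

In a real PRNG the permutation would be run again after the bytes are zeroed, so that the next output depends on the part of the state the attacker never saw.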
ASCON-HASHA extracts 32 bytes from the permutation state for the hash value, 8 bytes at a time. This is done with the ascon_extract_bytes() function:
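The absorb, pad, and squeeze steps described above can be sketched end to end in portable C. This is a self-contained illustration and does not use the library: every demo_* name is hypothetical, the state is held as five big-endian 64-bit words, and the initialization vector constant is the ASCON-HASHA value as we understand it from the ASCON v1.2 specification, so check it against the specification before relying on the output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t demo_ror(uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

/* Portable sketch of the permutation: runs the last `rounds` of 12 rounds. */
static void demo_permute(uint64_t x[5], unsigned rounds)
{
    assert(rounds >= 1 && rounds <= 12);
    for (unsigned r = 12 - rounds; r < 12; ++r) {
        x[2] ^= ((0x0FULL - r) << 4) | r;              /* round constant */
        x[0] ^= x[4]; x[4] ^= x[3]; x[2] ^= x[1];      /* substitution layer */
        uint64_t t0 = ~x[0] & x[1], t1 = ~x[1] & x[2], t2 = ~x[2] & x[3];
        uint64_t t3 = ~x[3] & x[4], t4 = ~x[4] & x[0];
        x[0] ^= t1; x[1] ^= t2; x[2] ^= t3; x[3] ^= t4; x[4] ^= t0;
        x[1] ^= x[0]; x[0] ^= x[4]; x[3] ^= x[2]; x[2] = ~x[2];
        x[0] ^= demo_ror(x[0], 19) ^ demo_ror(x[0], 28); /* linear layer */
        x[1] ^= demo_ror(x[1], 61) ^ demo_ror(x[1], 39);
        x[2] ^= demo_ror(x[2], 1) ^ demo_ror(x[2], 6);
        x[3] ^= demo_ror(x[3], 10) ^ demo_ror(x[3], 17);
        x[4] ^= demo_ror(x[4], 7) ^ demo_ror(x[4], 41);
    }
}

/* XOR one byte into canonical (big-endian) byte position i of the state. */
static void demo_add_byte(uint64_t x[5], unsigned i, uint8_t b)
{
    x[i / 8] ^= (uint64_t)b << (56 - 8 * (i % 8));
}

/* Read canonical byte position i of the state. */
static uint8_t demo_get_byte(const uint64_t x[5], unsigned i)
{
    return (uint8_t)(x[i / 8] >> (56 - 8 * (i % 8)));
}

/* Sketch of the hashing flow: IV, absorb 8-byte blocks, pad, squeeze. */
static void demo_hasha(uint8_t out[32], const uint8_t *in, size_t len)
{
    uint64_t x[5] = { 0x00400c0400000100ULL, 0, 0, 0, 0 }; /* assumed IV */
    unsigned i;
    demo_permute(x, 12);
    while (len >= 8) {                      /* full blocks, 8 rounds each */
        for (i = 0; i < 8; ++i) demo_add_byte(x, i, in[i]);
        demo_permute(x, 8);
        in += 8; len -= 8;
    }
    for (i = 0; i < len; ++i) demo_add_byte(x, i, in[i]); /* partial block */
    demo_add_byte(x, (unsigned)len, 0x80);  /* padding byte */
    demo_permute(x, 12);                    /* full 12 rounds after padding */
    for (unsigned block = 0; block < 4; ++block) {  /* 4 x 8 output bytes */
        for (i = 0; i < 8; ++i) out[block * 8 + i] = demo_get_byte(x, i);
        if (block < 3) demo_permute(x, 8);
    }
}
```

With the library itself, the byte loops would be replaced by ascon_add_bytes() and ascon_extract_bytes() calls on an ascon_state_t, which lets the backend choose its own internal bit order.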
Full source code for the hashing example
If we were implementing an encryption mode, we would start by absorbing the 16-byte key and the 16-byte nonce after the initialization vector, but before the permutation call:
There are many ways to encrypt using a permutation like ASCON. The library implements several of them for its AEAD modes. We will explore two methods, loosely based on the Output FeedBack (OFB) and Cipher FeedBack (CFB) modes for block ciphers.
In OFB mode, the permutation is iterated and extracted output is XOR'ed with the plaintext m to produce the ciphertext c:
The ascon_extract_and_add_bytes() function extracts bytes from the permutation state and XOR's them with the input buffer in its second argument to produce the bytes in the output buffer in its third argument. The permutation state itself is unmodified.
The OFB encryption function can also be used for decryption by swapping the m and c arguments. It is its own inverse.
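The following self-contained sketch models the OFB-style construction, including the key and nonce setup, without using the library. All demo_* names are hypothetical, the IV constant is a placeholder rather than a registered value, and the 8-byte rate and 8 rounds per block are assumptions chosen for illustration. It is not a vetted cipher.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t demo_ror(uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

/* Portable sketch of the permutation: runs the last `rounds` of 12 rounds. */
static void demo_permute(uint64_t x[5], unsigned rounds)
{
    assert(rounds >= 1 && rounds <= 12);
    for (unsigned r = 12 - rounds; r < 12; ++r) {
        x[2] ^= ((0x0FULL - r) << 4) | r;              /* round constant */
        x[0] ^= x[4]; x[4] ^= x[3]; x[2] ^= x[1];      /* substitution layer */
        uint64_t t0 = ~x[0] & x[1], t1 = ~x[1] & x[2], t2 = ~x[2] & x[3];
        uint64_t t3 = ~x[3] & x[4], t4 = ~x[4] & x[0];
        x[0] ^= t1; x[1] ^= t2; x[2] ^= t3; x[3] ^= t4; x[4] ^= t0;
        x[1] ^= x[0]; x[0] ^= x[4]; x[3] ^= x[2]; x[2] = ~x[2];
        x[0] ^= demo_ror(x[0], 19) ^ demo_ror(x[0], 28); /* linear layer */
        x[1] ^= demo_ror(x[1], 61) ^ demo_ror(x[1], 39);
        x[2] ^= demo_ror(x[2], 1) ^ demo_ror(x[2], 6);
        x[3] ^= demo_ror(x[3], 10) ^ demo_ror(x[3], 17);
        x[4] ^= demo_ror(x[4], 7) ^ demo_ror(x[4], 41);
    }
}

/* XOR one byte into canonical (big-endian) byte position i of the state. */
static void demo_add_byte(uint64_t x[5], unsigned i, uint8_t b)
{
    x[i / 8] ^= (uint64_t)b << (56 - 8 * (i % 8));
}

/* Read canonical byte position i of the state. */
static uint8_t demo_get_byte(const uint64_t x[5], unsigned i)
{
    return (uint8_t)(x[i / 8] >> (56 - 8 * (i % 8)));
}

/* Hypothetical setup: IV in bytes 0-7, key in bytes 8-23, nonce in
 * bytes 24-39, followed by 12 rounds of the permutation. */
static void demo_keyed_init(uint64_t x[5], const uint8_t key[16],
                            const uint8_t nonce[16])
{
    x[0] = 0x0123456789abcdefULL;       /* placeholder IV, illustration only */
    x[1] = x[2] = x[3] = x[4] = 0;
    for (unsigned i = 0; i < 16; ++i) {
        demo_add_byte(x, 8 + i, key[i]);
        demo_add_byte(x, 24 + i, nonce[i]);
    }
    demo_permute(x, 12);
}

/* OFB-style: XOR extracted state bytes with the input.  The data never
 * enters the state, so the same function encrypts and decrypts. */
static void demo_ofb_crypt(uint64_t x[5], uint8_t *out, const uint8_t *in,
                           size_t len)
{
    while (len > 0) {
        size_t n = len < 8 ? len : 8;
        for (size_t i = 0; i < n; ++i)
            out[i] = (uint8_t)(in[i] ^ demo_get_byte(x, (unsigned)i));
        demo_permute(x, 8);             /* next keystream block */
        in += n; out += n; len -= n;
    }
}
```

Decryption re-runs demo_keyed_init() with the same key and nonce and then calls demo_ofb_crypt() with the ciphertext as the input.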
In CFB mode, we incorporate the plaintext into the state. Future blocks depend not only on the permutation state but also on all of the plaintext to date. This is a common construction for ciphers when we want to authenticate the plaintext in addition to encrypting it:
The ascon_add_bytes() call XOR's the plaintext block with the state to encrypt it and incorporate it into the state. The ascon_extract_bytes() call pulls out the encrypted data.
CFB decryption has a slightly different structure to encryption:
The ascon_extract_and_add_bytes() call extracts some of the state and XOR's it with the ciphertext block to decrypt it. The original ciphertext block is then incorporated into the state for the next step.
There is a problem with this code, however: it will not work if m and c are the same buffer for in-place decryption, because the ciphertext block is overwritten with plaintext before it can be incorporated into the state. We could save the ciphertext block in a temporary location before the call to ascon_extract_and_add_bytes(). A more efficient method is to use ascon_extract_and_overwrite_bytes():
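The in-place issue is easiest to see in a byte-level model. The sketch below is self-contained and does not use the library; the demo_* names are hypothetical, and the 8-byte rate and 8 rounds per block are assumptions. The decryption loop reads each ciphertext byte before writing the plaintext byte, which is the behaviour we rely on when m and c alias, and it assumes the state was already set up with the IV, key, and nonce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t demo_ror(uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

/* Portable sketch of the permutation: runs the last `rounds` of 12 rounds. */
static void demo_permute(uint64_t x[5], unsigned rounds)
{
    assert(rounds >= 1 && rounds <= 12);
    for (unsigned r = 12 - rounds; r < 12; ++r) {
        x[2] ^= ((0x0FULL - r) << 4) | r;              /* round constant */
        x[0] ^= x[4]; x[4] ^= x[3]; x[2] ^= x[1];      /* substitution layer */
        uint64_t t0 = ~x[0] & x[1], t1 = ~x[1] & x[2], t2 = ~x[2] & x[3];
        uint64_t t3 = ~x[3] & x[4], t4 = ~x[4] & x[0];
        x[0] ^= t1; x[1] ^= t2; x[2] ^= t3; x[3] ^= t4; x[4] ^= t0;
        x[1] ^= x[0]; x[0] ^= x[4]; x[3] ^= x[2]; x[2] = ~x[2];
        x[0] ^= demo_ror(x[0], 19) ^ demo_ror(x[0], 28); /* linear layer */
        x[1] ^= demo_ror(x[1], 61) ^ demo_ror(x[1], 39);
        x[2] ^= demo_ror(x[2], 1) ^ demo_ror(x[2], 6);
        x[3] ^= demo_ror(x[3], 10) ^ demo_ror(x[3], 17);
        x[4] ^= demo_ror(x[4], 7) ^ demo_ror(x[4], 41);
    }
}

/* XOR one byte into canonical (big-endian) byte position i of the state. */
static void demo_add_byte(uint64_t x[5], unsigned i, uint8_t b)
{
    x[i / 8] ^= (uint64_t)b << (56 - 8 * (i % 8));
}

/* Read canonical byte position i of the state. */
static uint8_t demo_get_byte(const uint64_t x[5], unsigned i)
{
    return (uint8_t)(x[i / 8] >> (56 - 8 * (i % 8)));
}

/* CFB-style encryption: the plaintext is XORed into the state and the
 * resulting state bytes are the ciphertext. */
static void demo_cfb_encrypt(uint64_t x[5], uint8_t *c, const uint8_t *m,
                             size_t len)
{
    while (len > 0) {
        size_t n = len < 8 ? len : 8;
        for (size_t i = 0; i < n; ++i) {
            demo_add_byte(x, (unsigned)i, m[i]);
            c[i] = demo_get_byte(x, (unsigned)i);
        }
        demo_permute(x, 8);
        m += n; c += n; len -= n;
    }
}

/* CFB-style decryption that is safe when m == c: each ciphertext byte is
 * read before the plaintext byte is stored, and the state byte ends up
 * equal to the ciphertext byte (old ^ (old ^ ct) == ct). */
static void demo_cfb_decrypt(uint64_t x[5], uint8_t *m, const uint8_t *c,
                             size_t len)
{
    while (len > 0) {
        size_t n = len < 8 ? len : 8;
        for (size_t i = 0; i < n; ++i) {
            uint8_t ct = c[i];
            uint8_t pt = (uint8_t)(demo_get_byte(x, (unsigned)i) ^ ct);
            m[i] = pt;
            demo_add_byte(x, (unsigned)i, pt);
        }
        demo_permute(x, 8);
        m += n; c += n; len -= n;
    }
}
```

Because both directions leave the ciphertext block in the state before the permutation runs, the encrypting and decrypting sides stay in step from one block to the next.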
If we were building an authenticated cipher based around CFB mode, we would next extract some bytes from the permutation state to act as the authentication tag:
Full source code for the CFB encryption example
Normally you initialize the ASCON state, perform the add or extract operations you need, and then free the state in a single sequence. Combined high-level functions in the API like ascon_hash() and ascon128_aead_encrypt() do this.
For incremental APIs, there may be an unknown amount of time between uses of the permutation. Incremental functions like ascon_hash_update() and ascon_xof_squeeze() work this way.
If the backend is using a shared accelerated hardware module then you don't want to keep it tied up indefinitely. Other tasks on the system may want to use it.
The appropriate thing to do is to call ascon_release() to return the hardware module to the system, and then call ascon_acquire() when you want to resume using ASCON:
The ASCON state is implicitly acquired when ascon_init() is called and implicitly released when ascon_free() is called. The ascon_release() and ascon_acquire() functions are not recursive and do not "count" how many acquisitions are in effect by the current task. The behaviour is undefined if they are called in the wrong order.
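The ordering rules can be pictured with a toy model of a backend that owns a single shared hardware unit. Everything below is hypothetical and exists only to illustrate the rules; it says nothing about how a real backend is implemented. The assertions stand in for the "undefined behaviour" that results from calling the functions in the wrong order, and letting demo_free() accept an already released state is a modelling choice.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a state that may hold a shared hardware unit. */
typedef struct { bool acquired; } demo_state_t;

static bool demo_hw_busy = false;   /* stands in for the shared accelerator */

/* Not recursive: acquiring twice in a row is an error in this model. */
static void demo_acquire(demo_state_t *s)
{
    assert(!s->acquired && !demo_hw_busy);
    s->acquired = true;
    demo_hw_busy = true;
}

static void demo_release(demo_state_t *s)
{
    assert(s->acquired);
    s->acquired = false;
    demo_hw_busy = false;
}

/* Initialization implicitly acquires the state. */
static void demo_init(demo_state_t *s)
{
    s->acquired = false;
    demo_acquire(s);
}

/* Freeing implicitly releases the state if it is still held. */
static void demo_free(demo_state_t *s)
{
    if (s->acquired)
        demo_release(s);
}

/* Shape of an incremental update: hold the hardware only while working. */
static void demo_incremental_step(demo_state_t *s)
{
    demo_acquire(s);
    /* ... absorb or squeeze data here ... */
    demo_release(s);
}
```

In this model a second state could not be initialized while the first one still holds the hardware, which is why an incremental function releases the state before returning to the caller.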
Incremental APIs in the library release and acquire the state automatically so you don't have to do anything if you are using the library's modes.
Sometimes you may want to copy the entire state of the permutation to a new object. The ISAP mode in the library makes use of this. The ISAP key state is precomputed ahead of time. When a message is encrypted, the per-message state is initialized by copying the precomputed state.
It is important that the "precomputed" state be released. The copy process will create a brand new state in "copy" and acquire it.
The current backends store the 320 bits of the ASCON state directly in the ascon_state_t structure. But this may not be the case in the future. Accelerated backends may need more memory than 320 bits, or need handles to access secure hardware modules. Thus, some versions of ascon_init() may allocate memory from the heap and store a pointer into the ascon_state_t structure. The memory is freed when ascon_free() is called.
It is very important that the contents of ascon_state_t be treated as opaque. You may be tempted to directly XOR your data with the state, but it won't work. Even with the current backends, the 320 bits are not guaranteed to be stored in the canonical ASCON bit order.
For efficiency reasons, the backend is allowed to store the bits of the state in any order it wishes. Some use the canonical order (e.g. avr5), others use little-endian 64-bit words (e.g. x86-64 and armv8a), and still others use a 32-bit bit-sliced representation where the even and odd bits of the state are split into separate words (e.g. armv6, armv7m, i386, m68k, etc).
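To show why XOR'ing bytes directly into the structure cannot work on a bit-sliced backend, the sketch below splits a 64-bit word into its even-numbered and odd-numbered bits and joins them again. The function names are hypothetical, and a real backend may order the halves or the bits differently; the loops favour clarity over speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit word: bit 2*i goes to bit i of *even, and bit 2*i+1
 * goes to bit i of *odd. */
static void demo_slice(uint64_t w, uint32_t *even, uint32_t *odd)
{
    uint32_t e = 0, o = 0;
    assert(even != NULL && odd != NULL);
    for (unsigned i = 0; i < 32; ++i) {
        e |= (uint32_t)((w >> (2 * i)) & 1) << i;
        o |= (uint32_t)((w >> (2 * i + 1)) & 1) << i;
    }
    *even = e;
    *odd = o;
}

/* Inverse of demo_slice(): interleave the two halves back into one word. */
static uint64_t demo_unslice(uint32_t even, uint32_t odd)
{
    uint64_t w = 0;
    for (unsigned i = 0; i < 32; ++i) {
        w |= (uint64_t)((even >> i) & 1) << (2 * i);
        w |= (uint64_t)((odd >> i) & 1) << (2 * i + 1);
    }
    return w;
}
```

A byte of input therefore lands in two different 32-bit words after slicing, so only the "add", "overwrite", and "extract" functions know where each bit belongs.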
There is no method outside of the library to detect which bit order is being used. Applications should use the "add", "overwrite", and "extract" functions in the SnP API to manipulate the permutation state. This design choice simplifies the public permutation API and the applications that use it.
The modes that are implemented inside the library are aware of which backend is in use and can optimize "add", "overwrite", and "extract" operations accordingly. This is why it is usually better to use the library's modes than invent your own.
Use ascon_extract_bytes() with a 40-byte buffer if you want to extract the entire ASCON state in the canonical bit order. And use ascon_overwrite_bytes() to populate the entire ASCON state from a 40-byte buffer in the canonical bit order. The following example permutes a test vector with 6 rounds of the permutation:
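A self-contained sketch of this flow is shown here without the library, so that the canonical byte order is visible. It assumes that the canonical order is big-endian 64-bit words, as in the ASCON v1.2 specification; the demo_* names are hypothetical and the input pattern is an arbitrary example, not an official test vector. With the library, demo_load() corresponds to ascon_overwrite_bytes() and demo_store() to ascon_extract_bytes(), each with a 40-byte buffer.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t demo_ror(uint64_t v, unsigned n) { return (v >> n) | (v << (64 - n)); }

/* Portable sketch of the permutation: runs the last `rounds` of 12 rounds. */
static void demo_permute(uint64_t x[5], unsigned rounds)
{
    assert(rounds >= 1 && rounds <= 12);
    for (unsigned r = 12 - rounds; r < 12; ++r) {
        x[2] ^= ((0x0FULL - r) << 4) | r;              /* round constant */
        x[0] ^= x[4]; x[4] ^= x[3]; x[2] ^= x[1];      /* substitution layer */
        uint64_t t0 = ~x[0] & x[1], t1 = ~x[1] & x[2], t2 = ~x[2] & x[3];
        uint64_t t3 = ~x[3] & x[4], t4 = ~x[4] & x[0];
        x[0] ^= t1; x[1] ^= t2; x[2] ^= t3; x[3] ^= t4; x[4] ^= t0;
        x[1] ^= x[0]; x[0] ^= x[4]; x[3] ^= x[2]; x[2] = ~x[2];
        x[0] ^= demo_ror(x[0], 19) ^ demo_ror(x[0], 28); /* linear layer */
        x[1] ^= demo_ror(x[1], 61) ^ demo_ror(x[1], 39);
        x[2] ^= demo_ror(x[2], 1) ^ demo_ror(x[2], 6);
        x[3] ^= demo_ror(x[3], 10) ^ demo_ror(x[3], 17);
        x[4] ^= demo_ror(x[4], 7) ^ demo_ror(x[4], 41);
    }
}

/* Populate the five state words from 40 bytes in canonical order. */
static void demo_load(uint64_t x[5], const uint8_t in[40])
{
    for (unsigned w = 0; w < 5; ++w) {
        x[w] = 0;
        for (unsigned i = 0; i < 8; ++i)
            x[w] = (x[w] << 8) | in[w * 8 + i];
    }
}

/* Extract the whole state as 40 bytes in canonical order. */
static void demo_store(uint8_t out[40], const uint64_t x[5])
{
    for (unsigned i = 0; i < 40; ++i)
        out[i] = (uint8_t)(x[i / 8] >> (56 - 8 * (i % 8)));
}

/* Permute a 40-byte buffer in place with 6 rounds. */
static void demo_permute6_bytes(uint8_t buf[40])
{
    uint64_t x[5];
    demo_load(x, buf);
    demo_permute(x, 6);
    demo_store(buf, x);
}
```

Comparing the 40 output bytes against a published test vector is a quick way to confirm that a new backend agrees with the canonical bit order.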