Lightweight Cryptography Primitives

In early 2020 as I worked on this library I found some minor bugs in the second round reference implementations of some NIST algorithms. The authors were notified so that the problems could be corrected in the official reference implementations.
This page documents the problems that I found, and the fixes, for the benefit of other implementors. This library implements the fixed version of the algorithms.
In the proc_pt and proc_ct code, the first step is to XOR a domain separation value of 0x20 into the rolling Z state:
That will XOR the value into the first byte of the Z state. All other places in the code where domain separation is done XOR's the value into the last byte of the Z state. The above code should instead be:
This bug affects all four versions of COMET.
There was also a problem with carry propagation in the SPECK64128 implementation which caused the block cipher to generate different outputs from the standard version of SPECK64128. In the encryption phase, the following code appears:
The line that sets the next carry gives the wrong result if the incoming carry value was 1 and the bytes being added were both 0xFF. The following is the fixed version of "set next carry":
The specification says that if both the associated data and the plaintext are zerolength, then the algorithm should encrypt the nonce with tweak = 8 and then return immediately. The reference code was missing a return statement from the mac() function:
The effect is to completely ignore the conditional code and to start every MAC computation by encrypting the nonce with tweak = 1.
As part of the block operation, HYENA XOR's a 64bit delta value into the upper half of the 128bit feedback state:
That 15 should be a 16. There are two places in the reference code where this problem occurs, in Feedback_TXT_Enc() and Feedback_TXT_Dec().
The reference implementation of the PHOTON256 permutation in ORANGE has some byte order issues in store32() and load32(). The store32() function uses littleendian byte order 0 1 2 3, but the byte order of load32() is mixed up: 2 1 0 3 instead of 0 1 2 3.
This asymmetry means that the reference implementation does not implement PHOTON256 correctly. Correcting the byte order makes the ORANGE implementation of the permutation give the same results as the PHOTONBeetle implementation of the permutation.
In the specification of the underlying block cipher, the key schedule "MixAndRotateRows" step contains the following requirement:
The second row R1, third row R2, and fourth row R3 are leftrotated of 8, 15, 18 positions.
The reference implementation has a macro called "left_rotate" to perform this step but it actually performs a rightrotation by n bits instead of a leftrotation:
It is debatable as to whether this is a bug or simply a consequence of the bit ordering (left most cells of the row positioned in the least significant bits). From a security perspective, the direction of rotation probably doesn't matter so this issue isn't serious.
The test vectors in the specification were also clearly generated with a rightrotating implementation of the key schedule. This library implements the "bug" asis.
Section 2.2.6 Finalization of the SPIX specification says that the final tag should be extracted from bytes 7, ..., 15, 24, ..., 31 of the finalized state. That 7 should be an 8 instead. The reference implementation is correct.