Prøst was a contestant in the CAESAR competition for authenticated encryption. I optimised Prøst for the ARM11 microprocessor architecture. While searching for a provably minimal program for one of the sub-operations of Prøst’s permute function, I found a new approach to implementing the MixSlices sub-operation. This new implementation uses 33% fewer arithmetic operations than the original version. Combining this result with an assembly implementation of Prøst and further micro-optimisations yielded a performance gain of 28% to 48%.
See the project page for more information.