Armv6-M CPUs (e.g. Arm Cortex-M0+) have a 32-bit multiply instruction, but it returns only the low 32 bits of the result; a software routine must be used to compute the high 32 bits. The routine that comes with the C compiler is usually poor (e.g. the one from GCC costs 61 cycles and is not constant-time). I made a better one: constant-time, 24 cycles for a full 64x64->64 multiplication, 20 cycles for the 32x32->64 variants.
https://github.com/pornin/armv6m-longmul
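To illustrate what such a software routine has to do, here is a minimal portable C sketch of the schoolbook approach: split each operand into 16-bit halves so that every partial product fits in the 32 bits the hardware returns, then recombine with explicit carries. This is not the assembly routine from the repository above; the function name `mul32x32_64_sketch` is hypothetical, and the cycle counts quoted in the post do not apply to it.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only; not taken from the
 * linked repository. Computes the full 64-bit product of two 32-bit
 * values using only 32x32->32 multiplications. */
static uint64_t mul32x32_64_sketch(uint32_t a, uint32_t b)
{
    /* Split both operands into 16-bit halves. */
    uint32_t a0 = a & 0xFFFF, a1 = a >> 16;
    uint32_t b0 = b & 0xFFFF, b1 = b >> 16;

    /* Each partial product is at most 0xFFFE0001, so it fits in 32 bits. */
    uint32_t p00 = a0 * b0;
    uint32_t p01 = a0 * b1;
    uint32_t p10 = a1 * b0;
    uint32_t p11 = a1 * b1;

    /* The sum of the two middle products can overflow 32 bits;
     * capture the lost bit as an explicit carry. */
    uint32_t mid = p01 + p10;
    uint32_t mid_carry = (mid < p01);

    /* Low word: p00 plus the low half of mid, shifted into place. */
    uint32_t lo = p00 + (mid << 16);
    uint32_t lo_carry = (lo < p00);

    /* High word: p11, the high half of mid, and both carries. */
    uint32_t hi = p11 + (mid >> 16) + (mid_carry << 16) + lo_carry;

    return ((uint64_t)hi << 32) | lo;
}
```

The carry comparisons are written without branches, but whether the compiled code is actually constant-time depends on the compiler and its options, so treat this only as a description of the arithmetic.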
pornin@infosec.exchange