[2844] in cryptography@c2.net mail archive
RE: "damn the bitmaps..."
daemon@ATHENA.MIT.EDU (Colin Plumb)
Wed Jun 24 20:18:08 1998
Date: Wed, 24 Jun 1998 18:15:30 -0600 (MDT)
From: Colin Plumb <colin@nyx.net>
To: cryptography@c2.net, jya@pipeline.com
As for speed, I'm currently getting 1.36 us/byte (encrypting the same block
over and over to make 10 MB) on a (non-MMX) Pentium 133 with C code.
You can *almost* do a good assembly implementation using the eight
byte-wide registers on an x86 (al, ah, bl, bh, cl, ch, dl, dh) to hold the
8-byte block, leaving si, di, and bp available for indexing, pointers,
and so on, but you need a 9th byte to hold the g2 ^ *key++ part of
g1 ^= F[g2 ^ *key++] in the inner loop.
Perhaps reworking the key schedule to move the location of the key XOR
would get around that.
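
A minimal C sketch of that inner step, to show where the extra byte comes
from. The alternating g1/g2 pattern over four key bytes follows the
published G permutation as I understand it; the table contents and the
names init_placeholder_F, g_rounds, and t are illustrative assumptions, and
the table is a stand-in, NOT the real F-table.

```c
#include <stdint.h>

/* Placeholder byte substitution, NOT the real Skipjack F-table.
   An odd multiplier makes it a bijection on 0..255. */
static uint8_t F[256];

static void init_placeholder_F(void)
{
    for (int i = 0; i < 256; i++)
        F[i] = (uint8_t)(i * 167 + 13);
}

/* Four steps over one 16-bit word held as two bytes, g1 (high) and
   g2 (low). key points at the 4 key bytes consumed by this call. */
static void g_rounds(uint8_t *g1, uint8_t *g2, const uint8_t *key)
{
    uint8_t t;  /* the "9th byte": holds the table index g ^ *key++ */

    t = (uint8_t)(*g2 ^ *key++);  *g1 ^= F[t];
    t = (uint8_t)(*g1 ^ *key++);  *g2 ^= F[t];
    t = (uint8_t)(*g2 ^ *key++);  *g1 ^= F[t];
    t = (uint8_t)(*g1 ^ *key++);  *g2 ^= F[t];
}
```

With all eight block bytes pinned in al..dh, t has nowhere to live except
memory or one of the index registers.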
The algorithm is *almost* self-inverse, with just a reversed key schedule.
I wonder how the hardware does the switch?
It seems easiest to me to swap the input bytes around (to achieve the effect
of swapping the halves in the G operation) and then just change which byte
the counter gets added to.
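
The byte-swap idea can be checked in a few lines of C. This is only a
sketch of the G part, not of the full cipher or of whatever the hardware
does: the table is a placeholder (NOT the real F-table), and the names
init_placeholder_F, swap_halves, and G_inverse are made up for the
illustration. The property holds for any table, since each step is a
plain XOR.

```c
#include <stdint.h>

static uint8_t F[256];  /* placeholder table, NOT the real Skipjack F */

static void init_placeholder_F(void)
{
    for (int i = 0; i < 256; i++)
        F[i] = (uint8_t)(i * 167 + 13);
}

/* Forward G on a 16-bit word: g1 is the high byte, g2 the low byte. */
static uint16_t G(uint16_t w, const uint8_t key[4])
{
    uint8_t g1 = (uint8_t)(w >> 8), g2 = (uint8_t)w;

    g1 ^= F[g2 ^ key[0]];
    g2 ^= F[g1 ^ key[1]];
    g1 ^= F[g2 ^ key[2]];
    g2 ^= F[g1 ^ key[3]];
    return (uint16_t)((g1 << 8) | g2);
}

static uint16_t swap_halves(uint16_t w)
{
    return (uint16_t)((w << 8) | (w >> 8));
}

/* Inverse of G built only from G: swap the bytes, run the same four
   steps with the key bytes in reverse order, then swap back. */
static uint16_t G_inverse(uint16_t w, const uint8_t key[4])
{
    const uint8_t rev[4] = { key[3], key[2], key[1], key[0] };
    return swap_halves(G(swap_halves(w), rev));
}
```

Calling init_placeholder_F() once and then checking
G_inverse(G(w, key), key) == w for every 16-bit w confirms the round trip.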
--
-Colin