[2426] in cryptography@c2.net mail archive

home help back first fref pref prev next nref lref last post

Biham's vectorized DES

daemon@ATHENA.MIT.EDU (Phil Karn)
Tue Mar 31 14:32:51 1998

Date: Tue, 31 Mar 1998 11:17:43 -0800 (PST)
From: Phil Karn <karn@maggie.ka9q.AMPR.ORG>
To: cryptography@c2.net
Cc: karn@qualcomm.com
Reply-To: karn@qualcomm.com

Last night yet I found another fascinating Biham paper on his web page:

"A Fast New DES Implementation in Software"

http://www.cs.technion.ac.il/users/biham/cgi-binw/tr-get.cgi/1997/CS/CS0891.ps.gz

Eli took the DES and expressed it as a sequence of basic logic
operations: AND, OR, XOR and NOT. This required expressing the S-boxes
as logic gates, which wasn't too hard because the DES was originally
designed in the 1970s for just this kind of implementation.  Then
using the Alpha as if it were a SIMD vector processor with 64 1-bit
processors, he encrypts 64 blocks in parallel.

The extra cost of doing the S-boxes as logic steps instead of a table
lookup was more than offset by the 64-ary parallelism. Also,
permutations and expansions become completely free -- they're just
register renamings.  The overall speed per encryption was about 5x
that of his own "conventional" DES implementation tuned for 64-bit
registers.

While this approach has its limits (it can't be used for a single CBC
encryption stream because of the cipher chaining) it can be used for
CBC decryption. It will work for ECB. And it will work very for key
searching. The keys don't have to be the same for each of the 64
encryptions, and the known plaintext-ciphertext pairs can be converted
just once into the non-standard internal representation.

When I read this paper I immediately realized that this is the way to
take advantage of the Intel MMX instruction set. MMX provides eight
64-bit registers, and there are 64-bit AND, OR, XOR and NOT
instructions.  Even without S-boxes there will be memory references
(because there aren't 64 MMX registers) but they should all be to the
cache.

Has anybody looked at this? When I recently asked about the utility
of MMX for crypto, nobody mentioned this trick so I figured nobody
had seen Eli's paper. Even though he did it on the DEC Alpha, the
application to MMX is obvious and immediate.

Phil




home help back first fref pref prev next nref lref last post