[457] in cryptography@c2.net mail archive
How bad is this?
daemon@ATHENA.MIT.EDU (Colin Plumb)
Tue Apr  1 23:52:03 1997
Date: Tue, 1 Apr 97 21:33:02 MST
From: colin@nyx.net (Colin Plumb)
To: cryptography@c2.net
I've been trying to come up with a very fast, and not necessarily
very secure hash function for TCP initial sequence number selection.
(That means fast when *not* in an inner loop and doesn't thrash the cache.)
I couldn't find anything off-the-shelf that was a lot faster than MD4,
so I entered the dangerous waters of designing my own.  (It's a good
thing that this isn't a high-security application!)
Does anyone see how the following is obviously lousy?
(Does anyone know something even faster?  The idea is to have
a 32-bit random function of a 96-bit value, which is often
achieved using a hash of the 96-bit value and some secret
key material.)
-- 
	-Colin
#include <string.h>
#include <assert.h>
/*
 * A do-it-yourself hash, cobbled together out of
 * bits from various sources.  Speed is a more important
 * criterion than security.
 */
typedef unsigned word32;
#define ROTL(x,s) (((x)<<(s)) + ((x)>>(32-(s))))
/* Some parameters of the key-scheduling pass */
#define CONST	0x9e3779b9	/* Golden ratio */
#define POLY	0xEDB88320	/* CRC-32 polynomial */
/*
 * A hash function is a cipher like this embedded in a certain kind of
 * feedback.  This is just the cipher.
 *
 * This is a wierd thing of my own creation including aspects of
 * SHA, TGFSRs, RC2 and Bob Jenkins' LOOKUP2.
 * It's hardly massively secure, but it's plenty good enough for this
 * application.
 * 
 * It has a 9-word key size and a 3-word block.
 */
void
HashTransform(word32 xkey[27], word32 state[3])
{
	register word32 a, b, c;
	unsigned i;
	static word32 const twist_table[8] = {
		 CONST,
		 CONST ^                      (POLY >> 2),
		 CONST ^        (POLY >> 1),
		 CONST ^        (POLY >> 1) ^ (POLY >> 2),
		 CONST ^ POLY,
		 CONST ^ POLY ^               (POLY >> 2),
		 CONST ^ POLY ^ (POLY >> 1),
		 CONST ^ POLY ^ (POLY >> 1) ^ (POLY >> 2)
	};
	/*
	 * Perform key scheduling pass - TGFSR, inspired by SHA.1
	 * Well, to get better mixing faster we twist by three bits at a time.
	 * The "outer" polynomial is x^8 + x^7 + 1
	 * (Can this be improved?)
	 */
	a = xkey[7];
	b = xkey[8];
	for (i = 9; i < 27; i += 2) {
		a ^= xkey[i];
		xkey[i+ 9] = a = (a >> 3) ^ twist_table[a & 7];
		b ^= xkey[i+1];
		xkey[i+10] = b = (b >> 3) ^ twist_table[b & 7];
	}
	/* Now do the actual hashing */
	a = state[0];
	b = state[1];
	c = state[2];
	/* This is basically Bob Jenkins' LOOKUP2 operation */
	a -= b; a -= c; a ^= (c>>13) ^ xkey[ 0];
	b -= c; b -= a; b ^= (a<<8)  ^ xkey[ 1];
	c -= a; c -= b; c ^= (b>>13) ^ xkey[ 2];
	a -= b; a -= c; a ^= (c>>12) ^ xkey[ 3];
	b -= c; b -= a; b ^= (a<<16) ^ xkey[ 4];
	c -= a; c -= b; c ^= (b>>5)  ^ xkey[ 5];
	a -= b; a -= c; a ^= (c>>3)  ^ xkey[ 6];
	b -= c; b -= a; b ^= (a<<10) ^ xkey[ 7];
	c -= a; c -= b; c ^= (b>>15) ^ xkey[ 8];
	/* This is stolen from RC2. */
	a += xkey[c & 15];
	b += xkey[a & 15];
	c += xkey[b & 15];
	a -= b; a -= c; a ^= (c>>13) ^ xkey[ 9];
	b -= c; b -= a; b ^= (a<<8)  ^ xkey[10];
	c -= a; c -= b; c ^= (b>>13) ^ xkey[11];
	a -= b; a -= c; a ^= (c>>12) ^ xkey[12];
	b -= c; b -= a; b ^= (a<<16) ^ xkey[13];
	c -= a; c -= b; c ^= (b>>5)  ^ xkey[14];
	a -= b; a -= c; a ^= (c>>3)  ^ xkey[15];
	b -= c; b -= a; b ^= (a<<10) ^ xkey[16];
	c -= a; c -= b; c ^= (b>>15) ^ xkey[17];
	a += xkey[(c & 15) + 11];
	b += xkey[(a & 15) + 11];
	c += xkey[(b & 15) + 11];
	a -= b; a -= c; a ^= (c>>13) ^ xkey[18];
	b -= c; b -= a; b ^= (a<<8)  ^ xkey[19];
	c -= a; c -= b; c ^= (b>>13) ^ xkey[20];
	a -= b; a -= c; a ^= (c>>12) ^ xkey[21];
	b -= c; b -= a; b ^= (a<<16) ^ xkey[22];
	c -= a; c -= b; c ^= (b>>5)  ^ xkey[23];
	a -= b; a -= c; a ^= (c>>3)  ^ xkey[24];
	b -= c; b -= a; b ^= (a<<10) ^ xkey[25];
	c -= a; c -= b; c ^= (b>>15) ^ xkey[26];
	state[0] += a;
	state[1] += b;
	state[2] += c;
}
#include <stdio.h>
#include <time.h>
int
main(void)
{
	int i, j;
	clock_t ticks;
	word32 xkey[27];
	for (j = 0; j < 10; j++) {
		ticks = clock();
		for (i = 0; i < 100000; i++) {
			HashTransform(xkey, xkey);
		}
		ticks = clock() - ticks;
		printf("%lu ticks\n", ticks);
	}
	return 0;
}