[2359] in cryptography@c2.net mail archive

Re: comparing CPUs running blowfish [Re: fastest blowfish.asm?]

daemon@ATHENA.MIT.EDU (Adam Back)
Wed Mar 25 10:06:33 1998

Date: Wed, 25 Mar 1998 11:46:30 GMT
From: Adam Back <aba@dcs.ex.ac.uk>
To: eay@cryptsoft.com
CC: peterg@kcbbs.gen.nz, coderpunks@toad.com, cryptography@c2.net
In-reply-to: <Pine.GSO.3.96.980325150544.2087V-100000@pandora.cryptsoft.com>
	(message from Eric Young on Wed, 25 Mar 1998 15:41:27 +1000 (EST))


Eric Young <eay@cryptsoft.com> writes:
> On Wed, 25 Mar 1998, Adam Back wrote:
> > This could be read as a story about the vagaries of attempting to tune
> > assembler language on the various pentiums and clones...
> 
> :-), you think that is bad, try playing with the bf_opts.c program.
> It tweaks the different build options available for the inner loop in the C
> code.  For my current version (0.8.2?), on a pentium 133, NT,
> 
> options    BF ecb/s
> <nothing>    243567.89 100.0%
> ptr          220833.43  90.7%
> ptr2         120301.05  49.4%
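
For reference, the percentage column in the table above is each option's throughput divided by the `<nothing>` baseline. A minimal sketch of that arithmetic follows; the function name `rel_percent` is hypothetical and is not part of bf_opts.c.

```c
#include <assert.h>

/* Hypothetical helper, not taken from bf_opts.c.
 * Returns the speed of one build option as a percentage of the
 * baseline.  Both arguments are throughput figures in the same
 * unit (here, BF ecb/s). */
double rel_percent(double rate, double baseline)
{
    assert(baseline > 0.0);  /* a zero baseline makes the ratio meaningless */
    return 100.0 * rate / baseline;
}
```

With the quoted figures, `rel_percent(220833.43, 243567.89)` comes out at about 90.7 and `rel_percent(120301.05, 243567.89)` at about 49.4, matching the table.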

(was that a non MMX p133?)

!! I used ptr2 because it didn't seem to make much difference with the
AMD K6, and because this comment in the Makefile of 0.8.2b is wrong!:

#OPTS= -DBF_PTR2 # use for pentium

as is the forced auto-selection of BF_PTR2 in the bf_locl.h file.
(Yes, I had noticed that -- it spews redefinition warnings when
compiling with the script I posted, and I was trying to turn BF_PTR2
on, not off.)

> I strongly suspect I have the default C options for gcc-x86 stuffed up in
> the 0.8.2 release.

Yes you do: you have left out "-m486 -DCPU=pentium", though this
seemed to make little to no difference with -DBF_PTR2 on the AMD K6.
This discovery was the reason for my explicit compile script, which
forces the options to be the same for both distributions.  I
initially thought the slower 082b code was due to the difference in
compile options.  It is not, as I am compiling with -DBF_PTR2 -m486
-DCPU=pentium -fomit-frame-pointer -O3 for all.  So my statement on
the apparent slowness of the 082b C code holds, but it applies to
C code compiled with BF_PTR2: there 082b is a lot slower than 072m
compiled with the same options.  That is uninteresting, though,
because BF_PTR2 is the worst option.

> > Timings follow for an AMD K6 MMX, AMD K5 and Intel MMX, all clocked
> > at 166 MHz.
> 
> I can do pentium 100/133 and ppro 200.  What program are you using
> for timing?  For the fast ciphers, the byte/pack/unpack overhead can
> cost a lot, so the specific function being used matters quite a bit.
> I assume the speed program?

Yes bfspeed.  I will retry without BF_PTR2.
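
To illustrate the byte pack/unpack overhead mentioned above: a byte-oriented ECB call has to convert each 8-byte block into two 32-bit words before the cipher rounds and back again afterwards, whereas a word-oriented call skips both steps. The following is only a sketch under stated assumptions. The helper names `bytes_to_words` and `words_to_bytes` are hypothetical, big-endian word order is assumed, and this is not the library's actual macro code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: load 8 bytes as two 32-bit words,
 * assuming big-endian byte order within each word. */
void bytes_to_words(const unsigned char in[8], uint32_t out[2])
{
    assert(in != NULL && out != NULL);
    for (int i = 0; i < 2; i++) {
        const unsigned char *p = in + 4 * i;
        out[i] = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    }
}

/* Hypothetical helper: store two 32-bit words back as 8 bytes,
 * the inverse of bytes_to_words. */
void words_to_bytes(const uint32_t in[2], unsigned char out[8])
{
    assert(in != NULL && out != NULL);
    for (int i = 0; i < 2; i++) {
        unsigned char *p = out + 4 * i;
        p[0] = (unsigned char)((in[i] >> 24) & 0xFF);
        p[1] = (unsigned char)((in[i] >> 16) & 0xFF);
        p[2] = (unsigned char)((in[i] >> 8) & 0xFF);
        p[3] = (unsigned char)(in[i] & 0xFF);
    }
}
```

For a cipher whose rounds take only a few cycles per byte, these shifts and stores are a noticeable fraction of the per-block cost, so timings of a byte-oriented function and a word-oriented function are not directly comparable.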

Adam
