It was thus said that the Great Michael B. Brutman once stated:
When you had a 4MHz machine with 640KB, each cycle and each KB of RAM
was precious. I spent an entire day tuning a loop that computes UDP
checksums on a 4.77MHz machine because my measurements showed that a
good deal of my time was in that loop. It was worth rewriting the
perfectly good C for loop into inline ASM to squeeze a few hundred
thousand cycles per 1K packet.
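
For anyone who hasn't looked at one, the sort of plain C loop being
described is probably close to the following. This is only a sketch of
the standard one's-complement Internet checksum (as in RFC 1071), not
Michael's actual code; the name inet_checksum is made up for
illustration, and a real UDP checksum would also cover the
pseudo-header, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement Internet checksum over a byte
 * buffer. Assumes the buffer is small enough (well under 128KB) that
 * the 32-bit accumulator cannot overflow before folding. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Add the data as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len  -= 2;
    }

    /* An odd trailing byte is treated as the high byte of a final word. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

On a 4.77MHz 8088 every one of those adds and shifts matters, which is
presumably why hand-written ASM (with add-with-carry) paid off.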
I remember my friend writing a maze-generating program (it drew a maze on the
320x200 graphics screen) in BASIC that took 20 minutes to run on a PCjr. He
then rewrote it in Turbo Pascal, cutting the time down to 5 minutes. I then
took the Pascal code and spent a few hours rewriting it in Assembly---cut
the time to about 30 seconds if I recall (although my random number
generator left something to be desired---about 10% of the time it would
break down something awful).
A few years later, I rewrote it yet again in C, for a 33MHz SGI
(effectively the same resolution---a 320x200 window) that ran in less than 5
seconds. If I drew the maze in memory then copied it, less than a second.
Time marches on.
A program I wrote on that same SGI took almost a year to run. A few
months ago the same program would have taken, I think, 2 days to run on a
quad-Pentium machine.
-spc (Still miss that SGI machine ... but I don't miss the hardware
headaches I had with it ... )