Running HPL on a modern x86 processor tends to disagree with you (though
on the recent 6+ core chips, memory wait can be significant if you're
pushing all the cores).
You work in a supercomputer environment, so you're running far more modern
hardware than most of the rest of the world. I'm talking about the
HT/Replay chips. Think 2.4GHz 32-bit Pentiums. Those are the boxes that
I see everywhere.
I would say most of the wait is still 'dumb' I/O to process the Windows
user interface.