On Feb 21, 2012, at 1:30 PM, Toby Thain wrote:
On 21/02/12 12:24 PM, Sean Conner wrote:
I did a test on another sub-section of our
codebase at work (this just
happened to have a decently written Makefile). First, a non-parallel make
(same system as above, 8-core SPARC system):
real    4m36.002s
user    4m3.075s
sys     0m29.713s
Now, a parallel make:
real    0m22.330s
user    7m54.787s
sys     0m42.245s
Something like 21 times faster.
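For reference, a sketch of how such a timing comparison is typically run with GNU make; the
clean target and the -j12 job count are assumptions here, since the exact invocation isn't shown:

    # Serial build, timed with the shell's time keyword:
    make clean
    time make

    # Parallel build; 12 jobs, roughly 1.5 per core on an 8-core box:
    make clean
    time make -j12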
Where do you get 21x from? I see a wall-clock ratio of about 12x (4m36.002s is 276.002s, and
276.002/22.330 is roughly 12.4). Still amazing; can you explain why it's superlinear?
It's superlinear because the build is constrained by I/O (reading the files off the disk) as
well as CPU (compiling them). Actually compiling a file (depending on the language, of course)
can take minuscule time compared to the overhead of reading it off the disk. I typically use a
number of jobs equal to 1.5 times the number of processors in the system; I have a 4-core i7
which masquerades as 8 cores due to hyperthreading (which Intel actually got right in the i7, as
opposed to the P4). I don't see a *lot* of difference between 9 and 12 jobs.
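A minimal sketch of that rule of thumb, assuming GNU make and the coreutils nproc command
(neither is named above):

    # Run ~1.5 jobs per logical CPU; shell integer arithmetic rounds down.
    JOBS=$(( $(nproc) * 3 / 2 ))
    make -j"$JOBS"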
If I were to guess why it would be superlinear in the general case, I would imagine
it's because the header files wind up in the file cache and stay there while you're
using them. I've never profiled it.
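One way that guess could be checked, sketched here assuming a Linux machine with root
access (/proc/sys/vm/drop_caches is Linux-specific):

    # Cold-cache build: flush the page cache so headers must come off disk.
    make clean
    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
    time make -j12

    # Warm-cache build: headers and sources are now resident in RAM.
    make clean
    time make -j12

If the file cache explains the superlinear speedup, the warm run should be markedly faster.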
- Dave