On 2024-06-10 10:18 a.m., Joshua Rice via cctalk wrote:
> On 10/06/2024 05:54, dwight via cctalk wrote:
>> No one is mentioning multiple processors on a single die, and cache
>> that is bigger than most systems of that time's complete RAM.
>> Clock speed was dealt with via clever register renaming, pipelining
>> and prediction.
>> Dwight
> Pipelining has always been a double-edged sword. Splitting the
> instruction cycle into smaller, faster chunks that can run
> simultaneously is a great idea, but as the latency of a single
> instruction grows, failed branch predictions and the subsequent
> pipeline flushes can truly bog down real-world IPS. This is
> ultimately what made the NetBurst architecture the dead end it became.
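To put rough numbers on that (the figures below are invented for
illustration, nothing like NetBurst's actual ones), a sketch of how a
deep pipe's flush penalty eats into throughput:

    #include <stdio.h>

    /* Toy model of pipelined throughput under branch mispredictions:
     * effective CPI = base CPI + branch_fraction * miss_rate * penalty.
     * All constants are illustrative guesses, not measured values. */
    int main(void)
    {
        double base_cpi = 1.0;         /* ideal: one instruction/cycle  */
        double branch_fraction = 0.20; /* ~1 in 5 instructions branches */
        double miss_rate = 0.10;       /* predictor misses 10% of them  */
        double penalty = 20.0;         /* cycles to refill a deep pipe  */

        double cpi = base_cpi + branch_fraction * miss_rate * penalty;
        printf("effective CPI %.2f, throughput %.0f%% of ideal\n",
               cpi, 100.0 * base_cpi / cpi);
        return 0;
    }

With those guesses you are already down to about 70% of the ideal
rate, and the deeper the pipe, the bigger the penalty term gets.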
The other gotcha with pipelining is that you have to have equal-size
chunks.
A 16-word register file seems to be the right size for a 16-bit ALU;
64 words for a 32-bit ALU; 256 words for a 64-bit ALU, as a guess.
You never see gate-level delays on a spec sheet.
Our pipeline is X gate delays of logic + N delays for a latch.
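Spelled out (my notation, with toy numbers), that gives a cycle time
something like this:

    #include <stdio.h>

    /* The slowest stage sets the clock:
     *   T = max(stage logic delays) + latch overhead
     * Stage figures are invented gate-delay counts, purely for show. */
    int main(void)
    {
        int stage[] = { 8, 10, 7, 9 }; /* logic delays per stage (X)    */
        int latch = 2;                 /* delays per pipeline latch (N) */
        int n = sizeof stage / sizeof stage[0];
        int worst = 0, total = 0;

        for (int i = 0; i < n; i++) {
            total += stage[i];
            if (stage[i] > worst)
                worst = stage[i];
        }

        /* Every stage pays for the slowest one, which is why the
         * chunks need to be equal: imbalance is pure wasted time. */
        printf("actual clock:   %d delays (worst %d + latch %d)\n",
               worst + latch, worst, latch);
        printf("balanced clock: %.1f delays (total %d/%d stages + latch)\n",
               (double)total / n + latch, total, n);
        return 0;
    }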
"How Fast Can Computers Add?", Scientific American, Vol. 219, No. 4
(October 1968), pp. 93-101.
I do not think that will change under MORE's law, LESS's law, or
BIG MONEY's law.
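If memory serves, that article is about exactly this: the time to add
grows with word width. A ripple-carry adder is linear in the word
length, while lookahead schemes get closer to logarithmic. A toy
comparison (the delay constants are schematic, not from the article):

    #include <stdio.h>
    #include <math.h>   /* link with -lm */

    /* Rough gate-delay depth of n-bit adders:
     * ripple carry grows with n, carry lookahead with log2(n). */
    int main(void)
    {
        for (int n = 16; n <= 64; n *= 2) {
            int ripple = 2 * n;                      /* ~2 delays/bit */
            int lookahead = 4 * (int)ceil(log2(n));  /* ~4/tree level */
            printf("%2d-bit add: ripple %3d delays, lookahead %2d\n",
                   n, ripple, lookahead);
        }
        return 0;
    }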
> DEC came across another issue with the PDP-11 vs the VAX. Although
> the pipelined architecture of the VAX was much faster than the
> PDP-11, the actual time for a single instruction cycle was much
> increased, which led customers requiring real-time operation to
> stick with the PDP-11, as it was much quicker in those operations.

Forget that; noise. PDP-11s were dirt cheap compared to the VAX.

> This, along with its large software back-catalog and established
> platform, led to the PDP-11 outliving its successor.
>
> Josh Rice

Now that makes more sense.