On Mon, 2025-02-17 at 09:17 -0500, Paul Koning via cctalk wrote:
Also multiple functional units, seriously interleaved memory, and a
bucket full of other tricks. The way loads and stores are requested
by the programmer naturally makes them background operations, and the
"stunt box" handles that background process.
I remember the "stunt box" also being called the "traffic cop."
The Denelcor HEP had asynchronous memory access. It had several
(sixteen, IIRC) functional units and hardware thread switching. When a
memory access occurred, the register file was saved (or maybe there
were more register files than hardware processors — my memory is foggy
here) and another thread was put into a functional unit. When the
memory access completed, the waiting thread's register file was put
back into the functional unit queue.
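
To make the switching idea concrete, here is a toy sketch in Python. It is not a model of the real HEP: the single issue slot, the fixed memory latency, the "alu"/"mem" operation names and the simulate() function are all assumptions made up for illustration. A thread that issues a memory access is parked until the access completes, and meanwhile the next ready thread gets the issue slot.

```python
from collections import deque

def simulate(threads, mem_latency):
    """Toy model (hypothetical, not the HEP design).

    threads: list of operation lists, each op is "alu" or "mem".
    One operation is issued per cycle from the ready queue.
    Returns (total_cycles, busy_cycles).
    """
    # Ready queue holds (thread id, index of next op) for runnable threads.
    ready = deque((tid, 0) for tid, ops in enumerate(threads) if ops)
    waiting = []  # (wake_cycle, thread id, index of next op)
    cycle = 0
    busy = 0
    while ready or waiting:
        # Threads whose memory access has completed rejoin the ready queue.
        still_waiting = []
        for wake, tid, pc in waiting:
            if wake <= cycle:
                ready.append((tid, pc))
            else:
                still_waiting.append((wake, tid, pc))
        waiting = still_waiting

        if ready:
            tid, pc = ready.popleft()
            op = threads[tid][pc]
            busy += 1
            pc += 1
            if pc < len(threads[tid]):
                if op == "mem":
                    # Park the thread until the memory access finishes.
                    waiting.append((cycle + mem_latency, tid, pc))
                else:
                    ready.append((tid, pc))
        cycle += 1
    return cycle, busy

# One thread stalls on its load; four threads hide the latency entirely.
print(simulate([["mem", "alu"]], 4))      # (5, 2)
print(simulate([["mem", "alu"]] * 4, 4))  # (8, 8)
```

With one thread the issue slot sits idle for three of five cycles. With four threads and a four-cycle latency, every cycle issues something, which is the whole point of switching threads on memory access.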
Arvind at MIT was a dataflow investigator. Greg Papadopoulos (later
Chief Technology Officer at Sun Microsystems) developed the Monsoon
tagged-token dataflow computer as his PhD project. Rishiyur Nikhil
described a RISC architecture that was augmented with asynchronous
memory access and automatic thread switching. Burton Smith and James
Rottsolk founded Tera Computer Company to develop a computer called the
Multithreaded Architecture or MTA, IIRC based on Nikhil's ideas. They bought
the ashes of Cray and promptly changed their name to Cray. But the MTA
had a fatal flaw that neither Tera nor the Cray engineers they absorbed
were able to resolve: It had a 100 MHz bottleneck. Even so, it was
faster than the "supercomputer" that IBM was offering at the time — but
only on a sort benchmark.