Michael B. Brutman wrote:
I know of stack machines in theory only, so I'll
take your word on it.
None of the hardware I've used in the last 20 years (Z80, 6502, 6800,
x86, PPC, etc.) would qualify as stack machines, and hence my lack of
faith. :-)
Stack-based machines can be very fast and, depending on their design,
much faster than IA32/IA64. The idea is to use the stack instead of
registers, or as registers, and to have a large stack. Most of the
stack lives in L1 cache, and is almost as fast as a large RISC register
file.
Since C/C++/Java and the like rely heavily on the stack, fast stack
access buys you a lot. The next closest thing is register windows, as
used by SPARC.
The reason for this is that if you look at compiled C, you have a
function calling another, which in turn calls another which may invoke
an interrupt to talk to the OS. At each layer, each function is popping
a bunch of parameters off the stack which were passed to it, doing some
work, then pushing a bunch of parameters onto the stack before calling
the next function. All those reads and writes to the stack are a huge
bottleneck.
Even worse, if you have a bunch of wrapper libraries, you'll find that
a function takes its parameters and passes them on to the functions
that it in turn calls! This means that you duplicate the data on the
stack several times!
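
To put a number on that duplication, here is a toy model in Python. It is
only a sketch: the call() helper and the function names are made up for
illustration, and it assumes every argument is passed on the stack, which
is not how any particular real ABI works.

```python
# Toy model: every call pushes its arguments onto one shared stack.
# All names are hypothetical; this does not model a real calling convention.

stack = []   # the simulated call stack (one entry per word)
writes = 0   # total words written to the stack
peak = 0     # deepest the stack ever got, in words

def call(func, *args):
    """Push the arguments, run func, then pop the arguments again."""
    global writes, peak
    stack.extend(args)
    writes += len(args)
    peak = max(peak, len(stack))
    try:
        return func(*args)
    finally:
        # Drop exactly the words this call pushed.
        del stack[len(stack) - len(args):]

def worker(a, b, c):
    return a + b + c

def wrapper_inner(a, b, c):
    # A thin wrapper: forwards its parameters unchanged.
    return call(worker, a, b, c)

def wrapper_outer(a, b, c):
    return call(wrapper_inner, a, b, c)

result = call(wrapper_outer, 1, 2, 3)
print(f"result={result} writes={writes} peak={peak}")
```

Three parameters forwarded through two wrappers end up on the stack three
times over, and the worker hasn't done anything yet.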
See:
http://www.csclub.uwaterloo.ca/media/Eric%20LaForest:%20Next%20Generation%2…
There's a large video presentation explaining some of these architectures.
The idea is that general-purpose registers don't name actual registers,
but rather slots on the stack: referencing r0 goes to the memory pointed
at by the stack pointer, r1 sits one word (32 or 64 bits) off the SP, r2
two words off, and so on.
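
A rough sketch of that register-to-stack mapping, in Python. The class
name, the 32-bit word size and the downward-growing stack are my own
assumptions for the example, not a description of any specific chip.

```python
# Sketch: "registers" that are really stack slots addressed off the SP.
# StackRegisters and WORD_BYTES are hypothetical names for illustration.

WORD_BYTES = 4  # assumption: 32-bit words; use 8 for a 64-bit machine

class StackRegisters:
    def __init__(self, size_words=64):
        self.memory = [0] * size_words
        self.sp = size_words  # word index; the stack grows downwards

    def push_frame(self, nregs):
        """Entering a function: reserve nregs fresh registers."""
        self.sp -= nregs

    def pop_frame(self, nregs):
        """Leaving a function: give the registers back."""
        self.sp += nregs

    def address_of(self, n):
        """Byte address that register rN currently refers to."""
        return (self.sp + n) * WORD_BYTES

    def read(self, n):
        return self.memory[self.sp + n]

    def write(self, n, value):
        self.memory[self.sp + n] = value

m = StackRegisters()
m.push_frame(4)        # caller gets r0..r3
m.write(0, 10)
m.write(1, 20)
m.push_frame(2)        # callee gets its own r0..r1
m.write(0, 99)
callee_view = m.read(2)  # the caller's r0 shows up as the callee's r2
m.pop_frame(2)
caller_r0 = m.read(0)    # back in the caller, r0 is untouched
```

Moving the SP renames every register at once, which is why a call costs
no copying in this scheme.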
If you have a CPU that speeds that whole stack song and dance up in any
way whatsoever, you have a big win. This is one reason why, when you
have a CPU intensive application, you want to go with a SPARC platform,
not with IA32/64. On SPARC you almost never have to push anything on
the stack as long as your functions take no more than six parameters.
SPARC splits up the register file into Global, In, Local, and Out
registers. When you call a function, the caller's Out registers are
instantly mapped (not copied) onto the callee's In registers. So
function calls are very, very fast.
(At some point you run out of register windows, on either overflow or
underflow, but that generates a trap which lets the OS save the
register windows to memory or restore them from it.)
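
Here is a much simplified Python model of the window mechanism, just to
show the overlap and the spill/refill. Everything in it is an assumption
made for the sketch: the WindowedRegs class, a file of only 4 windows, a
window pointer that counts upwards, and a "trap handler" that is just a
Python list. Real hardware and real OS handlers differ in the details.

```python
# Simplified register-window model. Each window has 8 ins, 8 locals and
# 8 outs; a window's outs are physically the next window's ins.

NWINDOWS = 4  # assumption: tiny, so overflow is easy to trigger

class WindowedRegs:
    def __init__(self):
        self.phys = [0] * (NWINDOWS * 16)  # circular physical register file
        self.cwp = 0        # current window pointer (here: call depth)
        self.resident = 1   # windows currently held in the register file
        self.spilled = []   # windows the "OS" saved to memory
        self.traps = 0      # overflow + underflow traps taken

    def _index(self, kind, n):
        base = (self.cwp % NWINDOWS) * 16
        offset = {"in": 0, "local": 8, "out": 16}[kind] + n
        return (base + offset) % len(self.phys)

    def get(self, kind, n):
        return self.phys[self._index(kind, n)]

    def set(self, kind, n, value):
        self.phys[self._index(kind, n)] = value

    def save(self):
        """Function call: slide to the next window."""
        if self.resident == NWINDOWS - 1:
            # Overflow trap: spill the oldest window's ins and locals.
            oldest = self.cwp - (self.resident - 1)
            base = (oldest % NWINDOWS) * 16
            self.spilled.append(self.phys[base:base + 16])
            self.resident -= 1
            self.traps += 1
        self.cwp += 1
        self.resident += 1

    def restore(self):
        """Function return: slide back to the caller's window."""
        if self.resident == 1:
            # Underflow trap: reload the caller's window from memory.
            base = ((self.cwp - 1) % NWINDOWS) * 16
            self.phys[base:base + 16] = self.spilled.pop()
            self.resident += 1
            self.traps += 1
        self.cwp -= 1
        self.resident -= 1

def nest(regs, depth):
    """Call depth levels deep; check each local survives spill and reload."""
    regs.set("local", 0, depth)
    if depth > 0:
        regs.save()
        nest(regs, depth - 1)
        regs.restore()
    return regs.get("local", 0) == depth

# Argument passing: the caller's out 0 is the callee's in 0, no copy.
r = WindowedRegs()
r.set("out", 0, 42)
r.save()
arg_seen = r.get("in", 0)
r.restore()

deep = WindowedRegs()
locals_ok = nest(deep, 6)
```

The shallow call never touches memory. The deep chain works too, but
only by taking a trap on every call and return past the point where the
windows run out, which is the cost mentioned above.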
see:
http://www.sics.se/~psm/sparcstack.html
https://www.cs.tcd.ie/Martin.Emms/Logic/Sparc/SparcNotes/node8.html
http://www.cs.clemson.edu/~mark/subroutines/sparc.html
In my own experience, the only machine I've ever come across with a
real stack-based architecture was the AT&T Eo, which used the Hobbit
processor.