Just a quick history of x86 implementation styles (from memory, so don't
take this very seriously):
8086: Intel's first pipeline, with separate Bus Interface (prefetch) and
Execution units
iAPX286: borrowed some ideas from iAPX432's protection model, but I
don't know any implementation details
386: traditional CISC pipeline
486: main "RISC" pipeline for the most popular instructions with
microcode support for the rest (some of the least popular instructions
were actually slower than on the 386)
Pentium: dual "RISC" pipelines, one of which had microcode support for
the rest of the instructions
Nx586 from NextGen: internal RISC instruction set with hardware
translation from x86
Pentium Pro: like Nx586 (internal instructions called "micro ops")
K6 from AMD: AMD bought NextGen and updated their technology
Pentium II and III: improved Pentium Pro
Pentium 4: the "NetBurst" architecture was optimized for the highest
possible clocks. The hardware translator from x86 to micro ops was
placed before the instruction cache instead of after it, which allowed
some interesting optimizations
Core, Core 2 and so on: though at first sight it seems like the x86
instructions are simply translated into RISC "micro ops", the internal
execution engine is actually a DataFlow machine. This is called out of
order (OOO) execution, and the reason it isn't obvious is that the bits
linking the instructions together are spread out among different
hardware structures rather than stored alongside the instructions
themselves, as in classical DataFlow architectures.
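To make the dataflow idea concrete, here is a toy sketch (my own
illustration, nothing like real hardware): each micro op fires as soon
as the values it depends on have been produced, regardless of program
order. The micro op names and registers are made up for the example.

```python
def schedule(uops):
    """Fire micro-ops as soon as their sources are ready (dataflow order).

    uops: list of (name, destination register, list of source registers).
    Returns a list of (cycle, name) pairs in execution order.
    """
    done = set()      # registers whose values have been produced
    cycle = 0
    order = []
    pending = list(uops)
    while pending:
        cycle += 1
        # A micro-op is ready when all of its source registers are done.
        ready = [u for u in pending if all(s in done for s in u[2])]
        for name, dst, _ in ready:
            order.append((cycle, name))
            done.add(dst)
        pending = [u for u in pending if u not in ready]
    return order

# Program order interleaves two independent dependency chains:
uops = [
    ("load A", "r1", []),      # long chain, step 1
    ("add",    "r2", ["r1"]),  # long chain, step 2
    ("load B", "r3", []),      # independent chain, step 1
    ("mul",    "r4", ["r3"]),  # independent chain, step 2
]

for cycle, name in schedule(uops):
    print(f"cycle {cycle}: {name}")
```

Note that "load B" fires in cycle 1 alongside "load A" even though
"add" precedes it in program order: the dependence bits, not the
program counter, decide when things execute.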
Having direct access to this internal OOO engine would not really help.
Being able to bypass it entirely could be interesting, since it takes a
lot of energy and transistors to do its job, and that is what the Mill
architecture is trying to do:
http://millcomputing.com/docs/
-- Jecel