"Walter F.J. Mueller" <W.F.J.Mueller at gsi.de> wrote:
Hi there,
Brad Parker just posted about his FPGA implementation of a PDP-11,
which boots so far RT-11, RSTS V4, BSD 2.9 and Unix V6. There was a
question how fast an FPGA solution might be compared to a PDP-11/93.
I've also implemented a PDP-11 on an FPGA. It is a full 11/70 with
split I&D, MMU and cache. No FPP so far. Available peripherals are so
far DL11, LP11, KW11L, PC11, and RK11. All I/O is channeled via a
'remote-register-interface' onto a single bi-directional byte stream
interface, so the FPGA board needs a backend PC with a server program
to handle the I/O requests.
Cool! Nice!
The design is FPGA proven, runs on Digilent S3 and NEXYS2 boards, the
former with 1 MB 10 ns SRAM, the latter with 16 MB 70 ns PSDRAM.
Resource consumption is
  S3 board      xc3s1000    2471 slices or 33%
  NEXYS2 board  xc3s1200e   2624 slices or 30%
The implementation was verified against many XXDP maindec's. There are
some open issues, especially some details of trap and double error
handling aren't correct yet. In practice this is of little importance;
the FPGA system happily boots and runs BSD 2.11, a system using 22-bit
addressing and split I&D space. {Note: you need patch 447 for 2.11BSD
to get FPP emulation and RK support working}
Any plans on the FPP? It would be really nice and useful to have.
As for traps and double errors, feel free to ask. I don't know if I have
all the answers, but I might be able to figure them out. Besides, I also
have access to one (or three) functional 11/70 machines.
On Performance: The design runs at 50 MHz. I've run parts of the
Byte Unix benchmark on the FPGA systems. Given that the FPP is only
emulated by the 2.11BSD kernel it only makes sense to look at the integer
benchmarks. The Dhrystone benchmark 'dhry2reg' gives about '11500 lps'
on both boards. For comparison see Michael Schneider's page
http://www.vaxcluster.de/mambo/bench2.php?mach=pdp11 which gives about
'830 lps' for an 11/53. There is little Dhrystone difference between
the two boards despite the very different memory access times. The
8 kB cache with 32 bit cache lines really helps on the NEXYS2 board.
The 11/53 is a really slow machine. Not that helpful to compare with.
But you seem to push a nice number anyway.
But 50MHz... The J11 in an 11/9x machine runs at 20 MHz, which would
suggest that you should only be able to push about 2.5 times the
performance, unless you do some more clever tricks.
(The 11/9x machine runs all memory as cache.)
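
To put the numbers from this exchange side by side, here is a small Python sketch. It only uses figures quoted above (11500 lps on the FPGA, 830 lps for the 11/53, 50 MHz versus 20 MHz). The function names are made up for illustration, and the last figure assumes Dhrystone throughput scales linearly with clock speed, which is a guess for the sake of the argument, not a measurement of any 11/9x.

```python
# Figures quoted in this thread.
FPGA_LPS = 11500      # dhry2reg on the FPGA 11/70, both boards
PDP1153_LPS = 830     # dhry2reg for an 11/53, from the cited benchmark page
FPGA_MHZ = 50         # FPGA design clock
J11_MHZ = 20          # J11 clock in an 11/9x machine


def speedup(fast_lps, slow_lps):
    """Ratio of two Dhrystone results (hypothetical helper)."""
    return fast_lps / slow_lps


def clock_scaled_lps(lps, from_mhz, to_mhz):
    """Scale a result linearly with clock speed.

    ASSUMPTION: equal work per clock cycle on both machines, which is
    unlikely to hold exactly between an 11/70 design and a J11.
    """
    return lps * to_mhz / from_mhz


print(f"FPGA vs 11/53:       {speedup(FPGA_LPS, PDP1153_LPS):.1f}x")
print(f"Clock ratio 50/20:   {FPGA_MHZ / J11_MHZ:.1f}x")
print(f"Implied 11/9x (lps): {clock_scaled_lps(FPGA_LPS, FPGA_MHZ, J11_MHZ):.0f}")
```

The first ratio says nothing about the 11/9x; it only shows how far the 11/53 is from the FPGA result. The last line is what the pure clock-scaling argument would imply, and a real measurement could land well away from it.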
I'm in the middle of homogenizing some internal interfaces and of some
code cleanup, also the backend handler needs a re-write in C++ (currently
perl). When that's done I'll make the whole package (VHDL sources, test
benches, backend) available on 'OpenCores'.
Finally a comment to Dave Mitton's remark
> > Now what would be really cool would be to make 4 CPUs and re-create
> > an 11/74 quad.
> >
http://www.miim.com/faq/hardware/multipro.html#castor
*Sigh* I wonder if anyone is ever going to be able to set Bruce Mitchell
right on his facts.
CASTOR didn't disappear. I talked with Dave Carroll about it not so many
years ago, and the machine was still around, although at that time with a
hardware problem causing it to be down.
(There are plenty of other small errors on Bruce Mitchell's pages as
well, but from my small dealings with him in the past, he doesn't seem to
be interested in listening.)
The reason why I picked an 11/70 and not a J11 as target is because my
goal is an 11/74. I've implemented the IIST already and tested against
the IIST Diagnostic I could find in XXDP (riiab0). A dual core will
fit into a single xc3s1200e of the NEXYS2 board. The work needed is
quite clear and doable (changes on cache, mmu, and cpu core for asrb).
However, I've no plans to implement the CIS, so it will always be a
subset of an 11/74. But for sure fun to do and run.
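
A rough sanity check of the dual-core claim, using only the slice counts and percentages given earlier. This is a sketch under a loud assumption: the second core is taken to cost as much as the whole current single-core design, peripherals included. The device capacity is back-computed from the rounded percentages, so it is approximate, and the function name is hypothetical.

```python
# (slices used, fraction of device) as quoted in the resource table.
BOARDS = {
    "S3 / xc3s1000": (2471, 0.33),
    "NEXYS2 / xc3s1200e": (2624, 0.30),
}


def scaled_usage(slices, fraction, cores=2):
    """Estimate slices and device fraction for `cores` copies of the design.

    ASSUMPTION: every core costs the same as the current complete design.
    The capacity is implied by the rounded percentage, not a datasheet value.
    """
    implied_capacity = slices / fraction
    needed = slices * cores
    return needed, needed / implied_capacity


for name, (slices, fraction) in BOARDS.items():
    needed, used = scaled_usage(slices, fraction)
    print(f"{name}: about {needed} slices, roughly {used:.0%} of the device")
```

Under that assumption two cores stay well below the full device on either board, which is at least consistent with the statement that a dual core fits into the xc3s1200e. Routing and timing at higher utilization are not covered by this estimate.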
You do know that the J11 is already designed for mP usage, except that
DEC's testing of that was even more secret than the 11/74?
The 11/74 definitely doesn't need CIS though. I don't think any prototype
11/74 even had it. It was planned for the next generation of the
machine, that never got built. Anyway, it was to be an option for the
CPU as far as I know. Just as FPP.
IIST is needed for RSX to be happy (the only OS that supports the
11/74), and you also need to implement parts of the memory bus behaviour
with interlocking. You can ignore the MK11 box CSRs, even though it will
look a little funny, but you do need separate DL11s for each CPU core,
along with the rest of the I/O bus, or else things will probably not
work. The 11/74 is a shared memory machine, but not shared I/O bus.
Johnny