To be clear:
 My PDP-8 currently does 50,000,000 additions/second, or 100,000,000
 operate or jump instructions/second.
 Well that does make it the fastest 8 on the planet. 
 
There's ONE problem: SIMH on a modern PC runs MUCH faster!
  I am lucky if I can get normal memory speeds (mid 1970's) with my
 CPLD design
Hm... Ok. When I turn on a real PDP-8, 1200ns is normal. But today...
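
For scale, a rough sketch of what the quoted rates mean per instruction, next to a 1200ns cycle. It assumes the figures are sustained rates and uses a single 1200ns cycle as the yardstick, which flatters the real hardware if an instruction there needs more than one cycle. The names are made up for illustration.

```python
# Time per instruction implied by the quoted rates, compared with one
# 1200 ns cycle on a real PDP-8 (assumption: one cycle as the yardstick).
ADDS_PER_SEC = 50_000_000        # from the post
OPERATES_PER_SEC = 100_000_000   # from the post
REAL_CYCLE_NS = 1200             # from the post


def ns_per_instruction(rate_per_sec):
    """Convert an instruction rate into nanoseconds per instruction."""
    return 1e9 / rate_per_sec


add_ns = ns_per_instruction(ADDS_PER_SEC)
operate_ns = ns_per_instruction(OPERATES_PER_SEC)

for name, ns in (("addition", add_ns), ("operate/jump", operate_ns)):
    ratio = REAL_CYCLE_NS / ns
    print(f"{name}: {ns:.0f} ns, about {ratio:.0f}x shorter than one {REAL_CYCLE_NS} ns cycle")
```
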
  but most of the problem is that I have to use slow I/O chips. I don't
 use pipelining with my design, so I figure I have about 4 gate delays
 for control signals, 4 gate delays for datapath logic and 4 x 2 gate
 delays for carry to ripple.
I have many more gate delays:
=========================================================================
 Timing constraint: Default period analysis for Clock 'clk_50mhz'
   Clock period: 20.448ns (frequency: 48.905MHz)
   Total number of paths / destination ports: 48839 / 1336
 -------------------------------------------------------------------------
 Delay:               10.224ns (Levels of Logic = 8)
   Source:            cpu/pdp8_registers_1/major_state_reg_3_1 (FF)
   Destination:       cpu/pdp8_registers_1/ac_reg_10 (FF)
   Source Clock:      clk_50mhz rising 2.0X
   Destination Clock: clk_50mhz rising 2.0X
   Data Path: cpu/pdp8_registers_1/major_state_reg_3_1 to cpu/pdp8_registers_1/ac_reg_10
                                 Gate     Net
     Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)
     ----------------------------------------  ------------
      FDCE:C->Q             2   0.720   1.072  cpu/pdp8_registers_1/major_state_reg_3_1 (cpu/pdp8_registers_1/major_state_reg_3_1)
      LUT3_L:I1->LO         1   0.551   0.126  tty/_xor00001_SW0 (N4280)
      LUT4:I3->O            7   0.551   1.134  tty/_xor00001 (tty/_xor0000)
      LUT4:I2->O           14   0.551   1.213  tty/_mux005311 (io_cmd_ac_or)
      LUT4:I3->O            1   0.551   0.000  cpu/ac_next<10>6_G (N4461)
      MUXF5:I1->O           2   0.360   0.903  cpu/ac_next<10>6 (cpu/ac_next<10>_map3256)
      LUT4:I3->O            1   0.551   0.000  cpu/ac_next<10>111_SW0_F (N4426)
      MUXF5:I0->O           1   0.360   0.827  cpu/ac_next<10>111_SW0 (N4272)
      LUT4:I3->O            1   0.551   0.000  cpu/ac_next<10>135 (cpu/ac_next<10>)
      FDCE:D                    0.203          cpu/pdp8_registers_1/ac_reg_10
     ----------------------------------------
     Total                     10.224ns (4.949ns logic, 5.275ns route)
                                        (48.4% logic, 51.6% route) 
These times are only the synthesis estimates! In reality, the design gets
different timings (currently it wants 17ns - but runs with 10 *g*)
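
As a cross-check on the report, here is a small sketch that adds up the gate and net delays of the listed path and recomputes the totals. The numbers are copied from the report above. Reading the "2.0X" as a doubled clock (so that the reported period is twice the path delay) is my assumption, not something the report states outright.

```python
# (cell, gate_delay_ns, net_delay_ns) for each element of the reported path.
PATH = [
    ("FDCE",   0.720, 1.072),  # source flip-flop, clock to Q
    ("LUT3_L", 0.551, 0.126),
    ("LUT4",   0.551, 1.134),
    ("LUT4",   0.551, 1.213),
    ("LUT4",   0.551, 0.000),
    ("MUXF5",  0.360, 0.903),
    ("LUT4",   0.551, 0.000),
    ("MUXF5",  0.360, 0.827),
    ("LUT4",   0.551, 0.000),
    ("FDCE:D", 0.203, 0.000),  # destination flip-flop input
]

logic_ns = sum(gate for _, gate, _ in PATH)
route_ns = sum(net for _, _, net in PATH)
total_ns = logic_ns + route_ns

# Levels of logic: everything between the two flip-flops.
levels = sum(1 for cell, _, _ in PATH if not cell.startswith("FDCE"))

# Assumption: "2.0X" means the path runs on a doubled clock, so the
# period reported for clk_50mhz is twice the path delay.
period_ns = 2 * total_ns
freq_mhz = 1000.0 / period_ns

print(f"logic {logic_ns:.3f} ns ({100 * logic_ns / total_ns:.1f}%)")
print(f"route {route_ns:.3f} ns ({100 * route_ns / total_ns:.1f}%)")
print(f"total {total_ns:.3f} ns over {levels} levels of logic")
print(f"period {period_ns:.3f} ns, frequency {freq_mhz:.3f} MHz")
```

More than half of the delay is routing, so counting gate delays alone would underestimate this path by about a factor of two.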
  With the cheap chips I have (20 ns delay) I have no problem with
 creating a computer up to 2 MHz ... nice and slow
20ns per stage? Ok, I understand...
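
A minimal sketch of that budget, assuming the quoted gate delays simply add up in series at 20ns each and ignoring register setup, wiring and memory access (none of which are given in the post):

```python
# Gate-delay budget from the quoted figures; all delays assumed serial.
GATE_DELAY_NS = 20        # "20 ns delay" per chip
CONTROL_GATES = 4         # control signals
DATAPATH_GATES = 4        # datapath logic
CARRY_GATES = 4 * 2       # "4 x 2 gate delays for carry to ripple"

total_gates = CONTROL_GATES + DATAPATH_GATES + CARRY_GATES
critical_path_ns = total_gates * GATE_DELAY_NS
max_clock_mhz = 1000 / critical_path_ns

target_mhz = 2            # the clock rate mentioned in the post
target_period_ns = 1000 / target_mhz
margin_ns = target_period_ns - critical_path_ns

print(f"{total_gates} gate delays -> {critical_path_ns} ns critical path")
print(f"upper bound on clock: {max_clock_mhz} MHz")
print(f"margin at {target_mhz} MHz: {margin_ns} ns")
```

Under these assumptions the 2 MHz figure leaves some slack for whatever the simple sum leaves out.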
  The 20 bit cpu alas can't be a (fictional) single chip cpu since I
 can't fit it into a 48 pin DIP. I am two pins over. [1]
?? There are other packages :-) Xilinx Spartan-II, Spartan-3 and Spartan-3E
at least can be found in handy 144 pin packages. They *can* be soldered
manually. It's easier than everybody thinks. BGAs are quite impossible...
And all the bigger chips (>400k) are only available in BGA packages :-(
Philipp :-)
--
http://www.hachti.de