On May 21, 2020, at 3:56 PM, Jay Jaeger <cube1 at
charter.net> wrote:
On 5/21/2020 11:55 AM, Paul Koning wrote:
...
This sort of question is why I found starting with the simulator is helpful. In a
simulation you can specify delays directly. So for my 6600, I have the gate delay (5 ns)
and the wire delays (1.3 ns per foot, in the twisted pair, or 25 ns for coax cables
including tx/rx circuit). Actually, I only include wire delays for "long"
wires; the design clearly uses wires longer than needed in various places for delay
reasons, but my guess is that short wires are not time sensitive. That may be wrong; I
need to run it again without that assumption to see if it helps.
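
A minimal sketch of that delay model, using the figures quoted above (5 ns per gate, 1.3 ns per foot of twisted pair, 25 ns for a coax run including tx/rx). The function names and the 12 inch cutoff for what counts as a "long" wire are assumptions made for illustration; they are not taken from the actual simulator.

```python
# Delay figures quoted in the message above.
GATE_DELAY_NS = 5.0
TWISTED_PAIR_NS_PER_FOOT = 1.3
COAX_DELAY_NS = 25.0  # coax cable including tx/rx circuit

# Assumption: wires shorter than this are treated as having zero delay.
# The real cutoff for a "long" wire is not stated in the message.
LONG_WIRE_THRESHOLD_INCHES = 12.0


def wire_delay_ns(length_inches, coax=False):
    """Return the modeled delay of one wire in nanoseconds."""
    if coax:
        return COAX_DELAY_NS
    if length_inches < LONG_WIRE_THRESHOLD_INCHES:
        return 0.0  # short wires are assumed not to be time sensitive
    return length_inches / 12.0 * TWISTED_PAIR_NS_PER_FOOT


def path_delay_ns(gate_count, wire_lengths_inches):
    """Sum gate and twisted-pair wire delays along one signal path."""
    wires = sum(wire_delay_ns(length) for length in wire_lengths_inches)
    return gate_count * GATE_DELAY_NS + wires
```

With these numbers, wire_delay_ns(96) comes out at about 10.4 ns, while a 4 inch wire contributes nothing under the short-wire assumption. Rerunning without that assumption amounts to setting the threshold to zero.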
I do indeed plan on starting with simulation - just not for that reason.
It's just easier to debug than the FPGA proper. ;)
I suppose I could figure out the actual wire list, and thus wire
lengths, but it would have to be limited only to inter-panel wires,
and even that much would be painful and very time consuming. But yeah,
it makes sense to model gate delays in a general way and then perhaps
lower them to see what happens at higher speeds, as you suggest below.
As I said, for most machines it is not likely to be useful to model wire lengths. For the
6600, it is mandatory, and obvious from the design files: when you see a 96 inch wire
connecting two modules one inch apart, you know there is a reason why that wasn't a 4
inch wire instead. Not to mention when "adjust to get x ns pulse" shows up in
an annotation.
One good example of this is the master clock oscillator. In most 6600s it's a 10 MHz
crystal clock, but in the first 7 units built, it's a 4 stage ring oscillator, with 96
inch wires to produce 25 ns delay between the four main phases. The later version gets
rid of the ring oscillator, but retains the four buffers with wire delays, as n * 25 ns
phase shifters.
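
The arithmetic behind those numbers can be sketched as follows: four phases spaced 25 ns apart span one 100 ns period, which matches the 10 MHz crystal. The names are illustrative only, and numbering the phases from zero offset is an assumption.

```python
PHASE_STEP_NS = 25.0  # delay between adjacent main phases, from the text
NUM_PHASES = 4        # four main phases, from the text


def phase_offsets_ns():
    """Offsets of the four phases, assuming the first sits at 0 ns."""
    return [n * PHASE_STEP_NS for n in range(NUM_PHASES)]


def clock_frequency_mhz():
    """Frequency whose period equals the span of all four phases."""
    period_ns = NUM_PHASES * PHASE_STEP_NS
    return 1000.0 / period_ns  # 1000 ns per microsecond gives MHz
```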
If I were working on, say, a PDP-11, I wouldn't expect to have to deal with any of
this sort of craziness. But a 6600 is at, if not over, the hairy edge of possible speed
for when it was built. Even the peripheral processors are well optimized, so they can run
many of their instructions in 1 microsecond -- which is the memory cycle time. Fetching
and executing a new opcode every cycle is a pretty hard task. The PPUs are actually
pipelined, though the descriptions you'll read about them don't make this clear at
all.
Come to think of it, another nice optimization example is the process context switch,
which in the 6000 series is a single instruction -- 16 read/modify/write core memory
cycles 100 ns apart, so the whole thing including pipeline drain and restart takes 3 or so
microseconds. Watching that in simulation is fun.
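
A rough check of that figure, assuming each of the 16 cycles occupies a full 1 microsecond memory cycle and that a new one starts every 100 ns. How the remaining time splits between pipeline drain and restart is not stated here, so the sketch covers only the memory portion.

```python
WORDS = 16               # read/modify/write cycles in the exchange
ISSUE_INTERVAL_NS = 100  # spacing between successive cycle starts
MEMORY_CYCLE_NS = 1000   # memory cycle time of 1 microsecond


def exchange_memory_time_ns():
    """Time from the first cycle's start to the last cycle's end."""
    last_start_ns = (WORDS - 1) * ISSUE_INTERVAL_NS
    return last_start_ns + MEMORY_CYCLE_NS
```

Under those assumptions the memory traffic alone takes 2500 ns, which leaves roughly half a microsecond for drain and restart within the "3 or so" total.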
paul