Since the computer doesn't mind doing boring, repetitive tasks, I set the
11/23+ to remaking the bootable RL02 pack yet again (3 hrs at 9600 baud
while I did other things).
Plunked it in (without the programmer's panel connected), hit the boot
switch on the limited function panel, and it fired right up to the "."
prompt.
Did a directory of the free space which printed, set the date to 11 Sep 75,
and had a phone call for a while.
Back to the console, entered R BUILD (just to do the PR, not save it) and
the system halted. Reboot - back to the same crap of fault light flashes on
the drives, no boot. Sigh.
So I hooked up the programmer's panel, flaky though it may be, (just not
touching the SR) and discovered that some of the bootstrap routine is being
corrupted in zero page!
And not the same Bit 4 as before, and not all the time... a 4027 (JMS IO)
got changed to 4007, the constant of 0377 changed to 0017, the 6601 RL02
instruction was now a 6401, etc.
Still always middle bits though.
Now, the secondary bootstrap (from the RL pack) does overwrite the zero page
including the running boot code. (The first read from the pack is 200[octal]
words, or one page). The boot routine never sets the MA (memory address)
register, so it could be overwriting core starting at 0000.
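
A quick sanity check of that overlap, sketched in Python. The names are
illustrative, and the start address of 0000 is only the suspicion stated
above, not something read from the hardware:

```python
# Assumption: the first disk transfer lands at 0000 because MA is never set.
PAGE_WORDS = 0o200                       # 200 octal = 128 words, one page
LOAD_START = 0o0000                      # assumed start of the transfer
BOOT_FIRST, BOOT_LAST = 0o0001, 0o0035   # primary bootstrap in core

# Last core address touched by a one-page transfer from LOAD_START.
load_last = LOAD_START + PAGE_WORDS - 1

# True when the whole primary bootstrap sits inside the transferred page.
overlaps = LOAD_START <= BOOT_FIRST and BOOT_LAST <= load_last

print(f"transfer covers {LOAD_START:04o}-{load_last:04o}")
print(f"primary boot fully overwritten: {overlaps}")
```

Under that assumption the page read covers 0000-0177, so every word of the
0001-0035 bootstrap would be replaced by whatever is on the pack.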
The boot listing comments say execution does not continue in the primary
boot (at 0001-0035) after the last IO call, the point at which the RLCB
function causes the page of data to be read from the disk into core. But the
changes I'm seeing aren't right, regardless. The OS/8 boot routine would
never use a 6401 (an IOT for a secondary console device, in this case my
Omni-USB)...
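
To see why 6401 lands on a different device, here is a small Python sketch
that splits an IOT word into its fields. It assumes the standard PDP-8 IOT
layout (opcode 6 in the top three bits, then six device-select bits, then
three function bits); decode_iot is my own helper name, not anything from
the OS/8 listing:

```python
def decode_iot(word):
    """Split a 12-bit PDP-8 IOT word into (device code, function bits)."""
    if (word >> 9) != 0o6:
        raise ValueError(f"{word:04o} is not an IOT instruction")
    device = (word >> 3) & 0o77   # six device-select bits
    function = word & 0o7         # three function (IOP) bits
    return device, function

for word in (0o6601, 0o6401):
    device, function = decode_iot(word)
    print(f"{word:04o}: device {device:02o}, function {function:o}")
```

The original word selects device 60 and the altered one selects device 40
with the same function bits, so the instruction goes to a different device
entirely.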
Next I pulled the RL8A out of the backplane and tried manually entering and
single-stepping through the boot code, skipping manually over the RLSD
instruction in the disk IO subroutine.
That didn't seem to corrupt the code. Hit the Boot key (with the RL8A still
removed) and those three or four words are wrong again, the same way. That
verifies the changes are not coming from the secondary boot on the
disk!
If I remember correctly, the boot ROMs have to load the boot routine into
core at 0001-0035 (done by hardware on one of the three-board set) and then
start execution at 0001.
I thought this might be bit-rot in the ROMs, so I toggled the bootstrap
code into those locations and single-stepped beginning at 0001. After a few
loops through, the code had changed again.
Remember, the RL8A is not present in the bus. So the fault can't be on that
card!
I'm thinking something else is loading part of the memory data bus when it
should not be, which is either on the CPU or it's a problem with the 32K
memory card itself.
Now that I think about it, when I discovered the problem on our clone panel
a couple of days ago, I also found a couple of core locations that had been
corrupted in the restrl program, not in page 0 though. Figured that was also
an artifact of the buggy panel, but now it's looking like something else. The
only things in the backplane now are the 3-board set, the 32k card, and the
Omni-USB.
Today I have discovered a pattern to the corruption!
Addr   Orig.   Altered
0005   4027    4007
0011   6615    6415
0015   7325    7005
0021   1026    1006
0025   0377    0017
0031   6601    6401
Notice the middle 4 bits are always being set to 0000, which is what an
open circuit looks like (on the Omnibus a logic 1 is a line pulled down to
0 volts, so a line that nothing pulls down reads as 0).
I am not sure of the significance of the repeating address pattern yet.
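
The table can be checked mechanically. This Python sketch (helper names are
mine) confirms that each altered word is the original with the middle four
bits cleared, and lists the spacing of the affected addresses. It only
describes the six observations; it says nothing about the cause:

```python
# (address, original, altered), all octal, copied from the table
OBSERVED = [
    (0o0005, 0o4027, 0o4007),
    (0o0011, 0o6615, 0o6415),
    (0o0015, 0o7325, 0o7005),
    (0o0021, 0o1026, 0o1006),
    (0o0025, 0o0377, 0o0017),
    (0o0031, 0o6601, 0o6401),
]
MIDDLE_NIBBLE = 0o0360   # the four middle bits of a 12-bit word

def lost_bits(orig, altered):
    """Bits that differ between the original and the altered word."""
    return orig ^ altered

def is_nibble_cleared(orig, altered):
    """True if altered equals orig with the middle four bits forced to 0."""
    return altered == (orig & ~MIDDLE_NIBBLE & 0o7777)

addresses = [addr for addr, _, _ in OBSERVED]
strides = {b - a for a, b in zip(addresses, addresses[1:])}
remainders = {addr % 4 for addr in addresses}

for addr, orig, altered in OBSERVED:
    print(f"{addr:04o}: lost {lost_bits(orig, altered):04o}, "
          f"nibble cleared: {is_nibble_cleared(orig, altered)}")
print("address strides:", sorted(strides))
print("address mod 4:", sorted(remainders))
```

All six rows fit the cleared-nibble pattern, the affected addresses are
exactly 4 words apart, and each one leaves a remainder of 1 when divided
by 4.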
Not only that, those locations are wrong *only* after the BOOT key (or
switch) is toggled.
Running from loc. 1 by (0001, LA, RUN) does NOT corrupt the bootstrap in
core (with the RL8A still removed, so the DMA facility doesn't factor into
it).
An even more important finding is that if I manually clear the bootstrap
code by depositing 0000 in all its locations, when I hit the BOOT key, it
deposits the "altered" version into core! No wonder the machine won't
boot... still unsure as to why it did the first time yesterday, though.
So now I need to look at the boot ROM circuitry, which is on Option 2 board
M8317. There are not three ROMs; DEC used some weird packing scheme to fit
into two 256x4 ROMs. Most likely there is a 4-bit latch or an open-collector
buffer chip that is flaky.
I'd just yank the board and try booting without it, but the IF and DF
registers (for memory extension beyond 4K) are also on that card so OS/8
would crash...
And the best news of all so far:
I keyed in the boot loader by hand, started it manually at 0001... and OS/8
booted from Drive 0, which confirms that the boot ROM area is the cause!
Now to track it down... I sure hope it's not another solder whisker.
Meanwhile, I used the system for over half an hour, running PFOCAL,
formatting disk packs in Drive 1, checking the handlers loaded with BUILD.
All working.
-Charles