PDP-11/45 RSTS/E boot problem

Fritz Mueller fritzm at fritzm.org
Sat Jan 19 15:17:10 CST 2019


Happy weekend, all!  Latest updates on this issue:

Identified and replaced a faulty 4116 DRAM (E204) on my MS11-L.  After this, my small hand-rolled standalone diagnostic passes the full 256K.  I'll post my diagnostic source over on my blog soon.

After this repair, tried MAINDEC ZQMC, called out as the appropriate diagnostic by the MS11-L docs.  This was interesting...  First, it would barely run at all unless I disabled parity checking with front panel switch settings.  Second, it flagged a bunch of memory locations that weren't reported by my much simpler diagnostic (which only does all-ones/all-zeros passes looking for stuck bits at this point.)
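
For illustration only, here is a minimal Python sketch of an all-ones/all-zeros stuck-bit pass of the kind described above.  It is not the actual diagnostic (which runs on the PDP-11 and has not been posted yet); the FaultyMemory class, the function name, and the 16-bit word width are assumptions made up for this example.

```python
def stuck_bit_pass(read_word, write_word, n_words, width=16):
    """Write all-zeros then all-ones to every word and report bits that
    read back wrong, as {address: mask_of_failing_bits}."""
    full = (1 << width) - 1
    failures = {}
    for pattern in (0, full):
        # Fill the whole range first, then verify, so each cell has to
        # hold its value while the rest of memory is being written.
        for addr in range(n_words):
            write_word(addr, pattern)
        for addr in range(n_words):
            bad = read_word(addr) ^ pattern
            if bad:
                failures[addr] = failures.get(addr, 0) | bad
    return failures


class FaultyMemory:
    """Hypothetical stand-in for real memory: selected bits are stuck at 0."""

    def __init__(self, n_words, stuck_low=None):
        self.cells = [0] * n_words
        self.stuck_low = stuck_low or {}  # {address: mask of stuck-at-0 bits}

    def write(self, addr, value):
        self.cells[addr] = value & 0xFFFF

    def read(self, addr):
        return self.cells[addr] & ~self.stuck_low.get(addr, 0) & 0xFFFF


mem = FaultyMemory(8, stuck_low={3: 0o000400})
print(stuck_bit_pass(mem.read, mem.write, 8))
```

A test like this only catches data bits that cannot hold a 0 or a 1.  It says nothing about addressing faults, pattern sensitivity, or the parity bits, which is one plausible reason a fuller diagnostic would flag locations that this kind of pass misses.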

The MAINDEC memory diagnostic is bulky and complicated, and it takes several minutes to re-download it after a power cycle, so it's not exactly convenient to use while troubleshooting.  I'll probably be beefing up my smaller diagnostic with a few more tests (including parity).

Went ahead and tried both RSTS and Unix again after the above repair, and saw the same fault behaviors from both (sadness).  Oh well, not there yet...

So, smokiest gun I have right now is the parity issue.  Could be I still have a bad DRAM on my MS11 in one of the parity banks...  I tried enabling trap on parity error in the MS11 CSR before running my diagnostic, but it didn't trap, even though it did flag parity error(s) in the CSR.  So maybe I *also* have a bug I haven't yet addressed in parity handling within the CPU.  I realized there is a MAINDEC specifically for this (CKBR) which I had previously overlooked.  May give that a look today.  Also, parity is one significant difference between SIMH and my real hardware: SIMH emulates a memory system with no parity hardware.
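
One arithmetic point is worth spelling out here.  Assuming one parity bit per 8-bit byte (an assumption for this sketch, as is the choice of odd parity), an all-zeros byte and an all-ones byte both contain an even number of ones, so the stored parity bit has the same value in both passes.  A data test limited to those two patterns therefore never asks a parity DRAM to hold both a 0 and a 1.  A minimal Python sketch, with a made-up helper name:

```python
def parity_bit(byte, odd=True):
    """Parity bit that would be stored alongside one 8-bit byte.

    odd=True:  the bit makes the total count of ones (data + parity) odd.
    odd=False: the bit makes the total count of ones even.
    The convention real hardware uses is NOT taken from any manual here;
    the conclusion below holds for either choice.
    """
    ones = bin(byte & 0xFF).count("1")
    return (ones + 1) % 2 if odd else ones % 2


for pattern in (0x00, 0xFF, 0x01, 0x7F):
    print(f"{pattern:#04x}: odd={parity_bit(pattern)} "
          f"even={parity_bit(pattern, odd=False)}")
```

Under either convention 0x00 and 0xFF share one parity value, while patterns with an odd number of ones (0x01, 0x7F) take the other.  Exercising the parity bits in both states needs at least one pattern of each kind.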

Looking into the parity issue some last night has raised a few questions:

- There is a lot of inconsistent and incomplete information in the documentation about memory CSRs.  They appear to come in different flavors depending on memory hardware; some of the earlier ones support setting a bit to determine whether parity errors will halt or trap the CPU, while some of the later ones (like my MS11-L) simply have "enable" and don't distinguish between halt and trap.  I'm curious how OS init code sniffs out what memory CSRs there are, determines their specific flavors and, in a heterogeneous system, determines how much address space is under the auspices of each CSR.  Maybe Paul and Noel can comment here wrt. RSTS and Unix respectively?

- The 11/45 prints show a jumper (W1, lower left of sheet UBCB) that looks like it would entirely disable Unibus parity error detection if removed.  This was an obvious thing to check, but when I pulled and examined my UBC board (and also looked over my spare) no such jumper or any associated pads were anywhere to be found!  So maybe this was either added to or removed from later etches of the UBC?  Anybody know more on this?

My UBC has required three separate repairs so far in the course of restoring this machine, in order to address various independent issues.  Now we may be coming up on #4...  Based also on the rat's nest of green wires on these boards and the frustrated-looking engineer scrawl *all* over this page of the prints, the UBC really is the heart of darkness of the KB11-A :-)

  cheers,
    --FritzM.



