On 2/12/2016 5:32 PM, Jacob Ritorto wrote:
Hi,
Seems I have bits 4 and 3 sticking on my Clearpoint QRAM-2-SAB-1 88B
4MB memory in my PDP-11/73.
Can anyone offer hints as to how to identify which component is broken
and how to go about repairing this?
It's the only memory board in this machine, so I guess the problem
might actually be a bus or processor board, right? I have no other Q-bus
memory to test with, so I can't do swapping / process of elimination to be
sure.
Here's the manual:
http://www.arclightindustries.com/docs/Clearpoint-88B.pdf (which I probably
should add to manx or archive.org or something).
Here's a snippet of the VMJA diags run illustrating bits 4 and 3
sticking. During the next VMJA run, all addresses were showing up as
errored instead of just the ones ending in xxx000xx, so I guess it's
getting worse!
@173000g
Starting system
BOOTING UP XXDP-XM EXTENDED MONITOR
XXDP-XM EXTENDED MONITOR - XXDP V2.5
REVISION: F0
BOOTED FROM DL0
124KW OF MEMORY
NON-UNIBUS SYSTEM
RESTART ADDRESS: 152000
TYPE "H" FOR HELP !
.R VMJA??
VMJAB0.BIC
CVMJAB0 ECC/PARITY MEMORY DIAGNOSTIC
11/83 CACHE AVAILABLE
SWR = 000000 NEW = 000040
CSR MAP
CSR 0 1 2 3 4 5 6 7 8 9 A B C D E F
MEMTYPE P
CSR NUMBER 0 CONTROLS TOO MANY BANKS
2044K OF Q-BUS PARITY MEMORY
2044K WORDS OF MEMORY TOTAL
MEMORY CONFIGURATION MAP
16K WORD BANKS
1 2 3 4 5 6 7
012345670123456701234567012345670123456701234567012345670123
ERRORS
MEMTYPE PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
CSR 000000000000000000000000000000000000000000000000000000000000
PROTECT PP
1 1 1 1 1 1 1
0 1 2 3 4 5 6
456701234567012345670123456701234567012345670123456701234567
ERRORS
MEMTYPE PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
CSR 000000000000000000000000000000000000000000000000000000000000
PROTECT
1
7
01234567
ERRORS
MEMTYPE PPPPPPPP
CSR 00000000
PROTECT
MEMORY DATA ERROR
PC BANK VADD PADD GOOD BAD XOR CSR MTYP INT PAT
027606 10 060000 01000000 000010 000030 000020 0 P 27
027606 10 060002 01000002 000010 000030 000020 0 P 27
027606 10 060004 01000004 000010 000030 000020 0 P 27
027606 10 060006 01000006 000010 000030 000020 0 P 27
027606 10 060010 01000010 000010 000030 000020 0 P 27
<< SNIP >>
Well, clearly it is only affecting certain addresses, or the
diagnostic would not run at all. Note that the errors start at 01000000,
so that points to the memory, rather than the processor or bus, at
least as a first approximation. No guarantees, but I'd sure start with
that as a working theory.
Another sign: this is right at the boundary between two rows.
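As a quick sanity check on that starting address, here is a minimal Python sketch that converts the PADD value from the report into a byte offset and a bank number. It assumes PADD is an octal byte address and that each bank is 16K words of 2 bytes, as the "16K WORD BANKS" heading suggests; the function name is made up for illustration.

```python
# Sketch: relate the PADD column to the BANK column of the VMJA report.
# Assumptions: PADD is an octal byte address, and a bank is 16K words
# of 2 bytes each (per the "16K WORD BANKS" map heading).

BANK_BYTES = 16 * 1024 * 2  # 32 KiB per bank


def describe_address(padd_octal):
    """Return byte offset, KiB offset, octal bank number, and a power-of-two flag."""
    addr = int(padd_octal, 8)
    return {
        "bytes": addr,
        "kib": addr // 1024,
        "bank_octal": format(addr // BANK_BYTES, "o"),
        # True when exactly one address bit is set, i.e. a clean boundary.
        "on_power_of_two": addr != 0 and addr & (addr - 1) == 0,
    }


info = describe_address("01000000")
print(info)
```

Under those assumptions the first failing address works out to 256 KiB into memory, in bank 10 (octal), which agrees with the BANK column in the report.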
If you can't find a schematic, you can use the failing address to identify the
address lines on the bus (see Table 3, page 1-5), and trace them on the
board to find the relevant row of chips. Then use the failing data bits the
same way to identify the specific chips.
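To make the address-and-bit bookkeeping concrete, here is a small Python sketch that decodes one row of the error report. It assumes the GOOD, BAD and PADD columns are octal, that data words are 16 bits, and that physical addresses are 22 bits wide; the helper names are hypothetical, and bits are numbered from 0 at the least significant end.

```python
# Sketch: decode one VMJA error row into failing data bits and set address bits.
# Assumptions: columns are octal, data words are 16 bits, addresses are 22 bits.


def failing_bits(good_octal, bad_octal, width=16):
    """Bit positions where the data read back differs from the data expected."""
    diff = int(good_octal, 8) ^ int(bad_octal, 8)
    return [bit for bit in range(width) if diff & (1 << bit)]


def set_address_bits(padd_octal, width=22):
    """Address bit positions that are 1 in the failing physical address."""
    addr = int(padd_octal, 8)
    return [bit for bit in range(width) if addr & (1 << bit)]


# First row of the report: GOOD=000010, BAD=000030, PADD=01000000
print(failing_bits("000010", "000030"))
print(set_address_bits("01000000"))
```

For the five rows quoted, only bit 4 differs between GOOD and BAD (the XOR column is 000020); bit 3 is already set in the expected pattern. The snipped rows may of course show something different.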
If the chips are in sockets, you could always pull them one at a time to
find the relevant place in the array, as well.
...
Are you seeing the parity error light when this occurs?
Anyway, once the relevant chip(s) are identified, if they are in sockets
you can swap them with chips from other bit positions, or from the same
bit position in other rows, to confirm. Otherwise you get to desolder the
suspects and put in new ones.
JRJ
An old trick we use for testing soldered-in DRAM is to simply jam a
known-to-be-good DRAM on top of the suspect one (legs bent in to make
good contact). DRAMs normally fail with bits high, so putting a good one
on top changes nothing if the suspect is good, but if the suspect is bad
then the top DRAM will drive the output and your RAM test will pass.
Of course, you wedge the good one onto the suspect with the power off,
unless you are in a rush and willing to possibly kill your test DRAM.
As a side note, there appears to be an error message: "CSR NUMBER 0
CONTROLS TOO MANY BANKS". Or is that irrelevant? I know nothing about
the PDP-11 test messages...
John :-#)#