On May 5, 2013, at 5:01 PM, Tothwolf <tothwolf at concentric.net> wrote:
>> So I'll
>> have to check that out. It could be something as simple as balky SCSI termination, since
>> I do have 4 devices in the chain and the terminator is a Jaz drive using internal
>> termination. I'll play with it some. Hopefully it's not the hard drive going
>> bad, since my supply of SCSI drives is essentially dry. It's a bit worrisome that
>> it's the PU device going bad rather than the DU device, but I'm not 100% sure how
>> VMS logs the errors, so it could be what I suspect.
>
> How long is the SCSI chain? The first thing that came to mind for me was the possibility
> of a SCSI bus issue since I've seen similar behavior with other systems. In fact,
> since you just mentioned the Jaz drive, it could very well be the culprit. I had major
> compatibility issues with Jaz drives in a non-PC application back when they were current
> products and I ended up having to connect them to a PC to update their firmware and change
> their internal settings. A quick Google search turned up this link too:
> http://www.linux-m68k.org/faq/howjaz.html

I'm familiar with the general problems with Jaz drives, but this particular
one has never given me much trouble. Having workable disks around is a
problem, though; they all seem to die over time, even disused (unlike Zip
disks, which have always been fine for me). I got 10 off eBay and they all
failed a long format, which I hope doesn't actually mean the drive in the
Mac I was formatting them in has gone off (not impossible). At this point, I
don't think I even have any Jaz disks that pass the long format.
In any case, the SCSI chain isn't hideously long; 4 devices, all connected
with 3-foot cables, so it should be well within spec. Nice, thick cables
as well. Disconnecting the Jaz drive doesn't seem to have solved the
problem; next I'll try the Zip drive. It's also altogether possible that
the update to the latest firmware on the CQD-220 has introduced some
instability in the card itself, since the error logs seem to be indicating
that the controller needs to get reset.
Specifically, I'm getting this error once in a while:

******************************* ENTRY 40. *******************************
 ERROR SEQUENCE 44.                    LOGGED ON: SID 0A000006
 DATE/TIME 6-MAY-2013 11:49:18.76      SYS_TYPE 01530302
 SYSTEM UPTIME: 0 DAYS 02:38:10
 SCS NODE: GONDOR                      VAX/VMS V7.3

 ERL$LOGMSCP KA655 CPU FW REV# 7. CONSOLE FW REV# 5.3

     MESSAGE TYPE        0010
                                   IMMEDIATE MODE COMMAND TIMEOUT
                                   _ CONTROLLER RESET
     CLASS DRIVER        4B534944
                                   /DISK/
     CDDB$Q_CNTRLID      00D60000
                         010D0000
                                   UNIQUE IDENTIFIER, 000000D60000(X)
                                   MASS STORAGE CONTROLLER
                                   KDA50-Q
     CDDB$B_SYSTEMID     80C18E68
                             8000

It's followed by a few other entries documenting the reset and init sequence.
My thinking is that the controller shouldn't be timing out on MSCP commands
just because of some balky devices on the SCSI bus, though having seen some
of the failure modes of the firmware of the CQD-220, I wouldn't be entirely
surprised.
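
As an aside on reading that entry: the CLASS DRIVER longword looks like nothing more than the ASCII name stored low byte first, which would explain the /DISK/ annotation under it. Here's a quick sketch in Python to check that reading; the helper name is my own invention and not anything out of VMS, and the byte order is my assumption based on the VAX being little-endian.

```python
import struct

def longword_to_ascii(hex_longword: str) -> str:
    """Read a 32-bit hex value as four ASCII bytes, low byte first.

    Hypothetical helper for illustration only; assumes the value was
    stored on a little-endian machine such as a VAX.
    """
    value = int(hex_longword, 16)
    raw = struct.pack("<I", value)  # "<I" = little-endian unsigned 32-bit
    return raw.decode("ascii")

# CLASS DRIVER value copied from the error log entry above
print(longword_to_ascii("4B534944"))  # prints DISK
```

So that field at least carries no extra information beyond telling us the disk class driver logged the entry.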
I wonder if Glen ever ended up updating the firmware on his, and whether
similar problems began to manifest. I might try dropping down to the A8 rev
tonight to see if that improves the situation; now that I have a better CD
drive, I don't really need the updated firmware (though it does have some
nice features).
- Dave