> From: Paul Koning

> The general rule of device drivers is to assume that the hardware is
> misbehaving, and double check everything.

Right, but it needs to _actively log_ when it has to fix something, otherwise
you wind up in the situation of the semi-famous old Multics problem where
(IIRC) the system was running slower and slower... finally someone looked at
the Disk DIM (driver) counters, and one drive was slowly failing, but the
industrial-strength recovery code in the Multics Disk DIM was masking the
problem (except for the performance degradation). The DIM was thereupon
modified to notify the operator if 'too many' retries had to be done 'too
often'.
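
To make the 'too many' / 'too often' idea concrete, here is a minimal sketch in Python. It is not the Multics code, which I have not reproduced; the class name RetryMonitor, the two thresholds, and the notify callback are all hypothetical names chosen for illustration. The driver would call record_retry() each time its recovery code had to retry an operation, and the operator is told once when a drive goes over the limit within the time window.

```python
from collections import deque


class RetryMonitor:
    """Track recovered errors per drive; report a drive that retries too often.

    Hypothetical illustration only: the thresholds and the notify callback
    are assumptions, not taken from any real driver.
    """

    def __init__(self, max_retries, window_seconds, notify):
        self.max_retries = max_retries        # 'too many'
        self.window_seconds = window_seconds  # 'too often'
        self.notify = notify                  # callable that reaches the operator
        self.recent = {}                      # drive -> deque of retry timestamps
        self.alerted = set()                  # drives already reported

    def record_retry(self, drive, now):
        """Call each time the driver had to retry an operation on `drive`."""
        stamps = self.recent.setdefault(drive, deque())
        stamps.append(now)
        # Forget retries that have fallen out of the time window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        if len(stamps) > self.max_retries:
            # Report once per episode, so the operator is not flooded.
            if drive not in self.alerted:
                self.alerted.add(drive)
                self.notify(f"drive {drive}: {len(stamps)} retries in "
                            f"{self.window_seconds}s")
        else:
            # Retry rate is back under the limit; allow a future report.
            self.alerted.discard(drive)
```

The point of the sliding window is that an occasional retry is normal and stays silent, while a sustained run of them gets surfaced even though every individual operation eventually succeeded.
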
Noel