< As far as I know, no x86 or 68xxx processor has ever had any degree of
< fault tolerance within the chip, where it is needed. Getting it externa
Big-time problem. Fault detection and instruction resequencing aren't
there. That makes even a simple parity error in RAM unmanageable.
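For anyone who wants the idea spelled out, here is a rough sketch in Python.
It is purely illustrative and not how any particular CPU or memory controller
does it. The names (parity_bit, store, check, read_with_retry) and the retry
count are my own assumptions. It shows the two pieces: a parity bit that can
detect (but not correct) a single flipped bit, and a re-issue of the failed
read, which only helps if the fault is transient.

```python
def parity_bit(byte):
    """Even-parity bit for an 8-bit value: 1 if the count of set bits is odd."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte):
    """Return the (data, parity) pair as it would be written to memory."""
    return (byte & 0xFF, parity_bit(byte))


def check(word):
    """True if the stored parity bit still matches the data."""
    data, parity = word
    return parity_bit(data) == parity


def read_with_retry(read, retries=3):
    """Re-issue a read until parity checks out (hypothetical retry policy).

    `read` is any callable returning a (data, parity) pair. If every
    attempt fails the check, the error is reported instead of being
    silently passed along.
    """
    for _ in range(retries):
        word = read()
        if check(word):
            return word[0]
    raise RuntimeError("uncorrectable parity error")
```

Without the retry half, the best the hardware can do on a bad check is halt
or trap, which is the unmanageable case described above. Note also that
parity misses any error that flips an even number of bits.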
< Most mainframes, even ones from the 1960s, have error checking throughou
< the entire system - even in the paths between the registers and ALU. If
I was told a story about an old 700 series where the cooling water for
the room's chiller sprang a leak. The leak was on the order of hundreds of
gallons a minute. Oh, the 700 series is a vacuum tube machine, so the
underfloor cable troughs have data and power cables galore. Seems the
machine was still running fine when water started gushing out of the bottom
panels of the racks. All the interconnection cables and power supply units
were soaked and it still ran! They shut down, fixed the pipe, dried the
room, and fired everything back up, no problem.
It was a general presumption that the machines, due to the large number of
parts, would be unreliable. The designs were robust, to say the least, and
in practice they were reliable, often far better than predicted.
< something goes wrong, like a gate goes into a "stuck at" condition, the
< redundant circuits and error correctors will jump into action and
< processing will not stop. Most machines will call home and have
Fault tolerance is an art in itself.
Allison