Hi
I finally broke down and wrote a program to look at using
CRC16 to fix errors. Here is what I have found.
1. If the error is only a single bit, one can fix the
error 100% of the time for data of 1K bytes and less ( I didn't
look at larger data sizes but expect it to be all the way
up to (2^16)-1. )
2. If two bits in a row are in error, you can still fix 100%
for data sizes up to 1K bytes.
3. For errors that span 3 bits in a row:
If all three bits are in error, 100% for 1K bytes
If two bits separated by a correct bit ( Still 3 bits
wide but center bit correct ), there is an ambiguity
between that and a 2 bit bit error next to each other. If the
data set is 128 bytes, it is still 100% correctable.
If the error shows as a mask of 101b and it is within
7140 bits of the end of data, it is still correctable.
If the error shows as a mask of 011b and it is within
7140 bits of the beginning of the data ( for 1K bytes ),
it is also unambiguously corrected. This is still most
of the errors of this type but leave a window for two
separate error locations. At least, you know where it
works and where it doesn't.
Things get messier when one opens the window to 4 and 5 bits.
Still, depending on the location, one can restrict the possibilities
to only a pair of possible error locations and for some locations
and errors, it is unambiguous. This is because of the cyclic nature
of the CRC. Knowing the span between these alias fixes, one
can still do corrections for errors in particular location ranges.
If one knows where the burst is located, such as when one finds
flaky bits ( like the way Tim does it ), one can also set a window
around the bits to find close bits that are also failing. Hopefully,
the error is located closely around the flaky bits. If so, one can
get a map of the bits that are the only ones that can cause the
specific sum. One could expand this to 16 bits but then you
wouldn't know the how to center the window. It could be anywhere
from the burst being centered on the flaky bit of
offset so that
the flaky bit was on one side or the other. Still, given this
information it will give one a lot more than just using one
method or the other.
Of course, if the errors are in two burst separated by a distance
greater than 16 bits, the information in the CRC will be of little
use since it no longer holds information that pinpoints the error.
If you know what one of those errors is, the other can be located.
An error in the CRC is also a total loss.
One last trick that can be thrown in is that if the data was ASCII,
the MSB can help to decide how well the error fit did. Even for
some code sequences, there are certain codes that are either illegal
or rarely used. These can also help to evaluate the correctness
of an error patch.
If anyone is interested I'll compile a list of the spacings between
different small burst up to 5 bits. One can look at their first
error position and decide what other errors could have caused the
same error syndrome in CRC16. I think this is a lot better than
doing it by experimentation.
Dwight