The general idea makes sense, but the details don't sound quite right.
Here's an oversimplified back-of-the-envelope calculation:
If you allow for 12-bit burst errors in a 512-bit block (you did mean
bits, not bytes, right?), it takes 9 bits of information (512 = 2**9)
to say where the error is, and 12 bits of information to say what the
data in the burst should have been. This uses 21 of our 32 bits of CRC
information, leaving 11 bits to help us be sure this isn't a spurious
correction. That is, if we do have an error that's something other than
a 12-bit burst, the probability should be about 2**(-11) = 1/2048 that
the CRC will be one of the 2**21 values that says the block has only a
12-bit burst error. You said 512/(2**32 - 1), which is about 2**(-23).
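
The bit budget above can be written out as a few lines of Python. This is only the same oversimplified estimate, not an analysis of any particular CRC; the function name spurious_correction_odds and its parameters are made up for illustration.

```python
def spurious_correction_odds(block_bits, burst_bits, crc_bits):
    """Back-of-the-envelope estimate: CRC bits left over after paying for
    the burst's position and its contents, and the resulting chance that
    some other error is mistaken for a correctable burst."""
    # Bits needed to name a position in the block (9 for 512 positions).
    position_bits = (block_bits - 1).bit_length()
    # Bits needed to say what the data in the burst should have been.
    used_bits = position_bits + burst_bits
    spare_bits = crc_bits - used_bits
    # With no spare bits (or a deficit) every bad block looks correctable.
    probability = min(1.0, 2.0 ** (-spare_bits))
    return spare_bits, probability


spare, p = spurious_correction_odds(block_bits=512, burst_bits=12, crc_bits=32)
print(spare, p)  # 11 0.00048828125  (that is, 1/2048)
```
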
One oversimplified part of this is that someone has to show that the CRCs
that indicate two different 12-bit burst errors in the same data never
collide. I presume that's been done, and that's where the magic number
12 comes from. Without that requirement you could safely correct errors of 13 or more
bits, just with a somewhat higher probability of spurious correction when
you really got some other error.
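
The no-collision property can be checked by brute force, because a CRC is linear: the syndrome of an error depends only on the error pattern, not on the data. The sketch below assumes a plain polynomial-remainder CRC (any initial value or final XOR cancels out of the syndrome), and the function name is hypothetical. It enumerates every burst of up to max_burst bits at every position and reports whether two of them share a syndrome.

```python
def burst_syndromes_collide(poly, crc_bits, codeword_bits, max_burst):
    """Return True if two different burst errors of at most max_burst bits,
    anywhere in a codeword of codeword_bits bits, give the same syndrome.
    poly must include its leading x**crc_bits term."""
    if max_burst > crc_bits:
        raise ValueError("burst is longer than the CRC")
    # Syndrome 0 means "no error", so a burst mapping to 0 also counts
    # as a collision.
    seen = {0}
    # Odd patterns only: the lowest bit set fixes where the burst ends,
    # so each (pattern, offset) pair is a distinct error.
    for pattern in range(1, 1 << max_burst, 2):
        span = pattern.bit_length()
        syndrome = pattern  # shorter than the CRC, so already reduced
        for _offset in range(codeword_bits - span + 1):
            if syndrome in seen:
                return True
            seen.add(syndrome)
            # Move the burst one bit toward the start of the codeword:
            # multiply by x and reduce modulo poly.
            syndrome <<= 1
            if syndrome >> crc_bits:
                syndrome ^= poly
    return False


# Toy case: x**3 + x + 1 over 7 bits is the (7,4) Hamming code.
print(burst_syndromes_collide(0b1011, 3, 7, 1))  # False
print(burst_syndromes_collide(0b1011, 3, 8, 1))  # True
print(burst_syndromes_collide(0b1011, 3, 7, 2))  # True
```

To try the case discussed here, one could call it with crc_bits=32, codeword_bits=544 (512 data bits plus the CRC) and max_burst=12. The polynomial is an assumption, since the thread doesn't say which CRC-32 is meant; 0x104C11DB7 is the common one.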
Of course, with floppy disks we have sectors of 128 to 1024 *bytes*, not
bits, and the CRC is only 16 bits, not 32, so I don't think we can do
much correction. With a 1024 byte sector, it already takes 13 bits of
information to say where a 1-bit error is. So if we use a CRC16 to correct
it, we have about a 2**(-3) = 1/8 probability that if more than one bit
is in error, we'll make a spurious correction.
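
Here is a rough sketch of what 1-bit correction with a CRC16 could look like, and where the 1/8 comes from. It is not how any floppy controller works. It assumes the CRC-CCITT polynomial as implemented by Python's binascii.crc_hqx and an initial value of 0xFFFF, covers only the data bytes, and ignores errors in the stored CRC itself. The helper names are made up.

```python
import binascii

POLY = 0x11021  # x**16 + x**12 + x**5 + 1, including the x**16 term


def step(syndrome):
    """Multiply a 16-bit syndrome by x, modulo POLY."""
    syndrome <<= 1
    if syndrome >> 16:
        syndrome ^= POLY
    return syndrome


def single_bit_syndromes(data_bits):
    """Map (stored CRC xor recomputed CRC) to the flipped data bit,
    counted from the end of the data."""
    table = {}
    syndrome = 1
    for _ in range(16):  # x**16 mod POLY: syndrome of the last data bit
        syndrome = step(syndrome)
    for bit_from_end in range(data_bits):
        if syndrome in table:
            raise ValueError("two 1-bit errors share a syndrome")
        table[syndrome] = bit_from_end
        syndrome = step(syndrome)
    return table


def try_correct(sector, stored_crc, table):
    """Return the sector with one bit repaired, or None if the CRC
    mismatch does not look like a 1-bit data error."""
    diff = binascii.crc_hqx(sector, 0xFFFF) ^ stored_crc
    if diff == 0:
        return sector
    bit_from_end = table.get(diff)
    if bit_from_end is None:
        return None
    bit = len(sector) * 8 - 1 - bit_from_end
    fixed = bytearray(sector)
    fixed[bit // 8] ^= 0x80 >> (bit % 8)
    return bytes(fixed)


sector = bytes(range(256)) * 4  # 1024 bytes of sample data
stored_crc = binascii.crc_hqx(sector, 0xFFFF)
table = single_bit_syndromes(len(sector) * 8)

one_bit = bytearray(sector)
one_bit[700] ^= 0x10
print(try_correct(bytes(one_bit), stored_crc, table) == sector)  # True

two_bits = bytearray(one_bit)
two_bits[5] ^= 0x01
# Either rejected (None) or "corrected" into a third wrong bit.
print(try_correct(bytes(two_bits), stored_crc, table) != sector)  # True

# Fraction of all 16-bit CRC differences that look like a 1-bit error.
print(len(table) / 2**16)  # 0.125
```

The last line is the 1/8 figure: 8192 of the 65536 possible CRC differences are claimed by some 1-bit error, so a random multi-bit error has about that chance of being "corrected" into something wrong.
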
Tim Mann tim.mann(a)compaq.com
http://www.tim-mann.org
Compaq Computer Corporation, Systems Research Center, Palo Alto, CA