Philip Pemberton wrote:
I've just been thinking about the problem of imaging/copying MFM hard
discs... I've run the numbers, and this seems right to me, but I'd
appreciate a sanity check.
I'm on the wrong side of coffee for this, but...
Let's say you have an ST-412 drive you want to read. You've also got a
Catweasel or something like it, that has somehow had an extra bunch of
I/O lines added to it (to control the head select lines, etc. on the HDD).
Well, extra lines isn't a big deal I suppose - heck, you could even do that
bit 'manually' and read platters one at a time, flipping a switch between each
pass.
An ST-412 rotates at 3600 RPM. That's (3600/60)=60 revolutions per
second, or 1/60 s = 16.66(recurring) milliseconds per revolution.
OK (I think I ran some numbers on this here on the list a couple of years ago,
but it's probably quicker to just reply now than dig the message out :-)
The drive's data rate is 5 megabits per second, but could be lower (or
indeed higher). But the spec says 5Mbps, and for the sake of argument
I'm going to stick with that...
(5Mbps/1000) = 5Kbits per millisecond.
5Kbits * 16.667 = 83.35 kilobits per track, absolute maximum.
Well, run some figures as a sanity-check... say 256*8 bits per sector, 32
sectors/track on a formatted ST506 - 32 x 256 x 8 = 64Kb of actual data per track.
83.35 seems sensible as some kind of theoretical maximum, given storage of
sector headers and the like.
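
For anyone who wants to re-run the arithmetic, here is a minimal sketch in Python. It assumes the nominal figures quoted above (3600 RPM, 5Mbps) and the 32 x 256-byte format I mentioned; the function names are just mine for illustration. Without the intermediate rounding the raw figure comes out at about 83.33 kilobits, a shade under the 83.35 quoted.

```python
RPM = 3600                  # nominal spindle speed from the thread
DATA_RATE_BPS = 5_000_000   # nominal 5 Mbit/s data rate from the thread

def raw_bits_per_track(rpm=RPM, data_rate_bps=DATA_RATE_BPS):
    """Data bits that pass under the head in one revolution."""
    revs_per_second = rpm / 60
    return data_rate_bps / revs_per_second

def formatted_bits_per_track(sectors=32, bytes_per_sector=256):
    """User data bits per track for a given sector layout."""
    return sectors * bytes_per_sector * 8

raw = raw_bits_per_track()
fmt = formatted_bits_per_track()
print(f"raw: {raw:.0f} bits, formatted: {fmt} bits, "
      f"overhead: {1 - fmt / raw:.1%}")
```

So roughly a fifth of the raw track is left over for headers, gaps and sync, which seems plausible.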
A Catweasel records the data from a disc by measuring the time between
flux transitions.
I'm not sure what sort of granularity is needed, though - presumably the CW
uses counters geared toward expected floppy rates, so most likely can't
represent timing gaps accurately enough for hard disk speeds?
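
To put rough numbers on the granularity question: with MFM, flux transitions are spaced 1, 1.5 or 2 data-bit cells apart, so at 5Mbps that is 200, 300 or 400ns, against 4, 6 or 8us for a 250kbps double-density floppy. The sketch below shows how many counter ticks each interval gets at a few sample clocks. The clock values are purely hypothetical choices of mine, not the Catweasel's actual clocks.

```python
def mfm_intervals_ns(data_rate_bps):
    """Nominal MFM flux-transition spacings (1, 1.5, 2 bit cells) in ns."""
    bit_cell_ns = 1e9 / data_rate_bps
    return [bit_cell_ns * k for k in (1.0, 1.5, 2.0)]

def ticks(interval_ns, clock_hz):
    """Counter ticks elapsed during an interval at a given sample clock."""
    return interval_ns * clock_hz / 1e9

# Hypothetical sample clocks, chosen only for illustration.
for clock_hz in (10e6, 20e6, 40e6):
    hdd = [ticks(t, clock_hz) for t in mfm_intervals_ns(5_000_000)]
    fdd = [ticks(t, clock_hz) for t in mfm_intervals_ns(250_000)]
    print(f"{clock_hz / 1e6:.0f} MHz  hard disc ticks: {hdd}  floppy ticks: {fdd}")
```

At the slowest of those clocks the three hard disc intervals are only one tick apart, which leaves no margin for jitter, so a counter tuned for floppies would need a much faster clock to be usable here.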
So hypothetically, if your "Catweasel or something like it" had
128Ksamples worth of buffer RAM, and enough I/O lines to drive the HDD,
you could read an MFM hard drive track-by-track and copy it onto another
drive of the same type?
I'm not sure about copying - but I think you could at least extract the data
via software, which is still a vital step forward.
Do my calculations look right?
With 304 cylinders and 4 heads, that works out to 304 * 4 * 83.35 =
101353.6 Kbits, which does seem awfully low to me...
101353.6 Kb is approx 99Mb, or approx 12MB - so that seems not unreasonable.
However, remember that's *sample count*, not total storage.
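
As a rough sketch of what that means for the size of a whole-drive image: assuming at most one flux transition per data-bit cell (the MFM worst case) and the 304 x 4 geometry quoted above, the stored size scales directly with the bits used per sample. The function name and defaults are mine.

```python
CYLINDERS, HEADS = 304, 4                        # geometry as quoted in this thread
MAX_SAMPLES_PER_TRACK = 5_000_000 * 60 // 3600   # one transition per bit cell

def image_size_bytes(bits_per_sample, cylinders=CYLINDERS, heads=HEADS,
                     samples_per_track=MAX_SAMPLES_PER_TRACK):
    """Worst-case size of a raw flux image, rounded up to whole bytes."""
    total_samples = cylinders * heads * samples_per_track
    return (total_samples * bits_per_sample + 7) // 8

for bits in (8, 9, 16):
    print(f"{bits:2d} bits/sample -> {image_size_bytes(bits) / 2**20:.1f} MiB")
```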
So, what's an upper boundary on bits per sample? I suppose you could have an
entire track with just one sample on it right at the end, so your upper
boundary is the 16.67ms figure.
More typically though samples are going to occur *far* more frequently - so to
cope with all possible situations you (in theory) need a very large sample
length (which in 99.9999% of cases is going to contain a lot of 0s in the
upper bits!)
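
To illustrate the gap between the typical and worst case, this sketch works out how many bits a counter needs for a given interval. The 20MHz sample clock is a hypothetical figure of mine; integer nanoseconds are used to avoid floating-point rounding.

```python
def bits_for_interval(interval_ns, clock_hz):
    """Bits needed to hold the tick count for one interval (rounded up)."""
    ticks = -(-interval_ns * clock_hz // 10**9)   # integer ceiling division
    return max(1, ticks.bit_length())

CLOCK_HZ = 20_000_000          # hypothetical sample clock
LONGEST_MFM_NS = 400           # 2 bit cells at 5 Mbit/s
REVOLUTION_NS = 16_666_667     # one full turn at 3600 RPM, rounded up

print("typical worst MFM gap:", bits_for_interval(LONGEST_MFM_NS, CLOCK_HZ), "bits")
print("one whole revolution: ", bits_for_interval(REVOLUTION_NS, CLOCK_HZ), "bits")
```

Under those assumptions ordinary data fits in 4 bits per sample while the pathological single-transition track needs 19, which is why a fixed sample width is so wasteful.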
Some possible approaches I can see:
1) Just have an absolutely colossal buffer with a large number of bits per sample
2) Use the first bit (or bits) of each sample as a flag to indicate the
resolution of the following sample data (essentially toss away lower bits for
lengthy samples)
3) Use the first bit (or bits) of each sample to indicate the bit-length of
the following sample data
4) Have a short-length sample resolution geared toward 'ordinary' data and
flag any tracks causing sample timer overflows as being damaged.
Personally ISTR looking at the second and third options - the third is
probably the better of the two, but needs a bit more intelligence (and at the
time I was wondering if I could do this in pure logic without a CPU). I'm not
sure what approach the CW takes (but at floppy rates the amount of buffer
needed is a lot smaller, so maybe it just gets away with a fixed sample size
and large enough buffer).
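
A minimal sketch of how the third option might look in software, just to make the idea concrete. The 2-bit prefix and the four payload widths are hypothetical choices of mine, and the bitstream is modelled as a string of '0'/'1' characters for readability rather than packed bytes.

```python
WIDTHS = (4, 8, 16, 24)   # hypothetical payload widths selected by a 2-bit prefix

def encode(samples):
    """Pack tick counts as <2-bit width code><payload> into a bit string."""
    out = []
    for s in samples:
        if s < 0:
            raise ValueError("tick counts cannot be negative")
        for code, width in enumerate(WIDTHS):
            if s < (1 << width):
                out.append(f"{code:02b}{s:0{width}b}")
                break
        else:
            raise ValueError(f"sample {s} does not fit in {WIDTHS[-1]} bits")
    return "".join(out)

def decode(bits):
    """Inverse of encode(): recover the list of tick counts."""
    samples, pos = [], 0
    while pos < len(bits):
        width = WIDTHS[int(bits[pos:pos + 2], 2)]
        pos += 2
        samples.append(int(bits[pos:pos + width], 2))
        pos += width
    return samples

track = [4, 6, 8, 300, 333_334]   # made-up tick counts
packed = encode(track)
print(len(packed), "bits for", len(track), "samples")
```

Short intervals cost 6 bits each here and only the rare long ones pay for the wide field, with nothing thrown away.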
The fourth option is just yuck - I'd really want an exact (as possible) copy
of what's on the drive, even if it's utter garbage.
For some reason when I looked at this I think I believed I could get away with
9 bits per sample including a flag bit dictating two different sample
resolutions (i.e. somewhere around 1Mb of buffer) - unfortunately all my notes
on this are stuck in storage right now though (plus ideally I'd go for a
microcontroller approach with a variable sample length I think)
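
Since my notes aren't to hand, the following is only a guess at what a 9-bit, two-resolution scheme could look like: one flag bit plus an 8-bit count, where the flag selects either single-tick or coarse units. The coarse step of 2**11 ticks is a hypothetical value, not something recovered from the original design.

```python
COARSE_SHIFT = 11                  # hypothetical: coarse unit = 2**11 ticks
BUFFER_BITS = 128 * 1024 * 9       # 128K samples at 9 bits each

def encode9(ticks):
    """Return a 9-bit word: bit 8 is the resolution flag, bits 0-7 the count."""
    if ticks < 256:
        return ticks                       # flag clear: exact tick count
    coarse = ticks >> COARSE_SHIFT         # flag set: lower bits are discarded
    if coarse > 255:
        raise ValueError("interval too long for the coarse range")
    return 0x100 | coarse

def decode9(word):
    """Recover the (possibly truncated) tick count from a 9-bit word."""
    value = word & 0xFF
    return value << COARSE_SHIFT if word & 0x100 else value

print(decode9(encode9(8)), decode9(encode9(333_334)), BUFFER_BITS)
```

Note the weakness: with these particular numbers anything between 256 and 2047 ticks decodes as zero, so the choice of coarse step is a trade-off between range and how much of the middle ground gets lost. The buffer works out to 1,179,648 bits, which fits the "around 1Mb" I remember.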
Recreating the data onto another drive is a different matter, of course - but
I've always been more interested in salvaging existing data onto more modern
media for analysis.
cheers
Jules