On Mon, 16 May 2005, Jim Leonard wrote:
No offense to anyone who may be part of the
project, but that's a pipe
dream. I think a better goal is to create conversion utilities that
perform cross-conversion to/from FutureKeep format, as that is much more
realistic.
I completely agree. Lessons from the past are simple to grok.
There is no complex data standard and certainly no media that has
survived long enough to be considered permanent.
My bias is towards the simple; at least then when it's
incompatible you have less raw data, and less-ambiguous work (you
hope).
The more the payload data is encumbered with organization,
formatting, compression, interpretation, etc., the harder it is to
recover later.
There's no universal approach, but if we're talking about mainly
floppies, a simple sector dump (note 1) with a hand-written
description of the organization would probably suffice.
An example:
* A byte-stream copy (note 1) of the diskette image, say 256256
8-bit bytes long (77 tracks x 26 sectors x 128 bytes). Even printed
on paper. Who knows what OCR will be like in 20 years?
* A scrap of paper upon which is written:
"Copied from a Shugart 801, 8", single-sided, soft-sectored,
WD1771 FM format. CP/M-80. 128 byte sectors, 26 spt, 77
tracks."
Dumb. Simple. Repeatable. As portable as anything will be after 1, 5,
10, 20 years.
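
To show how little the scrap of paper has to say, here is a minimal
sketch that turns the geometry in the note into an image size and maps
a byte offset back to a track and sector. The name locate() and the
assumption that sectors were dumped in plain physical order, numbered
from 1, are mine; any skew or interleave applied by CP/M is ignored.

```python
# Geometry taken from the handwritten note: 77 tracks, 26 sectors per
# track, 128 bytes per sector, single-sided.
TRACKS = 77
SECTORS_PER_TRACK = 26
BYTES_PER_SECTOR = 128

IMAGE_SIZE = TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR  # 256256


def locate(offset):
    """Map a byte offset in the dump to (track, sector, byte).

    Assumes (my assumption, not stated in the note) that the dump holds
    sectors in physical order, track 0 first, sectors numbered from 1.
    """
    if not 0 <= offset < IMAGE_SIZE:
        raise ValueError("offset outside the image")
    sector_index, byte = divmod(offset, BYTES_PER_SECTOR)
    track, sector = divmod(sector_index, SECTORS_PER_TRACK)
    return track, sector + 1, byte
```

The last byte of the image, offset 256255, comes out as track 76,
sector 26, byte 127, which is a quick sanity check that the dump length
and the written description agree.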
NOTE 1: you can either rely on a byte today being a byte tomorrow,
or define a mapping that is unambiguous, such as a description of
the local file system's byte ordering, or use a text
representation of the numbers (disk as a string) with a Rosetta
Stone for ASCII, or define as decimal, etc.
Cruder is better. A representation of the diskette contents
consisting of:
0: 0
1: 255
2: 47
...
256253: 229
256254: 229
256255: 229
with the prose description above is more than enough to simply
recreate the diskette, whatever that means.
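
The "offset: value" listing above is easy to produce and easy to read
back. What follows is only a sketch under my own assumptions: decimal
numbers, one pair per line, offsets starting at 0. The function names
dump_as_text() and parse_text_dump() are made up for illustration.

```python
def dump_as_text(data):
    """Render bytes as 'offset: value' lines, both numbers in decimal."""
    return "\n".join(f"{i}: {b}" for i, b in enumerate(data))


def parse_text_dump(text):
    """Rebuild the bytes from 'offset: value' lines.

    Refuses to guess: a skipped or repeated offset, or a value outside
    0..255, raises ValueError instead of being silently accepted.
    """
    out = bytearray()
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        offset_str, value_str = line.split(":")
        offset, value = int(offset_str), int(value_str)
        if offset != len(out):
            raise ValueError(f"expected offset {len(out)}, got {offset}")
        if not 0 <= value <= 255:
            raise ValueError(f"value {value} at offset {offset} is not a byte")
        out.append(value)
    return bytes(out)
```

Carrying the offset on every line looks redundant, but it is what lets
the reader notice a dropped or duplicated line, an easy slip when the
listing has been retyped or scanned.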
One big problem with straight dumps of data is that often the media is not
homogeneous.
Cromemco requires that the first track of the floppy disk be written in FM
but the other tracks can be MFM.
NorthStar allows complete mixture of FM & MFM on a single disk.
Jade requires the first track of bootable disks be MFM but the rest can be
FM.
The Commodore 1541 has a different number of sectors on different tracks.
Some meta information is needed but, as stated, good written info may
simply be enough.
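
As a sketch of how small that meta information can be, here is the
1541 case written down as a table of track zones, with the byte offset
of each track in a straight dump derived from it. The zone table is the
standard 35-track 1541 layout with 256-byte sectors; the names ZONES,
sectors_on_track() and track_offset() are mine. A mixed FM/MFM disk
could be described the same way, with an encoding column per track
range.

```python
BYTES_PER_SECTOR = 256

# (first track, last track, sectors per track); tracks numbered from 1.
ZONES = [
    (1, 17, 21),
    (18, 24, 19),
    (25, 30, 18),
    (31, 35, 17),
]


def sectors_on_track(track):
    """Look up how many sectors a given track holds."""
    for first, last, count in ZONES:
        if first <= track <= last:
            return count
    raise ValueError(f"track {track} is not described")


def track_offset(track):
    """Byte offset of the start of a track in a straight sector dump."""
    sectors_on_track(track)  # reject tracks outside the table
    preceding = sum(sectors_on_track(t) for t in range(1, track))
    return preceding * BYTES_PER_SECTOR


TOTAL_SECTORS = sum(sectors_on_track(t) for t in range(1, 36))  # 683
```

Without the table, the dump is just 174848 anonymous bytes; with it,
any sector can be found again.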
I like and prefer media images as straight data dumps, but I want the
formatting information of the original media recorded somewhere. I even
want data from media that is incomplete or has errors, with the damage
documented as well.
In the past I've done cut and paste reconstruction of data from damaged
originals.
As for corrupted media, bit dumps (including the formatting bits not
normally saved in data dumps) can be a godsend: it may be possible to
reconstruct corrupted data streams by bit shifting.
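
To illustrate the bit-shifting idea only: if a stream has slipped by a
few bits, re-framing it at each of the eight possible alignments and
searching for a byte sequence you expect to be there can show where it
lines up again. This is a toy sketch on already-decoded data bits; the
names shift_left() and find_alignment() and the choice of marker are
hypothetical, and a real recovery would have to deal with clock bits
and the actual encoding.

```python
def shift_left(data, bits):
    """Re-frame a byte stream as if it started `bits` bits later.

    The result is one byte shorter for bits 1..7, because the trailing
    partial byte is dropped.
    """
    if not 0 <= bits < 8:
        raise ValueError("bits must be in 0..7")
    if bits == 0:
        return bytes(data)
    out = bytearray()
    for a, b in zip(data, data[1:]):
        out.append(((a << bits) | (b >> (8 - bits))) & 0xFF)
    return bytes(out)


def find_alignment(data, marker):
    """Try all eight alignments; list (bits, position) where marker appears."""
    hits = []
    for bits in range(8):
        pos = shift_left(data, bits).find(marker)
        if pos != -1:
            hits.append((bits, pos))
    return hits
```

If the marker turns up at exactly one alignment, the bytes after it can
be read at that shift and pasted into the reconstruction by hand.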
Randy