From: "Brian Wheeler" <bdwheele at indiana.edu>
Sent: Tuesday, May 17, 2005 5:06 PM
As coincidence would have it, I work at Indiana University's Digital
Library Program and there was a lecture on archiving audio which hits
many of the same issues that have come up here. The conclusions that
they came up with for the project included:
* There's no such thing as eternal media: the data must be
transportable to the latest generation of storage
* Metadata should be bundled with the content
* Act like you get one chance to read the media :(
While this is a different context, the principle is basically the same.
I've got a pile of TK50 tapes I'm backing up using the SIMH tape format,
so this is relevant to that process as well.
I think the optimum format for doing this isn't a single file, but a
collection of files bundled into a single package. Someone mentioned
tar, I think, and zip would work just as well. The container could
contain these components:
* content metadata - info from the disk's label/sleeve, etc
* media metadata - the type of media this came from
* archivist metadata - who did it, methods used, notes, etc
* badblock information - 0 blocks which are actually bad.
* content - a bytestream of the data
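
As a rough sketch of the package described above (not a proposed standard), the five components could be written into one zip file. The member names, the use of JSON for the metadata, and the function names here are all assumptions made for illustration; tar would work equally well.

```python
import io
import json
import zipfile


def build_package(content, content_meta, media_meta, archivist_meta, bad_blocks):
    """Bundle one media image and its metadata into a single zip package.

    Member names below are hypothetical, chosen only for this sketch.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Info from the disk's label/sleeve, etc.
        zf.writestr("content-metadata.json", json.dumps(content_meta, indent=2))
        # The type of media this came from.
        zf.writestr("media-metadata.json", json.dumps(media_meta, indent=2))
        # Who did it, methods used, notes.
        zf.writestr("archivist-metadata.json", json.dumps(archivist_meta, indent=2))
        # Block numbers that could not be read reliably.
        zf.writestr("badblocks.json", json.dumps(sorted(bad_blocks)))
        # The raw bytestream, stored unmodified so emulators can use it as-is.
        zf.writestr("content.bin", content)
    return buf.getvalue()


def read_content(package):
    """Return only the raw bytestream from a package built above."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read("content.bin")


if __name__ == "__main__":
    pkg = build_package(
        content=b"\x00" * 512,
        content_meta={"label": "example label text"},
        media_meta={"media_type": "example-media-type"},
        archivist_meta={"archivist": "example name", "notes": "single read pass"},
        bad_blocks=[7, 3],
    )
    print(len(read_content(pkg)))
```

Keeping the content as its own member means a reader that only wants the bytestream never has to parse the metadata.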
I don't think there's any real need to document the physical properties
of the media for EVERY disk archived -- there should probably be a
repository of 'standard' media types (1541's different-sectors-per-track
info, FM vs MFM per track information, etc) plus overrides in the media
metadata (uses fat-tracks, is 40 track vs 35, etc).
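
One way to picture the "standard type plus overrides" idea is a lookup table of shared descriptions, with each archive's media metadata carrying only the differences. This is a minimal sketch; the key names, the `cbm-1541` identifier, and the `resolve_media` function are hypothetical, and the 1541 entry is the commonly documented geometry, included only as an example.

```python
import copy

# Hypothetical shared repository of 'standard' media descriptions.
STANDARD_MEDIA = {
    "cbm-1541": {
        "tracks": 35,
        "encoding": "GCR",
        # (first_track, last_track, sectors_per_track) for each speed zone.
        "zones": [(1, 17, 21), (18, 24, 19), (25, 30, 18), (31, 35, 17)],
    },
}


def resolve_media(media_meta, repository=STANDARD_MEDIA):
    """Combine a standard media description with per-disk overrides.

    media_meta is assumed to look like:
        {"standard": "cbm-1541", "overrides": {"tracks": 40}}
    """
    # Deep copy so the shared repository entry is never modified.
    resolved = copy.deepcopy(repository[media_meta["standard"]])
    resolved.update(media_meta.get("overrides", {}))
    return resolved


def total_sectors(description):
    """Count sectors implied by the zone table of a description."""
    return sum((last - first + 1) * sectors
               for first, last, sectors in description["zones"])


if __name__ == "__main__":
    disk = {"standard": "cbm-1541", "overrides": {"tracks": 40}}
    print(resolve_media(disk)["tracks"])
    print(total_sectors(STANDARD_MEDIA["cbm-1541"]))
```

Note that this sketch only replaces top-level keys; a 40-track override would also need its own zone table for the extra tracks, which is left out here.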
Emulators could use the content part of the file as-is and collectors
would have enough information to recreate the original media. It would
also allow for cataloging fairly easily.
Brian
<snip>
I disagree on a few points:
Today we know what the 1541 structure is, but we need enough detail to
explain it to future users.
The differences between FM and MFM are not as simple as a binary decision.
RX02 is one example of mixed formatting; even within FM & MFM, each
implementation can be fairly unique (hard vs. soft sectored, sector size,
flux density, etc).
Most of the exact details can be understood by using current knowledge but
maybe not 50 years from now when someone is trying to understand it.
For a given format, part of the overall archive should include technical
details. For example, a site like asimov should include technical
information on the Apple disk interface as well as an explanation of how
the dsk images are created and restored. It isn't necessary to include
the details with every dsk file, but they should be within the general
archive.
In the 1940s, how many people were able to build a radio receiver with what
they found on the battlefield (razor blade, safety pin, knife, wire,
headset)? How many could today? The same applies to computer technology.
Randy
www.s100-manuals.com