On Tue, 2005-05-17 at 17:30 -0500, Randy McLaughlin wrote:
From: "Brian Wheeler" <bdwheele at indiana.edu>
Sent: Tuesday, May 17, 2005 5:06 PM
As coincidence would have it, I work at Indiana University's Digital
Library Program, and there was a lecture on archiving audio which hit
many of the same issues that have come up here. The conclusions that
they came up with for the project included:
* There's no such thing as eternal media: the data must be
transportable to the latest generation of storage
* Metadata should be bundled with the content
* Act like you get one chance to read the media :(
While this is a different context, the principle is basically the same.
I've got a pile of TK50 tapes I'm backing up using the SIMH tape format,
so this is relevant to that process as well.
I think the optimum format for doing this isn't a single file, but a
collection of files bundled into a single package. Someone mentioned
tar, I think, and zip would work just as well. The container could
contain these components:
* content metadata - info from the disk's label/sleeve, etc
* media metadata - the type of media this came from
* archivist metadata - who did it, methods used, notes, etc
* badblock information - which zero-filled blocks are actually bad
* content - a bytestream of the data
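
As a rough illustration of the package described above, here is a
minimal Python sketch that bundles the five components into one zip.
The member names (content-metadata.json, content.bin, etc.), the use of
JSON, and the sample field values are all assumptions made up for this
example; nothing in this thread fixes a layout.

```python
import io
import json
import zipfile

def build_package(content, content_meta, media_meta, archivist_meta, bad_blocks):
    """Bundle an image and its metadata into one zip, returned as bytes.

    All member names below are hypothetical, chosen only for this sketch.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content-metadata.json", json.dumps(content_meta, indent=2))
        zf.writestr("media-metadata.json", json.dumps(media_meta, indent=2))
        zf.writestr("archivist-metadata.json", json.dumps(archivist_meta, indent=2))
        # Block numbers that could not be read; they are zero-filled in content.
        zf.writestr("badblocks.json", json.dumps(sorted(bad_blocks)))
        # The raw bytestream, stored unmodified so emulators can use it as-is.
        zf.writestr("content.bin", content)
    return buf.getvalue()

def read_content(package):
    """Return only the raw bytestream from a package built above."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read("content.bin")

# Example: a four-block image in which block 2 was unreadable.
image = bytes(256 * 4)
package = build_package(
    image,
    {"label": "text copied from the disk label"},
    {"type": "1541"},
    {"archivist": "name here", "notes": "single read pass"},
    [2],
)
```

An emulator would only need read_content(); everything else in the
package is there for the collector or cataloger.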
I don't think there's any real need to document the physical properties
of the media for EVERY disk archived -- there should probably be a
repository of 'standard' media types (1541's different-sectors-per-track
info, FM vs MFM per track information, etc) plus overrides in the media
metadata (uses fat-tracks, is 40 track vs 35, etc).
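
The "standard type plus overrides" idea could look something like the
Python sketch below. The dictionary layout, the key names, and the
resolve_media() helper are assumptions for illustration only; the 1541
zone table is the usual 35-track layout, and the 40-track override
assumes the extra tracks keep 17 sectors each.

```python
# Hypothetical repository of standard media definitions.
STANDARD_MEDIA = {
    "1541": {
        "tracks": 35,
        "encoding": "GCR",
        # (first_track, last_track, sectors_per_track)
        "zones": [(1, 17, 21), (18, 24, 19), (25, 30, 18), (31, 35, 17)],
    },
}

def resolve_media(media_meta):
    """Merge a package's media metadata over the standard definition."""
    resolved = dict(STANDARD_MEDIA[media_meta["type"]])
    resolved.update(media_meta.get("overrides", {}))
    return resolved

def sectors_on_track(media, track):
    """Look up how many sectors a given track holds."""
    for first, last, sectors in media["zones"]:
        if first <= track <= last:
            return sectors
    raise ValueError(f"track {track} is outside the defined zones")

def total_sectors(media):
    """Count the sectors on the whole disk."""
    return sum(sectors_on_track(media, t) for t in range(1, media["tracks"] + 1))

# A plain disk needs nothing beyond the type name.
plain = resolve_media({"type": "1541"})
print(total_sectors(plain))  # 683

# A 40-track disk records only what differs from the standard.
extended = resolve_media({
    "type": "1541",
    "overrides": {
        "tracks": 40,
        "zones": [(1, 17, 21), (18, 24, 19), (25, 30, 18), (31, 40, 17)],
    },
})
print(total_sectors(extended))  # 768
```

The archivist only types the geometry when a disk departs from the
standard definition.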
Emulators could use the content part of the file as-is and collectors
would have enough information to recreate the original media. It would
also allow for cataloging fairly easily.
Brian
<snip>
I disagree on a few points:
Today we know what the 1541 structure is, but we need enough detail to
explain it to future users.
The differences between FM and MFM are not as simple as a binary
decision. RX02 is one example of mixed formatting; even with FM & MFM
each implementation can be fairly unique (hard vs. soft sectored,
sector size, flux density, etc).
I guess what I was getting at is there should be a library of standard
types which fully define the format. 1541's look the same 99% of the
time unless half-tracks, fat tracks, or another copy protection scheme
was used. So if there's a library that fully defines what a 1541 _is_,
there's no reason to have that exact definition copied for each disk
archived. Not that it really takes up that much space, but it does make
it more tedious -- do you want to enter the track/sector geometry for
every disk you copy?
Most of the exact details can be understood by using current knowledge
but maybe not 50 years from now when someone is trying to understand it.
True, but I suppose that's why we're discussing it now :)
One thing can be that, for a given format, part of the overall archive
should include technical details.
For example, a site like asimov should include technical information on
the Apple disk interface as well as an explanation of how the dsk images
are created and restored. It isn't necessary to include the details with
every dsk file, only within the general archive.
Agreed. If there's an archive with the 'bundling software' which
handles the metadata, there should be a library of definitions there
that can be copied to each user's archive as needed.
Brian