I haven't read the entire thread on this, but I did read Steve Thatcher's
idea, and it describes about where I was coming down on this myself.
I might have missed what the ultimate use of this archive would be. Will the
archive be used to (1) re-generate original media; (2) operate with
emulators; or (3) both?
To ensure integrity of the data I would propose recording the data in the
Intel Hex format -- it's text-based and every record carries a built-in
checksum (a simple one-byte sum rather than a true CRC). Now, we'd have to
modify the standard format a bit to accommodate a larger address space and
to add some sort of standardized header (a "Hardware Descriptor"). The
de-archiver would use this header to interpret the stream of data read from
the data area (the "Hex Block").
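
To make the integrity point concrete, here is a minimal Python sketch of how
a standard Intel Hex record is built and checked. The function names are my
own, not part of any existing tool. Note that standard Intel Hex already
defines an extended linear address record (type 04) that supplies the upper
16 bits of a 32-bit address, so the "larger address space" change may only
matter beyond 4 GB; the "Hardware Descriptor" header is not covered here.

```python
def ihex_record(address: int, rectype: int, data: bytes) -> str:
    """Build one Intel Hex record (hypothetical helper name).

    Layout: ':' + byte count + 16-bit address + record type + data + checksum.
    """
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address field is only 16 bits wide")
    if len(data) > 0xFF:
        raise ValueError("a record holds at most 255 data bytes")
    body = bytes([len(data), address >> 8, address & 0xFF, rectype]) + data
    # Checksum: two's complement of the low byte of the sum of all body bytes.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


def verify_record(line: str) -> bool:
    """Return True if the record's bytes, checksum included, sum to 0 mod 256."""
    if not line.startswith(":"):
        return False
    try:
        raw = bytes.fromhex(line[1:])
    except ValueError:
        return False
    return len(raw) >= 5 and len(raw) == raw[0] + 5 and sum(raw) & 0xFF == 0


if __name__ == "__main__":
    rec = ihex_record(0x0100, 0x00, b"ABC")
    print(rec, verify_record(rec))
```

A one-byte sum catches most single-byte corruption but not reordered bytes,
so a stronger check (a real CRC or a hash over the whole Hex Block) could go
in the header if we want one.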
I agree that a multi-layer approach offers the best combination of platform
neutrality and portability. I don't really know whether we need the two or
three layers Steve described to represent the file in a standard fashion.
Using an Intel Hex-like format would increase the "de-archiving" time, but in my
view it's a fair trade-off. De-archiving software could translate the
platform-neutral file into another format better suited for use in
emulators.
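
As a rough illustration of that translation step, the sketch below turns
standard Intel Hex records into a flat binary image of the kind many
emulators load directly. It assumes unmodified Intel Hex (record types 00,
01 and 04 only) and ignores the proposed header; `hex_to_image` and the
`fill` parameter are names I made up for the example.

```python
def hex_to_image(lines, fill=0x00):
    """Decode Intel Hex records into a flat bytes image (hypothetical helper).

    Gaps between records are padded with `fill`. The whole image is held in
    memory, so very sparse or very high addresses would need another approach.
    """
    image = bytearray()
    base = 0  # upper address bits, set by type-04 records
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("record does not start with ':'")
        raw = bytes.fromhex(line[1:])
        if len(raw) < 5 or len(raw) != raw[0] + 5:
            raise ValueError("record length does not match byte count")
        if sum(raw) & 0xFF:
            raise ValueError("checksum mismatch")
        count, offset, rectype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        data = raw[4:4 + count]
        if rectype == 0x00:    # data record
            start = base + offset
            end = start + count
            if end > len(image):
                image.extend(bytes([fill]) * (end - len(image)))
            image[start:end] = data
        elif rectype == 0x04:  # extended linear address: upper 16 bits
            base = int.from_bytes(data, "big") << 16
        elif rectype == 0x01:  # end of file
            break
        else:
            raise ValueError("unsupported record type %02X" % rectype)
    return bytes(image)


if __name__ == "__main__":
    records = [":0300000041424337", ":02000400444571", ":00000001FF"]
    print(hex_to_image(records))
```

The extra decoding pass is the "de-archiving" cost mentioned above; it is
paid once, after which the emulator works from the flat image.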
I think that we should start compiling a list of the various media we want
represented and how each one is organized natively. I don't mean "well, it
has blocks and sectors" either. We should examine the exact format down to
the actual numbers (e.g., "2048 blocks of 256 bytes recorded twice"). Seeing
how the various data stores are organized should bring some clarity to how
we should represent them.
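
For what one entry in such a list might look like, here is a small Python
sketch that records the "2048 blocks of 256 bytes recorded twice" example as
data. Everything in it is an assumption for discussion: the class name, the
field names, the "example-media" label and the header line syntax are all
invented, not a proposed standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaGeometry:
    """Native layout of one medium (hypothetical field names)."""
    name: str
    blocks: int
    block_size: int   # bytes per block
    copies: int = 1   # how many times each block is recorded

    @property
    def user_bytes(self) -> int:
        # Unique data, counting each block once.
        return self.blocks * self.block_size

    @property
    def raw_bytes(self) -> int:
        # Everything physically on the medium, repeats included.
        return self.user_bytes * self.copies

    def header_line(self) -> str:
        # Invented text form for a "Hardware Descriptor" entry.
        return "MEDIA=%s;BLOCKS=%d;BLOCKSIZE=%d;COPIES=%d" % (
            self.name, self.blocks, self.block_size, self.copies)


if __name__ == "__main__":
    g = MediaGeometry("example-media", blocks=2048, block_size=256, copies=2)
    print(g.header_line())
    print(g.user_bytes, g.raw_bytes)  # 524288 unique bytes, 1048576 on the medium
```

Writing the numbers down this way forces the question of whether the archive
stores the repeated copy or only notes that it existed.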
Just my $0.02.
Rich