On Wed, 11 Aug 2004, Jules Richardson wrote:
Maybe zip's not the ideal example. My point really is that if the
archives are enormous, people are going to be tempted to compress them.
If they compress them, what guarantee is there that a) the compression
method is going to be around when someone totally unrelated wants to
handle these files in x years, and b) is it going to be obvious to
someone in x years what compression method was even used to compress the
file?
Yep. Good points. Primarily, the human psychology aspect: large file
sizes will compel people to want to compress the images, quite possibly
ruining the effort of making the images to begin with. The spec should
be designed such that it allows for the smallest filesize possible.
Again, it's back to longevity of the archives themselves. If something's
needed for the short term (next ten years, say), it's not a problem. But
it'd be nice if a future generation, upon discovering one of these
archives, could know exactly what it was (and stand a good chance of
decoding it) just by looking at it (hence the human-readable part)
Right. This is the whole reason for designing a spec like this.
Again, I don't like the idea of anything happening to the archive files
after creation though. I suppose the data from the raw device (floppy,
hard disk, whatever) within the archive could be encoded somehow
(leaving the config section as plain-text) - providing it's in a common
enough format that we think someone will be able to find the spec for
the encoding method in x years and so be able to get at the data. That's
somewhat hard to say for sure though!
Or at least be able to figure it out. Encoding data in a wider base, such
as in hex or Base64, still allows a smart human to figure it out. If we
add meta compression, this will also need to be readily decodable by a
smart human.
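
To make the trade-off concrete, here is a rough sketch in Python of the same bytes written out as base16 and as base64. The sample bytes are made up for illustration and are not from any real image. Hex maps one byte to two digits, so it can be decoded by hand; Base64 packs three bytes into four characters, so it is denser but less obvious at a glance.

```python
import base64

# Hypothetical sample: a few bytes as they might appear at the start of a sector.
sector = bytes([0xEB, 0x3C, 0x90, 0x4D, 0x53, 0x44, 0x4F, 0x53])

def to_base16(data: bytes) -> str:
    """Each byte becomes two hex digits (0-9, A-F)."""
    return base64.b16encode(data).decode("ascii")

def to_base64(data: bytes) -> str:
    """Every 3 bytes become 4 printable characters, padded with '='."""
    return base64.b64encode(data).decode("ascii")

hex_text = to_base16(sector)
b64_text = to_base64(sector)

print(hex_text)  # 16 characters for 8 bytes
print(b64_text)  # 12 characters for 8 bytes

# Both forms decode back to the original bytes.
assert base64.b16decode(hex_text) == sector
assert base64.b64decode(b64_text) == sector
```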
Again, producing paper copies of stuff with non-printable characters
becomes "problematic".
That's actually an extremely good point, and perhaps the best argument
(IMHO) for not using binary data so far :-) Hmmm...
Agreed.
Seriously, if there's a good argument for having CRCs in more than x
(50?) percent of cases because corrupted data is expected to be a real
possibility, then make them mandatory. If not, then make them an
optional extra. I certainly can't see a good reason why they'll *never*
be needed, that's for sure.
I'd make them an optional extra, with the default assuming no CRCs. In
fact, the spec should be designed in such a way that as little as possible
is assumed. Any encoding features should have to be explicitly invoked in
the image header.
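
As a sketch of what "explicitly invoked" could look like, the Python below reads a plain-text header and checks a CRC only when the header declares one. The field names (`format`, `checksum`, `checksum-value`) and the choice of CRC-32 are assumptions made up for this example; nothing like them has been agreed for the spec.

```python
import zlib

def parse_header(text: str) -> dict:
    """Parse 'key: value' lines; only what is written ends up in the result."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields

def verify(header: dict, data: bytes) -> bool:
    """Check a CRC only when the header explicitly asks for one."""
    if "checksum" not in header:
        return True  # default: no checksum declared, nothing to verify
    if header["checksum"] != "crc32":
        raise ValueError(f"unknown checksum type: {header['checksum']!r}")
    expected = int(header["checksum-value"], 16)
    return zlib.crc32(data) == expected

data = b"example sector contents"

# Header with no optional features invoked.
plain = parse_header("format: example-image\n")

# Header that explicitly turns on a CRC-32 check (hypothetical field names).
checked = parse_header(
    "format: example-image\n"
    "checksum: crc32\n"
    f"checksum-value: {zlib.crc32(data):08x}\n"
)

print(verify(plain, data))            # nothing declared, so nothing is checked
print(verify(checked, data))          # declared CRC matches the data
print(verify(checked, data + b"x"))   # declared CRC does not match altered data
```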
With you on the longevity side of things. Hmm, off the wall suggestion,
but it's only the storage format for the raw data that's an issue,
right? So does it make sense to define both binary and ASCII
representation as valid storage formats, and the format in use within a
particular archive is recorded as a parameter within the human-readable
config section?
I still don't like it. As Roger M. pointed out, what will the binary data
look like after it's been paraded through several different platforms?
In this way those wanting compact archives to save space, run against
various existing utilities etc. can have them containing binary data;
those who think they need ASCII representation of the data due to tool
or transmission medium limitations can use that format - all whilst
maintaining compatibility with the spec. (potentially the 'encoding
method' parameter could include other defined types - uuencode, base64
etc. but let's not get ahead of ourselves...)
I'd rather give the option of being able to specify which text-based
encoding scheme was used (i.e. base16, base64, etc.)
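
Something along these lines, as a minimal Python sketch: the header names the text encoding, and a reader picks the matching decoder or refuses if the name is unknown. The encoding names `base16` and `base64` as header values are assumptions for illustration only.

```python
import base64

# Hypothetical names a header might use for its text encoding.
DECODERS = {
    "base16": base64.b16decode,
    "base64": base64.b64decode,
}

def decode_payload(encoding: str, text: str) -> bytes:
    """Decode the data section using the encoding named in the header."""
    try:
        decoder = DECODERS[encoding.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown encoding: {encoding!r}") from None
    # Drop line breaks and spaces so wrapped text still decodes.
    return decoder("".join(text.split()))

from_hex = decode_payload("base16", "48454C\n4C4F")
from_b64 = decode_payload("base64", "SEVMTE8=")
print(from_hex, from_b64)
```

Both calls recover the same five bytes, and an encoding name that is not in the table raises an error rather than being guessed at.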
(funny how someone mentioned IFF files earlier; I keep on thinking of
TIFF images where the data's structured and the format both versioned
and maintained under strict control)
So it should be with this specification.
--
Sellam Ismail Vintage Computer Festival
------------------------------------------------------------------------------
International Man of Intrigue and Danger                http://www.vintage.org
[ Old computing resources for business || Buy/Sell/Trade Vintage Computers ]
[ and academia at www.VintageTech.com || at http://marketplace.vintage.org ]