To clarify my comment about using zip: I suggested it as a way for a person to reduce
the size of their archive for their own purposes. I was not proposing that data be
compressed with zip for the actual archive file.
The only assumption I made for data continuity was that the data needed to be ASCII and to
include some error detection, so that the person accessing the data can judge its
validity.
Keeping an image file in an error-checked ASCII file in a purely sequential form seems to
me not to rely on any technology or special information beyond the ability to read a plain
data file on some type of computer system. As someone else suggested, ASCII text could even
be included in the archive files to explain how to get the data back. My thought is that
having both the data and the formatting information accessible would allow a group of
files to be sent in a single transmission, with enough information contained within to
either extract just the data or actually create media copies.
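As a rough illustration of the error-checked, sequential ASCII idea, here is a minimal
Python sketch. The line layout (offset, hex data, CRC-32 of that line's data), the 16-byte
line width, and the function names are all assumptions made for illustration; none of
them is part of any agreed format.

```python
import zlib


def encode_lines(data, width=16):
    """Turn raw bytes into ASCII lines of the form 'OFFSET HEXDATA CRC32'.

    The layout is hypothetical: an 8-digit hex offset, the chunk as hex,
    and the CRC-32 of that chunk so each line can be checked on its own.
    """
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        crc = zlib.crc32(chunk) & 0xFFFFFFFF
        lines.append(f"{offset:08X} {chunk.hex().upper()} {crc:08X}")
    return lines


def decode_lines(lines):
    """Rebuild the raw bytes, raising ValueError on a gap or a bad checksum."""
    out = bytearray()
    for number, line in enumerate(lines, start=1):
        offset_text, hex_text, crc_text = line.split()
        chunk = bytes.fromhex(hex_text)
        # The data is purely sequential, so each offset must match what we have so far.
        if int(offset_text, 16) != len(out):
            raise ValueError(f"line {number}: unexpected offset {offset_text}")
        if (zlib.crc32(chunk) & 0xFFFFFFFF) != int(crc_text, 16):
            raise ValueError(f"line {number}: checksum mismatch")
        out += chunk
    return bytes(out)
```

Encoding a disk image this way roughly doubles its size (plus the offset and checksum
columns), but every line can be read and verified independently with nothing more than a
text viewer and a description of the layout.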
best regards, Steve Thatcher
-----Original Message-----
From: Doc Shipley <doc(a)mdrconsult.com>
Sent: Aug 11, 2004 10:20 AM
To: General Discussion: On-Topic and Off-Topic Posts <cctalk(a)classiccmp.org>
Subject: Re: Let's develop an open-source media archive standard
Jules Richardson wrote:
On Wed, 2004-08-11 at 13:13, Steve Thatcher wrote:
I would encode binary data as hex to keep
everything ascii. Data size would expand,
but the data would also be compressible, so it could be kept in ZIP files of
whatever kind a person wants for their own archiving purposes.
"could be kept" in zip files, yes - but then that's no use in 50 years'
time if someone stumbles across a compressed file and has no idea how to
decompress it in order to read it and see what it is :-)
There's going to have to be _some_ assumption of continuity or this
project is hopeless. It's my opinion that short-term or long-term,
anyone who wants or needs access to the data** will have some historical
understanding of the computing environment contemporary to that data.
Provisions for changing technology and loss of continuity are good,
but we still need to draw the line at some point, especially concerning
"external" archival of the archived data, i.e. zipped archives and
storage media. To go to an extreme example, even printed ASCII on paper
or mylar isn't reliable. Paper may be just as archaic as hieroglyphics
when the data is wanted.
We have to depend on ongoing maintenance of these archives. If they
are not periodically migrated to current media, and if the
attached/embedded documentation is not augmented to account for social
and technical "loss of memory", future retrieval will be difficult, if
not impossible, no matter what we do now.
Doc
** Technically speaking. We can't provide for, say, a non-technical
attorney who wants recorded files as evidence. That example will end up
in the hands of someone like us.