Jules Richardson wrote:
On Wed, 2004-08-11 at 13:13, Steve Thatcher wrote:
I would encode binary data as hex to keep everything ASCII. Data size
would expand, but the data would also be compressible, so things could
be kept in ZIP files or whatever format a person would want to use for
their archiving purposes.
"could be kept" in zip files, yes - but then that's no use in 50 years'
time if someone stumbles across a compressed file and has no idea how to
decompress it in order to read it and see what it is :-)
There's going to have to be _some_ assumption of continuity or this
project is hopeless. It's my opinion that short-term or long-term,
anyone who wants or needs access to the data** will have some historical
understanding of the computing environment contemporary to that data.
Provisions for changing technology and loss of continuity are good,
but we still need to draw the line at some point, especially concerning
"external" archival of the archived data, i.e. zipped archives and
storage media. To go to an extreme example, even printed ASCII on paper
or mylar isn't reliable. Paper may be just as archaic as hieroglyphics
when the data is wanted.
We have to depend on ongoing maintenance of these archives. If they
are not periodically migrated to current media, and if the
attached/embedded documentation is not augmented to account for social
and technical "loss of memory", future retrieval will be difficult, if
not impossible, no matter what we do now.
Doc
** Technically speaking. We can't provide for, say, a non-technical
attorney who wants recorded files as evidence. That example will end up
in the hands of someone like us.
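
For illustration only, here is a rough sketch of the trade-off in the quoted hex-encoding suggestion: hex keeps everything ASCII and doubles the size, and an optional compression pass wins much of that back. Nothing here comes from the thread itself. The helper names (hex_encode, hex_decode), the sample data, and the choice of Python's zlib as a stand-in for "ZIP" are all assumptions.

```python
import binascii
import zlib


def hex_encode(data: bytes) -> bytes:
    """Return the data as ASCII hex digits (hypothetical helper name)."""
    return binascii.hexlify(data)


def hex_decode(text: bytes) -> bytes:
    """Recover the original bytes from ASCII hex digits."""
    return binascii.unhexlify(text)


# Made-up sample: a repeating 256-byte pattern standing in for binary data.
sample = bytes(range(256)) * 16

encoded = hex_encode(sample)          # two hex digits per original byte
compressed = zlib.compress(encoded)   # optional "external" compression step

print("raw bytes:       ", len(sample))
print("hex-encoded:     ", len(encoded))
print("hex + compressed:", len(compressed))

# The round trip must be lossless at both layers.
assert zlib.decompress(compressed) == encoded
assert hex_decode(encoded) == sample
```

The hex layer stays readable with nothing more than a text viewer, while the compressed layer is only useful to someone who still knows the compression format, which is exactly the continuity assumption argued over above.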