Rich,
>> You could force an 8 bit boundary on the resulting data, but things
>> like sector headers are sometimes deliberately encoded in
>> fluctuation sequences that don't conform to the rest of the data encoding.
> That's hardly deterministic, and would certainly not work on a disk
> written by a PDP-10 (36-bit words represented as pairs of 18 bits +
> parity), to take a popular example. There *are* no deterministic
> outcomes, especially in archival work. There is only interpretation.
Precisely, yes. But even with byte encoding of bitstreams you have an
endianness problem. At some point the capture system imposes its
personality on the process, and you simply have to document what you've
done so the upstream ('viewer'/'accessor') toolset can take it into
account in post-processing. Adding metadata context is what contributes
to a deterministic outcome, rather than attempting to force the raw
capture into a rigid format.
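
To make that concrete, here is a minimal sketch (in Python, purely
illustrative) of the kind of sidecar record I mean. None of the field
names or tool names below come from any existing format; they are just
placeholders for "document what the capture system imposed":

# Hypothetical sketch: write a sidecar metadata file alongside a raw
# capture so the accessor toolset knows how to interpret it later.
# All field names and example values are illustrative, not a standard.
import json
import hashlib
from pathlib import Path

def write_capture_sidecar(capture_path, tool, bit_order, byte_order, notes):
    """Write <capture>.meta.json describing how the raw capture was made."""
    data = Path(capture_path).read_bytes()
    sidecar = {
        "capture_file": Path(capture_path).name,
        "sha256": hashlib.sha256(data).hexdigest(),  # fixity of the raw capture
        "capture_tool": tool,        # whatever rig/software actually did the read
        "bit_order": bit_order,      # e.g. "msb-first" or "lsb-first" as captured
        "byte_order": byte_order,    # how multi-byte values were packed, if at all
        "notes": notes,              # free-text interpretation hints for accessors
    }
    out = Path(str(capture_path) + ".meta.json")
    out.write_text(json.dumps(sidecar, indent=2))
    return out

# Example use (paths and values made up):
# write_capture_sidecar("disk017.raw", "flux-capture-rig", "msb-first",
#                       "big-endian",
#                       "PDP-10 pack: 36-bit words as 2x18 bits + parity")
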
> This is Brian Zuzga's 1995 undergraduate thesis at MIT on a project to
> archive the backups done at the MIT AI Lab using what they named "the
> Time Capsule File System". Nihil novi sub sole ("nothing new under the
> sun", Ecc. 1:9-10).
Thanks for the reference. I must have come across it before at some point,
as it seems very familiar, but it wasn't yet a work I was referencing in
my preparation. I'm studying it closely, along with things like the UPF.

What strikes me is that there are several originating documents around
each effort, followed by long silences and a lack of back references. It
may be down to the relatively cursory nature of my work so far, but it
seems adoption of these formats and processes has been either quiet or
limited. Is that a misapprehension on my part? Are there any other efforts
you think I should investigate?
-- Colin