Toby,
>One thing to consider is how the format deals with damage....
Good thought. I wonder if that integrity role could be delegated
to the container format rather than the payload element. The payload
doesn't get internally marked up with checksum blocks, but we rely on
LZW/LZW2/other as the guarantor of file integrity. The rest of the
scavenging is done by the metadata/descriptor element (like card 1 = byte
[xxx]-byte[yyy]). Any damage or ambiguity is noted in this external
metadata rather than the actual capture blob. The container ensures that
the contents are the same as when they were created. The
descriptor describes what was known at creation time. Workable?
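
A minimal sketch of that delegation, assuming a ZIP container and Python's
standard zipfile module. ZIP records a CRC-32 for every member, and it is
that checksum (not the compression method) that detects damage; testzip()
re-reads each member and names the first one that fails. The member names
and the helper first_damaged_member are illustrative only.

```python
import io
import zipfile

def first_damaged_member(archive_bytes):
    """Return the name of the first member failing its CRC-32, or None."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.testzip()

# Build a small archive in memory so the sketch is self-contained.
payload = b"CARD" * 64                     # stand-in for a raw capture
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("raw.bin", payload)
    zf.writestr("descriptor.xml", "<descriptor/>")
good = buf.getvalue()

# Simulate media damage: flip one byte inside the stored payload.
damaged = bytearray(good)
damaged[good.index(b"CARDCARD")] ^= 0xFF
damaged = bytes(damaged)

status_good = first_damaged_member(good)
status_damaged = first_damaged_member(damaged)
```

The payload itself carries no checksum blocks here; the container alone
reports that raw.bin no longer matches what was written.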
Regards,
Colin Eby
> Date: Thu, 1 Sep 2011 13:00:24 -0400
> From: Dave McGuire <mcguire at neurotica.com>
>
> On Sep 1, 2011, at 12:52 PM, Glen Slick <glen.slick at gmail.com> wrote:
>> I haven't noticed many "rabid Microsoft fanboys" on this list,
>
> Oh, there are definitely one or two. I'll be nice today, though. ;)
>
Hello? <wiping froth from chin...>
All -
Thought I'd de-cloak from lurk mode long enough to canvass opinion on
archival formats (again). I'm considering this from a digital media (not
paper to digital) perspective at the moment. Considerations for paper and software are
similar though. There are dozens of image formats for diskettes and disk
drives, just as there are for photographic and paper images. My thought was
to avoid the problem by using a non-format. Stay with me on this...
basically persist the data in as raw a format as possible, with an
externalised, self defining descriptor, all wrapped up in an open archive
format to form a single file. The concept being that the long term storage
format is just not the viewer format. You maintain a converter from the
storage format to a current viewer format, but you don't actually store the
data in the current viewer format. By current viewer, I mean PhotoShop,
Acrobat, etc. Here's the basic file layout:
xxxxx.zip :
raw.bin (a simple sequential byte copy)
descriptor.xml ( instructions on how to carve it into sectors, raster
lines etc.)
*Don't get hung up on *.zip... I know LZW is encumbered... this is just an
example.
The actual scan output is the raw.bin. It has no formatting data at all. The
descriptor tells you everything you recorded about the file at scan time
which helps you to interpret the data stream later. That could be raster
format, page markers, etc. This should work for a broad range of data for
an already digital data format.
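
As a rough sketch of that layout, assuming Python's standard zipfile and
xml.etree.ElementTree modules: the descriptor schema below (descriptor,
source, record, first, last) is invented for illustration and is not a
proposed standard, and build_archive is a hypothetical helper.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def build_archive(raw, records):
    """Pack a raw capture plus an external descriptor into one ZIP (bytes).

    records: list of (label, first_byte, last_byte), offsets inclusive.
    """
    root = ET.Element("descriptor")        # hypothetical schema
    ET.SubElement(root, "source", length=str(len(raw)))
    for label, first, last in records:
        ET.SubElement(root, "record", label=label,
                      first=str(first), last=str(last))
    xml_bytes = ET.tostring(root, encoding="utf-8")

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:  # default is stored, uncompressed
        zf.writestr("raw.bin", raw)        # the untouched byte copy
        zf.writestr("descriptor.xml", xml_bytes)
    return buf.getvalue()

raw = bytes(range(160))                    # stand-in capture data
archive = build_archive(raw, [("card 1", 0, 79), ("card 2", 80, 159)])
```

Nothing about the capture's structure is written into raw.bin; everything
known at scan time lives in descriptor.xml.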
The problem I have with this approach is where a scan doesn't result in
actual digital information. Yeah, ok it's all bytes on disk, but what I
mean is sampled data versus byte for byte representation. Paper tape and
punch cards are examples of data which is unambiguous in a digital sense,
because it contains a fixed encoding stream, with a self evident data word
boundary. A card is one character a line, tape one character per linear
unit. All you have to do is store the whole thing as a byte encoded stream,
and put the metadata in the descriptor (card 1, card 2, etc...).
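
To make the card case concrete, here is a minimal sketch of the "viewer"
side: carving the byte stream back into cards using only the descriptor.
The XML schema, the 80-bytes-per-card figure and the carve helper are all
assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: one record per card, inclusive byte offsets.
DESCRIPTOR = """<descriptor>
  <record label="card 1" first="0" last="79"/>
  <record label="card 2" first="80" last="159"/>
</descriptor>"""

def carve(raw, descriptor_xml):
    """Split raw into labelled slices as described by the descriptor."""
    root = ET.fromstring(descriptor_xml)
    pieces = {}
    for rec in root.iter("record"):
        first, last = int(rec.get("first")), int(rec.get("last"))
        if not 0 <= first <= last < len(raw):
            raise ValueError(f"{rec.get('label')}: range outside capture")
        pieces[rec.get("label")] = raw[first:last + 1]
    return pieces

raw = b"A" * 80 + b"B" * 80                # stand-in for two scanned cards
cards = carve(raw, DESCRIPTOR)
```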
Where this falls down is, for instance, an MFM diskette scanned with a
sampling board like a CatWesel. In this case the data returned is
basically analogous to sampled analog signals. It's clock ticks between
fluctuations. The fluctuations are binary. But their interpretation is
based on sampling rate and media rotation. You can sample at higher and
higher rates and get more sampling data, if you are working blind. This is
less of a problem if you know something about the subject. But otherwise
discretion has to be applied to interpret what you've captured. That
worries me. I like deterministic outcomes, especially in archival work.
You could force an 8 bit boundary on the resulting data, but things like
sector headers are sometimes deliberately encoded in fluctuation sequences
that don't conform to the rest of the data encoding. That throws your
interpretation of the data off, unless you already know what the sector
header formats look like. The only way I can think of to follow the
inside-out / no proprietary archival format approach is that you basically don't
transpose to bytes at all, but leave it as sampled fluctuations... not even
bits. As with other viewer approaches, you apply the transposition only
when you attempt to view the data, not when you store it for archival
purposes. Thoughts?
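
A small sketch of why the interpretation step is discretionary. It assumes
the capture is stored as a list of tick counts between fluctuations, and
that a viewer turns each interval into a run of empty cells ending in one
transition cell. The tick values and both ticks_per_cell guesses are made
up for the example; they are not taken from any real board or disk.

```python
def intervals_to_cells(intervals, ticks_per_cell):
    """Quantise tick intervals into cells: 1 = transition, 0 = none."""
    cells = []
    for ticks in intervals:
        n = max(1, round(ticks / ticks_per_cell))  # nearest whole cell count
        cells.extend([0] * (n - 1) + [1])
    return cells

stored = [28, 42, 56, 29, 41]              # what the archive would keep

# Two viewers with different assumptions about the cell length.
cells_a = intervals_to_cells(stored, ticks_per_cell=14)
cells_b = intervals_to_cells(stored, ticks_per_cell=28)
```

The stored intervals are identical in both cases, yet the two viewers
produce different cell streams. Keeping the intervals and deferring the
quantisation keeps the archive itself free of that guess.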
Third example... a scanned book becomes a byte stream of bit map data
stored in the raw.bin. The descriptor then encodes the page transitions and
raster format, plus discretionary metadata. I like that much better than
PDF, TIFF and JFIF. I realise these formats are unlikely to die out, but I
like the idea of a common archival format which is self documenting better.
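
For the book case, a sketch of how a viewer could find page boundaries
from nothing but the raster format in the descriptor. It assumes every
page has the same dimensions and that each raster line is padded to a
whole number of bytes; both are assumptions for the example, as are the
function names.

```python
def page_size_bytes(width_px, height_px, bits_per_pixel):
    """Bytes per page, with each raster line padded to a whole byte."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px

def split_pages(raw, width_px, height_px, bits_per_pixel):
    """Cut a raw bitmap stream into equally sized pages."""
    size = page_size_bytes(width_px, height_px, bits_per_pixel)
    if len(raw) % size:
        raise ValueError("capture is not a whole number of pages")
    return [raw[i:i + size] for i in range(0, len(raw), size)]

# Tiny stand-in: three 10 x 4 pixel pages at 1 bit per pixel.
raw = bytes(24)
pages = split_pages(raw, width_px=10, height_px=4, bits_per_pixel=1)
```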
I'd be interested in hearing people's (non-flame) comments.
Regards,
Colin Eby
> What I need now is to borrow or buy an M9058 to test with, and ideally a
> qbus extender so I can use an oscilloscope to work out which component is
> failing.
>
> Regards
>
> Rob
All the M9058 is getting from the box is power - you can easily remove it from
the chassis, supply power directly from an external supply (solder temporary
feed leads above the board fingers if you don't have the right connector). Makes
testing a lot easier.
Jack
All --
I got an email from someone in Landrum, SC (close to Greenville and
Greer) regarding an Intel MDS system with disks, manuals and the 8080 pod. I
don't have the room for it and the person doesn't really want to ship it, so
I thought I'd mention it here. If anyone's interested, contact me off-list
for the person's contact info.
Rich
--
Rich Cini
Collector of Classic Computers
Build Master and lead engineer, Altair32 Emulator
http://www.altair32.com
http://www.classiccmp.org/cini
>Message: 6
>Date: Tue, 30 Aug 2011 13:52:48 -0500
>From: Jules Richardson <jules.richardson99 at gmail.com>
>To: General Discussion: On-Topic and Off-Topic Posts
>        <cctalk at classiccmp.org>
>Subject: Re: mess.org
>Message-ID: <4E5D3180.4010903 at gmail.com>
>Content-Type: text/plain; charset=ISO-8859-15; format=flowed
>
>Sridhar Ayengar wrote:
>> Adrian Stoness wrote:
>>> and that means what?
>>
>> You see how my text is below yours?  That's the way it should be.  You
>> are putting your text above the text to which you are replying.
>
>Agreed... although ordering is a funny thing - someone should really tell
>US road maintainers so that they cease writing things like "ahead stop" on
>their roads ;-)
You're not the only one who thinks that way: http://xkcd.com/781/