It looks to me like the format that offers the most hope of serving users
of formats other than PDF is the TIFF output from the scanner. It can be
compressed with a commonly available tool like pkzip, and it's already
prepared in the process of getting from a paper document to PDF. Each page
will be more or less as it was scanned, though there may be some noise
specks. No extra effort is involved aside from naming and storing each
one, then transferring the whole mess to the web site host.
If Tony wants to OCR it, he can; if SAM wants to skip the schematics, he
can; and pretty much anyone else who wants to do anything else can do that
too. A typical scanned page in raw bitmap form is about 1 MB, and I don't
know how much compression can be squeezed out of a page of text or a page
of line art, so server capacity may become an issue. However, it may be
realistic to put the TIFF files on CD, assuming there's a CD drive
available on the web host, and then the host can deal with the TIFF files.
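
For a rough sense of where the 1 MB figure comes from, here is a small
sketch in Python. It assumes a letter-size page (8.5 x 11 inches) scanned at
300 dpi and 1 bit per pixel; none of those settings are stated above, so
treat them as my guesses. The "page" it compresses is a made-up stand-in
(white with a black rule every 50 rows), not a real scan, so the ratio it
prints is far better than real text or line art would get. zlib uses the
Deflate method, which is also what pkzip normally uses.

```python
import zlib

def page_dims(width_in, height_in, dpi):
    """Return (bytes per row, rows) for a 1-bit-per-pixel page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px + 7) // 8   # pad each row to a whole byte
    return row_bytes, height_px

def raw_bilevel_bytes(width_in, height_in, dpi):
    """Size in bytes of the uncompressed bilevel bitmap."""
    row_bytes, rows = page_dims(width_in, height_in, dpi)
    return row_bytes * rows

def synthetic_page(width_in, height_in, dpi, ruled_every=50):
    """Hypothetical test page: all white, with one black row every
    `ruled_every` rows. A stand-in only; real scans are much noisier."""
    row_bytes, rows = page_dims(width_in, height_in, dpi)
    white = bytes(row_bytes)          # all bits 0
    black = b"\xff" * row_bytes       # all bits 1
    return b"".join(black if y % ruled_every == 0 else white
                    for y in range(rows))

# Assumed scan settings: letter size, 300 dpi, 1 bit per pixel.
raw = raw_bilevel_bytes(8.5, 11, 300)
page = synthetic_page(8.5, 11, 300)
packed = zlib.compress(page, 9)       # Deflate at maximum effort

print("raw bytes:     ", raw)
print("deflated bytes:", len(packed))
```

Under those assumptions the raw size works out to 319 bytes per row times
3300 rows, or 1,052,700 bytes, which matches the "about 1 MB" estimate. How
well a real page of text or schematics compresses would have to be measured
on actual scans.
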
DICK
-----Original Message-----
From: CLASSICCMP(a)trailing-edge.com <CLASSICCMP(a)trailing-edge.com>
To: Discussion re-collecting of classic computers
<classiccmp(a)u.washington.edu>
Date: Tuesday, June 08, 1999 4:50 PM
Subject: Re: Disk Drive Documents
But, and this applies to text as well, once you've printed them out and
scanned them back in again, you've lost that structure. The scanner
produces a bitmap (probably a slightly distorted bitmap, as nothing is
perfect). It is _very_ difficult to automatically recover that structure
(would you like to write a program that analyses a bitmap and finds
component symbols in it?)
Such programs do exist, however. Most of the fancy electronics-oriented
CAD packages have "schematic capture" modules. And, of course, such
packages run into the many kilobucks.
--
Tim Shoppa Email: shoppa(a)trailing-edge.com
Trailing Edge Technology WWW: http://www.trailing-edge.com/
7328 Bradley Blvd Voice: 301-767-5917
Bethesda, MD, USA 20817 Fax: 301-767-5927