Interesting.
This reminds me very strongly of "Travels in Computerland" by Ben
R. Schneider. He went through this sort of struggle back around 1973,
trying to figure out how to convert several thousand pages of
reference book into a computer database. The eventual solution was to
type it and then scan the typed pages.
Nowadays the answer would be different, but it's curious that it
hasn't changed as much as I would expect.
paul
I think part of the problem may be that the problem itself isn't well
understood. I'm waiting for more details, but from what I've heard so
far, it sounds as if they view formats that are as little as 8 years
old as unreadable. I suspect part of the reason they don't think it
can be OCR'd is the page layouts.
Then again, there is a book that I provided the computer support
for. It started on a Kaypro PC (something like an IBM AT clone) and
finished on a P90 laptop. The book was written in WordPerfect (mostly
WordPerfect for Windows 5.2, which is the format it was finished in),
and by the time it was finished no one could handle the format. In
the end a 1200dpi printer had to be purchased: the book was printed
at 1200dpi and scanned, and the scans were used for the printing. The
big issue was the high degree of formatting (it is almost entirely in
columns, and somewhere over 1200 pages). And yes, I tried to get the
book shifted to a more modern program back when it could have been
done.
Zane
--
| Zane H. Healy | UNIX Systems Administrator |
| healyzh(a)aracnet.com (primary) | OpenVMS Enthusiast |
| | Classic Computer Collector |
+----------------------------------+----------------------------+
| Empire of the Petal Throne and Traveller Role Playing, |
| PDP-10 Emulation and Zane's Computer Museum. |
| http://www.aracnet.com/~healyzh/ |