>Larger than the RT-11v4 set (same blue binders), much smaller than an
>Orange Wall. I think there are two cartons of binders, so call it 12
>volumes of several hundred pages each.
OK, definitely bigger than I was thinking. Though when you say an Orange
Wall I guess you're talking VAX/VMS V4 not RT-11 V5.x (which unless I just
miscounted is 9 volumes).
I was thinking about these very docs this weekend and the other thread
about OCR and scan densities, etc. I'm wondering if it'd be worth it
to OCR something like this and, rather than storing it as flat text or
HTML, attempt to push the text back into RUNOFF. Does anyone have a
RUNOFF clone in perl? I presume that excepting strange dependence on
odd, undocumented behavior (i.e., plain, by-the-book usage), it
wouldn't be difficult to make a state-based RUNOFF engine in perl.
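
For illustration only, here is a rough sketch of what a "state-based"
engine of that kind might look like. It is written in Python rather than
perl, and it is not a RUNOFF clone: the function name runoff() is
hypothetical, and the handful of dot commands it handles (.LM, .RM, .F,
.NF, .B, .BR, .C) are a simplified reading of the usual RUNOFF
abbreviations, not a faithful implementation of any DEC release.

```python
import textwrap


def runoff(source, width=60):
    """Format text with a tiny, simplified subset of RUNOFF-style commands.

    Assumed (simplified) commands: .LM n, .RM n, .F, .NF, .B n, .BR, .C
    """
    # All formatter state lives here; commands only mutate this dict.
    state = {"fill": True, "left": 0, "right": width, "center": False}
    out = []        # finished output lines
    pending = []    # words collected in fill mode, not yet wrapped

    def flush():
        # Wrap the collected words between the current margins.
        if pending:
            indent = " " * state["left"]
            out.extend(textwrap.wrap(" ".join(pending), width=state["right"],
                                     initial_indent=indent,
                                     subsequent_indent=indent))
            pending.clear()

    for line in source.splitlines():
        if line.startswith("."):
            flush()  # in this sketch every command causes a break
            parts = line[1:].upper().split()
            if not parts:
                continue
            cmd, args = parts[0], parts[1:]
            count = int(args[0]) if args else 1
            if cmd == "LM":
                state["left"] = int(args[0]) if args else 0
            elif cmd == "RM":
                state["right"] = int(args[0]) if args else width
            elif cmd == "F":
                state["fill"] = True
            elif cmd == "NF":
                state["fill"] = False
            elif cmd == "B":
                out.extend([""] * count)
            elif cmd == "C":
                state["center"] = True  # applies to the next text line only
            # .BR needs nothing beyond the flush; unknown commands are ignored
        elif state["center"]:
            out.append(line.strip().center(state["right"]).rstrip())
            state["center"] = False
        elif state["fill"]:
            pending.extend(line.split())
        else:
            out.append(" " * state["left"] + line)

    flush()
    return "\n".join(out)
```

Feeding it a few lines of text preceded by ".LM 4" and ".RM 20" gives a
paragraph wrapped between columns 4 and 20. Going the other way (OCR
output back into RUNOFF source) is the hard part and is not attempted
here.
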
Something to consider. If you scan it in at 300-600dpi into Adobe Acrobat
files you produce a usable copy without any errors (unless you should miss
a page). OTOH, if you OCR it you're adding a LOT of work *and* adding the
large possibility of errors creeping in.
Zane
--
| Zane H. Healy | UNIX Systems Administrator |
| healyzh(a)aracnet.com (primary) | OpenVMS Enthusiast |
| | Classic Computer Collector |
+----------------------------------+----------------------------+
| Empire of the Petal Throne and Traveller Role Playing, |
| and Zane's Computer Museum. |
| http://www.aracnet.com/~healyzh/ |