PDP-11 Diagnostic Program Listings Micro Fiche Scans
Paul Koning
paulkoning at comcast.net
Sat Nov 20 12:38:18 CST 2021
> On Nov 20, 2021, at 11:15 AM, Jon Elson via cctalk <cctalk at classiccmp.org> wrote:
>
> On 11/20/21 1:30 AM, Joerg Hoppe via cctalk wrote:
>> Hi Friends,
>>
>> Micro fiche scans of the PDP-11 XXDP listings are online now:
>
> Wow, took a quick look. The scans are likely not good enough to run through an OCR program, but certainly good enough to read through when trying to understand what a program is doing.
I only tried tesseract once, years ago, and it wasn't useful at all for the particular material I gave it. Quite possibly it's better now.
Instead, I ended up buying a commercial OCR program, FineReader from ABBYY, which has served me well. It did a good job on CDC 6600 wire list scans. I also tried it on the THE source listings in the Knuth archive; those are hopeless for OCR, partly because of the overprinting convention used, and required manual entry.
So... it might be worth feeding some of those images to current commercial OCR programs. FineReader has a "learn" capability that does a decent job of adapting it to the peculiarities of a particular piece of source material.
paul