PDP-11 Diagnostic Program Listings Micro Fiche Scans

Paul Koning paulkoning at comcast.net
Sat Nov 20 12:38:18 CST 2021



> On Nov 20, 2021, at 11:15 AM, Jon Elson via cctalk <cctalk at classiccmp.org> wrote:
> 
> On 11/20/21 1:30 AM, Joerg Hoppe via cctalk wrote:
>> Hi Friends,
>> 
>> Micro fiche scans of the PDP-11 XXDP listings are online now:
> 
> Wow, took a quick look.  The scans are likely not good enough to run through an OCR program, but certainly good enough to read through when trying to understand what a program is doing.

I only tried tesseract once, years ago, and it wasn't useful at all for the particular material I gave it.  Quite possibly it's better now.

Instead, I ended up buying a commercial OCR program, FineReader from ABBYY, which has served me well.  It handled CDC 6600 wire list scans without trouble.  I also tried it on the THE source listings in the Knuth archive; those proved hopeless for OCR, partly because of the overprinting convention they use, and had to be entered manually.

So... it might be worth feeding some of those images to a current commercial OCR program.  FineReader has a "learn" capability that does a decent job of adapting it to the peculiarities of a particular piece of source material.

	paul



