It is unlikely that any current-day OCR will produce an error-free listing.
It is possible to train an AI to do this, but it requires specific training: it must be
trained on the specific machine code and on the same listing format. Any generic OCR will
make many errors if the text is hard to read.
The final product must include notes on the things it is not sure about, or it would be
useless. I recovered a listing for the 4004 processor that was printed on an ASR33 with
ruts on the platen. The right-hand quarter of the letters was missing at several locations
across the page. Letters such as F and P, as well as 0 and C, were often not printed well
enough to distinguish.
Luckily, F and P were often relatively easy to determine from context, but 0 and C were
often used in a hex number. Unlike the text on this page, the differences were not
always obvious. Getting to working code required noting which characters were
possibly one or the other. The only way to resolve most of these was by running a
simulation of the code. Almost all of the 0 vs. C cases turned out to be a 0, as these
were for initializing a pointer base number (judging by context of usage). In one case, it
was only through the simulation that I was able to determine that it was really CC and not 00.
Marking locations of uncertainty was essential to determine where to check the program
code context.
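
As a rough illustration of what marking uncertainty could look like, here is a minimal
Python sketch. It is not the tool I used; the confusion table and the function names
(CONFUSABLE, candidates, annotate, hex_candidates) are made up for the example. It assumes
the reader or OCR can flag which character positions are doubtful, then lists every
possible reading so each one can be checked against context or a simulation.

```python
from itertools import product

# Hypothetical confusion sets: glyphs that look alike when the right-hand
# part of the character did not print. Extend as the listing requires.
CONFUSABLE = {
    "0": "0C",
    "C": "0C",
    "F": "FP",
    "P": "FP",
}

HEX_DIGITS = set("0123456789ABCDEF")


def candidates(token, uncertain):
    """Return every reading of token, varying only the flagged positions."""
    choices = [
        CONFUSABLE.get(ch, ch) if i in uncertain else ch
        for i, ch in enumerate(token)
    ]
    return ["".join(p) for p in product(*choices)]


def hex_candidates(token, uncertain):
    """Same as candidates(), but keep only readings that are valid hex."""
    return [c for c in candidates(token, uncertain) if set(c) <= HEX_DIGITS]


def annotate(token, uncertain):
    """Render token with doubtful positions shown as [a|b] alternatives."""
    parts = []
    for i, ch in enumerate(token):
        opts = CONFUSABLE.get(ch, ch) if i in uncertain else ch
        parts.append(ch if len(opts) == 1 else "[" + "|".join(opts) + "]")
    return "".join(parts)


if __name__ == "__main__":
    # A two-digit hex constant where both characters are doubtful.
    print(annotate("00", {0, 1}))
    for reading in hex_candidates("00", {0, 1}):
        print("try in simulation:", reading)
```

The point of the annotated form is that the doubt stays visible in the recovered listing
until context or a simulation run settles it.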
Any OCR that doesn't include possible options and that isn't trained on that
particular code is worthless.
Dwight
________________________________
From: cctalk <cctalk-bounces at classiccmp.org> on behalf of Noel Chiappa via cctalk
<cctalk at classiccmp.org>
Sent: Sunday, January 23, 2022 9:31 AM
To: cctalk at classiccmp.org <cctalk at classiccmp.org>
Cc: jnc at mercury.lcs.mit.edu <jnc at mercury.lcs.mit.edu>
Subject: Re: Typing in lost code
From: Gavin Scott
I think if I had a whole lot of old faded greenbar
etc. ... Someone may
even have done this already
See:
https://walden-family.com/impcode/imp-code.pdf
Someone's already done the specialist OCR to deal with faded program listings.
Noel