I recently dealt with this with the DaJen SCI monitor listing out of the manual. The copy
is pretty bad, and either their printer was having issues, or slashing of "zero"
vs "O" was inconsistent somehow. OCRing it produced more of a mess than just
sitting with the original and a text editor open side-by-side.
I can't imagine it would've worked out well to have someone who wasn't
familiar with 8080 assembly language transcribe it. I had a rough enough time on my own,
and ended up having to compare the assembler output to a known-good ROM dump to get the
last of the discrepancies out.
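For anyone wanting to do the same kind of check, here is a minimal sketch in Python of a byte-by-byte comparison. It assumes both the assembler output and the ROM dump are available as raw binary images; the function name and the file names mentioned afterwards are made up for illustration, and this is not the exact tooling used for the DaJen listing.

```python
def diff_bytes(built: bytes, reference: bytes):
    """Return (offset, built_byte, reference_byte) for every mismatch.

    None stands in for a byte that is missing because one image is
    shorter than the other.
    """
    mismatches = []
    for offset in range(max(len(built), len(reference))):
        b = built[offset] if offset < len(built) else None
        r = reference[offset] if offset < len(reference) else None
        if b != r:
            mismatches.append((offset, b, r))
    return mismatches


def format_mismatch(offset, built_byte, reference_byte):
    """Render one mismatch as hex so it can be matched against the listing."""
    def show(value):
        return "--" if value is None else f"{value:02X}"
    return f"{offset:04X}: built={show(built_byte)} rom={show(reference_byte)}"
```

Reading the two images with something like `open("monitor.bin", "rb").read()` and `open("rom_dump.bin", "rb").read()` (hypothetical names) and printing `format_mismatch(*m)` for each result gives a list of offsets to look up in the listing. Offsets are relative to the start of the image, so the ROM's load address has to be added to match the addresses printed in the listing.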
Thanks,
Jonathan
------- Original Message -------
On Sunday, January 23rd, 2022 at 10:11, Paul Koning via cctalk <cctalk at
classiccmp.org> wrote:
I've run into that situation too, with listings so
difficult that even a commercial OCR program (FineReader) couldn't handle it. At the
time Tesseract was far less capable, though I haven't tried it recently to see if that
has changed.
Anyway, my experience was that the task was hard enough that it needed someone with
knowledge of the material. It may be that a contract typist could do a tolerable job, but I have
my doubts. Someone who types, say, an obsolete assembly language program as merely a
random collection of characters is going to produce more errors than someone who
actually understands what the material means.
One consideration is the effort required to repair transcription errors. Those that
produce syntax errors aren't such an issue; those that pass the assembler or compiler
but result in bugs (say, a mistyped register number) are harder to find.
paul
> On Jan 22, 2022, at 8:57 PM, Mark Kahrs via cctalk <cctalk at classiccmp.org> wrote:
>
> No, OCR totally fails on olde line printer listing. At least the ones I've
> tried (tesseract, online, ...)
>
> On Sat, Jan 22, 2022 at 8:06 PM Ethan O'Toole <ethan at 757.org> wrote:
>
> > Can the listings be OCR'ed?
> >
> > - Ethan
> >
> > > Has anyone ever used Amazon Mechanical Turk to employ typists to type in
> > > old listings of lost code?
> > >
> > > Asking for a friend.