Typing in lost code

23 Jan 2022

I've run into that situation too, with listings so difficult that even a commercial
OCR program (FineReader) couldn't handle them.  At the time Tesseract was far less
capable, though I haven't tried it recently to see if that has changed.

Anyway, my experience was that the task was hard enough that it needed someone with
knowledge of the material.  It may be that a contract typist could do a tolerable job,
but I have my doubts.  Someone typing, say, an obsolete assembly language program who
sees it merely as a random collection of characters is going to produce more errors
than a person who actually understands what the material means.

One consideration is the effort required to repair transcription errors.  Those that
produce syntax errors aren't such an issue; those that pass the assembler or compiler
but result in bugs (say, a mistyped register number) are harder to find.
        paul
...
  On Jan 22, 2022, at 8:57 PM, Mark Kahrs via cctalk
<cctalk at classiccmp.org> wrote:
 No, OCR totally fails on olde line printer listing.  At least the ones I've
 tried (tesseract, online, ...)
 On Sat, Jan 22, 2022 at 8:06 PM Ethan O'Toole <ethan at 757.org> wrote:
>
> Can the listings be OCR'ed?
>
>                        - Ethan
>
>
>> Has anyone ever used Amazon Mechanical Turk to employ typists to type in
>> old listings of lost code?
>>
>> Asking for a friend.
