Recovering the ROM of an IBM 5100 using OCR (among other things)

dwight dkelvey at hotmail.com
Thu Jun 27 09:46:17 CDT 2019


I love the walk-through of things. I'd clearly have found a wired, digital method of doing it (printer port or such).
I had a similar problem. I was recovering 4004 code printed out with what looked like an ASR33 print. I did it manually. Looking at the data, I suspect the platen had ruts, as the PDF image had faded columns. Most of the letter text was for labels or comments. These were easy to patch: things like P and F, or E and B. The harder one was C and 0. The program mostly used decimal, but register data for the SRC instructions and nibble data were given in hex, so C and 0 were used quite often.
I was able to find what I believe were all the errors by emulating the 4004 code and finding errors in the operation. I recall that the last error I found was in the display output routine (related to placement of the decimal point). I had put "00" where the original code was "CC". More than 99% of what looked like "CC" in the rest of the code was really "00", and most of the mixed ones were either "0C" or "C0", so reading it as "00" had seemed justified. It turned out to be the only location in the entire code where "CC" actually occurred.
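For anyone who wants to try the same C-versus-0 search with a script, here is a rough Python sketch. I did it by hand, so this is only an illustration and not what I actually ran. The names candidate_readings, plausible_readings and is_valid are made up for the example, and is_valid stands in for whatever check you trust (for me it was running the code in a 4004 emulator and watching for wrong behavior).

```python
from itertools import product

# Glyphs that the faded print made indistinguishable (assumption: only C and 0).
AMBIGUOUS = {"C": "C0", "0": "C0"}

def candidate_readings(token):
    """Yield every reading of a hex token, letting each C or 0 be either glyph."""
    choices = [AMBIGUOUS.get(ch, ch) for ch in token]
    for combo in product(*choices):
        yield "".join(combo)

def plausible_readings(token, is_valid):
    """Keep only the readings accepted by is_valid, a caller-supplied check."""
    return [reading for reading in candidate_readings(token) if is_valid(reading)]

if __name__ == "__main__":
    # A two-digit field that printed as something between "CC" and "00".
    for reading in candidate_readings("CC"):
        print(reading)
```

A token with n ambiguous digits has 2**n readings, so this only stays manageable when the check is applied field by field rather than to the whole listing at once.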
Even the best OCR could not have done as well as a human who understood what the intent was. Understanding the redundancy in the code is a valuable ability that a human has and that would be difficult for a learning program to pick up. I've used similar thinking to fix cassette tape data that had dropouts. It was BASIC code, although tokenized. The redundancy in the good parts of the data made filling in the missing parts easier.
Dwight

________________________________
From: cctalk <cctalk-bounces at classiccmp.org> on behalf of Liam Proven via cctalk <cctalk at classiccmp.org>
Sent: Thursday, June 27, 2019 4:55 AM
To: Discussion: On-Topic and Off-Topic Posts
Subject: Recovering the ROM of an IBM 5100 using OCR (among other things)

This is *epic*.

https://github.com/stepleton/5100NonExecutableROSDecode/blob/master/WRITEUP.md

--
Liam Proven - Profile: https://about.me/liamproven
Email: lproven at cix.co.uk - Google Mail/Hangouts/Plus: lproven at gmail.com
Twitter/Facebook/Flickr: lproven - Skype/LinkedIn: liamproven
UK: +44 7939-087884 - ČR (+ WhatsApp/Telegram/Signal): +420 702 829 053

