OCR software for numeric (binary) data

17 Jan 2012

On 17 January 2012 08:50, Camiel Vanderhoeven <iamcamiel at gmail.com> wrote:
...
  Hi Everyone,
 I have a bunch of PDF files that contain the microcode listings for an IBM
 7201-02 CE (enhanced system/360 model 65), like this one:
 http://ibm360-console.wikispaces.com/file/view/QZ001.pdf. I need their
 contents for the emulator that drives my '65 control panel. Unfortunately,
 the OCR software I have tries to recognize English words, and makes
 gibberish out of them. I'm only interested in the 1's and 0's, so it would
 be wonderful if there was OCR software that you can tell only to look for
 0's and 1's (or have some bias towards recognizing characters as a 1 or 0).
 Is anyone here aware of such software, or can anyone recommend a program
 that might do a good job with these? 
I am not sure how helpful this answer will be but Tesseract
(originally a commercial HP product, now in Google Code) has training
files for different languages.  I have never modified them but my
guess, from looking at the documentation, is that you could make
training files for a binary language.
N.
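
The "bias towards 1 or 0" idea can be sketched in two parts; this is an illustration under stated assumptions, not something tested against the microcode PDFs. First, Tesseract has a configuration variable, tessedit_char_whitelist, that limits the characters it will output, which may be enough without building new training files (its behaviour differs between Tesseract versions, and 3.x releases spell the page segmentation option -psm rather than --psm). Second, the output of any OCR program can be post-processed by mapping look-alike characters back to digits. The confusion table and both function names below are hypothetical.

```python
# Hypothetical confusion table: glyphs that an English-biased OCR engine
# might emit in place of the digits 0 and 1. Adjust after inspecting real output.
CONFUSIONS = str.maketrans({
    "O": "0", "o": "0", "D": "0", "Q": "0",
    "l": "1", "I": "1", "i": "1", "|": "1", "!": "1",
})


def to_binary(line):
    """Map look-alike characters to 0/1, then drop anything that is not 0, 1 or a space."""
    mapped = line.translate(CONFUSIONS)
    return "".join(ch for ch in mapped if ch in "01 ").strip()


def tesseract_command(image_path):
    """Build (but do not run) a Tesseract command line restricted to 0 and 1.

    Assumes a Tesseract version that accepts --psm and honours the
    tessedit_char_whitelist variable; page segmentation mode 6 treats the
    image as a single uniform block of text.
    """
    return [
        "tesseract", image_path, "stdout",
        "--psm", "6",
        "-c", "tessedit_char_whitelist=01",
    ]


if __name__ == "__main__":
    print(to_binary("O1lO I0|1"))
    print(" ".join(tesseract_command("page.png")))
```

Dropping unmapped characters silently shifts the remaining bits, so for microcode listings it would be safer to compare each cleaned line against the expected word width and flag any line that comes out short or long for manual checking.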
