On 1/18/2012 6:24 AM, Camiel Vanderhoeven wrote:
Thanks to everyone who has replied with suggestions. Another suggestion I
got is to try OmniPage, so I got myself a 15-day trial version of OmniPage
Pro. It allows you to draw a rectangle on the page, then select "Numeric
data". It will then only perform OCR on that part of the page, and will try
to recognize everything as a number. When in doubt, it will bring up a
dialog with a detailed view of the problematic page so you can make manual
corrections. Combined with a bit of manual work to fix things up, this seems
to work fairly well.
I realize you may have already got your answer, but ABBYY FineReader Pro
works very well.
It also has an option for numeric-only data, including the ability to
limit which numeric characters to include (i.e., only 0 and 1, if you
wanted). You can also train it to do pattern recognition if its built-in
recognition of the font doesn't work well.
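
If the OCR program you end up with can't restrict the character set itself, the same idea can be roughly approximated afterwards on the exported text. The sketch below is only an illustration under assumptions of mine: the function name, the confusion table, and the "?" placeholder are all hypothetical and are not part of FineReader, OmniPage, or any other product. It keeps allowed characters, swaps a few commonly confused letters for digits, and reports every position that still needs a manual look.

```python
# Hypothetical post-processing helper; not any OCR product's API.
# Letters that OCR engines commonly confuse with digits (an assumed,
# deliberately small table; extend it for your own font).
CONFUSIONS = {"O": "0", "o": "0", "l": "1", "I": "1", "|": "1", "S": "5", "B": "8"}


def clean_numeric_line(line, allowed="0123456789"):
    """Return (cleaned_text, suspect_positions) for one line of OCR text.

    Characters in `allowed` and whitespace pass through unchanged.
    A confusable letter is replaced by its digit if that digit is allowed.
    Anything else becomes "?". Every changed position is reported so it
    can be checked by hand against the scan.
    """
    allowed_set = set(allowed)
    cleaned = []
    suspects = []
    for pos, ch in enumerate(line):
        if ch in allowed_set or ch.isspace():
            cleaned.append(ch)
            continue
        substitute = CONFUSIONS.get(ch)
        if substitute is not None and substitute in allowed_set:
            cleaned.append(substitute)
        else:
            cleaned.append("?")
        suspects.append(pos)
    return "".join(cleaned), suspects


if __name__ == "__main__":
    # Binary-only listing: the letters "O" and "l" are mapped to 0 and 1.
    print(clean_numeric_line("1O1 0l0", allowed="01"))
    # prints ('101 010', [1, 5])
```

The substitutions are guesses, which is why the positions are returned rather than silently fixed; you would still want to compare them with the original page.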
With most of these programs, the quality of your original source
material seems to be the determining factor in how much manual work
you'll need to do afterwards to correct errors.
BTW: what's the end format?
http://www.abbyy.com/
I'm unrelated to the company, but it does sound like a neat problem.
Cheers,
Camiel
Good luck,
Keith