OCR old software listing

Larry Kraemer ldkraemer at gmail.com
Mon Dec 31 06:20:48 CST 2018


I used libtiff-tools (Debian 8.x, 32-bit) to extract all 61 .tif files
from the multipage .tif file.  While the .tifs look decent, and RasterVect
shows the .tif properties to be Group 4 Fax (1 bpp) with 5100 x 6600 pixels
at 300 DPI, I can't get tesseract 3.x, TextBridge Classic 2.0, or IrfanView
with the KADMOS plugin to OCR any of the .tif files with decent results.
I'd expect OCR to convert 85 to 90% of the text to ASCII correctly.
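
As a quick sanity check on the reported properties, the sketch below converts
pixel dimensions and a DPI value into a physical page size. The function name
is hypothetical, and the 600 DPI figure is only a comparison value, not
something reported by RasterVect.

```python
def page_size_inches(width_px, height_px, dpi):
    """Return (width, height) in inches for a scan of the given pixel size."""
    return width_px / dpi, height_px / dpi


# Dimensions and DPI as reported for the extracted .tif files.
print(page_size_inches(5100, 6600, 300))  # (17.0, 22.0)

# Same pixel dimensions at a hypothetical 600 DPI, for comparison.
print(page_size_inches(5100, 6600, 600))  # (8.5, 11.0)
```

At 300 DPI the pages work out to 17 x 22 inches. If the original listing was
on letter-size paper (an assumption; the paper size is not stated), the
resolution stored in the files may not match the actual scan resolution,
which could be worth ruling out before OCR.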

Typically, one of the three software packages above will do a decent job
of OCRing .tifs of such scans.  (Most PDFs end up at 72 x 72 DPI, and
converting them to 300 DPI allows them to be properly OCR'd.)
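
To illustrate the 72 to 300 DPI conversion, this minimal sketch computes the
pixel dimensions a page needs at the target resolution. The function name is
hypothetical, and the 612 x 792 input assumes a letter-size page at 72 DPI; it
is only an example, not a measurement from these files.

```python
def resampled_size(width_px, height_px, src_dpi, dst_dpi):
    """Return pixel dimensions after changing resolution from src_dpi to dst_dpi."""
    # Multiply before dividing so whole-number results stay exact.
    new_width = round(width_px * dst_dpi / src_dpi)
    new_height = round(height_px * dst_dpi / src_dpi)
    return new_width, new_height


# A letter-size page (8.5 x 11 in) at 72 DPI, converted to 300 DPI.
print(resampled_size(612, 792, 72, 300))  # (2550, 3300)
```

This only covers the arithmetic. How the extra pixels are produced (resampling
an existing raster versus re-rendering the PDF page at 300 DPI) depends on the
tool used.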

If anyone else has had better luck, I'd like to know what your process is.

Thanks.

Larry


More information about the cctech mailing list