OCR old software listing - test-drb@ccmp.vtda.org

31 Dec 2018

I used libtiff-tools (Debian 8.x, 32-bit) to extract all 61 .tif pages
from the multipage .tif file.  While the .tif's look decent, and
RasterVect shows the .tif properties to be Group 4 Fax (1bpp) with
5100 x 6600 pixels at 300 DPI, I can't get tesseract 3.x, TextBridge
Classic 2.0, or IrfanView with the KADMOS plugin to OCR any of the .tif
files with decent results.  I'd expect OCR to convert 85 to 90% of the
text to ASCII correctly.
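
As a sanity check on the reported properties, here is a minimal Python
sketch of the arithmetic: 5100 x 6600 pixels at 300 DPI works out to a
17 x 22 inch page.  The function names are made up for illustration, and
the 8.5 inch paper width is only an assumption (the size of the original
listing is not known); if it held, the scans would effectively be 600 DPI
with a mislabeled resolution tag, which might be worth ruling out.

```python
def physical_size_inches(width_px, height_px, dpi):
    """Return (width, height) in inches for an image tagged with this DPI."""
    return width_px / dpi, height_px / dpi


def effective_dpi(width_px, paper_width_in):
    """Return the DPI implied if the scan covers paper of the given width."""
    return width_px / paper_width_in


# Values reported by RasterVect for the extracted pages.
print(physical_size_inches(5100, 6600, 300))  # (17.0, 22.0)

# ASSUMPTION: 8.5 inch wide paper; the real paper size is not known.
print(effective_dpi(5100, 8.5))  # 600.0
```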

Typically, one of the three software packages above will do a decent job
of OCRing .tif's of such scans.  (Most PDFs end up at 72 x 72 DPI, and
converting them to 300 DPI allows them to be properly OCR'd.)
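
To put numbers on the 72 to 300 DPI conversion, a small Python sketch
follows.  It assumes "converting" means resampling the raster to new
pixel dimensions, and the 612 x 792 input (a letter-size page at 72 DPI)
is an illustrative example, not a figure from these scans.  The function
name is hypothetical.

```python
def upscaled_size(width_px, height_px, src_dpi=72, dst_dpi=300):
    """Return the pixel dimensions after resampling from src_dpi to dst_dpi."""
    # Multiply before dividing so exact ratios stay exact.
    new_width = round(width_px * dst_dpi / src_dpi)
    new_height = round(height_px * dst_dpi / src_dpi)
    return new_width, new_height


# ASSUMPTION: a letter-size page at 72 DPI is 612 x 792 pixels.
print(upscaled_size(612, 792))  # (2550, 3300)
```

The scale factor is 300 / 72, a little over 4x in each direction, so the
resampled page has roughly 17 times as many pixels as the original.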

If anyone else has had better luck, I'd like to know what your process is.

Thanks.

Larry