OCR old software listing

31 Dec 2018

On 2018-12-31 7:20 AM, Larry Kraemer via cctalk wrote:
...
 I used libtiff-tools (Debian 8.x, 32-bit) to extract all 61 .tif pages from
 the multipage .tif file.  While the .tif's look decent, and RasterVect shows
 the .tif properties to be Group 4 Fax (1bpp) with 5100 x 6600 pixels - 300 DPI,
 I can't get tesseract 3.x, TextBridge Classic 2.0, or Irfanview with the KADMOS
 plugin to OCR any of the .tif files with decent results.  I'd expect an OCR
 of 85 to 90% correct conversion to ASCII text.
 Typically, one of the three software packages above will do a decent job
 of OCRing .tif's of such scans.  (Most PDFs end up at 72 x 72 DPI, and
 converting them to 300 DPI allows them to be properly OCR'd.)
 If anyone else has had better luck, I'd like to know what your process is.
I don't know if OCR software is sensitive to having correct resolution
(I've practically zero experience with it), but 300 dpi seems wrong for
Mattis' scans.
Seems they should be 600 dpi: 5100 x 6600 pixels is then 8.5 x 11 in
(about 21.6 cm x 27.9 cm), i.e. US letter size.
--Toby
...

 Thanks.
 Larry
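
For anyone wanting to check the resolution question above, here is a minimal sketch of the arithmetic in Python. It only uses the pixel dimensions quoted in the thread (5100 x 6600); the function name physical_size_cm is made up for illustration and is not part of any of the tools mentioned.

```python
CM_PER_INCH = 2.54


def physical_size_cm(width_px, height_px, dpi):
    """Return (width_cm, height_cm) for an image of the given pixel size and DPI."""
    width_cm = width_px / dpi * CM_PER_INCH
    height_cm = height_px / dpi * CM_PER_INCH
    return width_cm, height_cm


if __name__ == "__main__":
    # Pixel dimensions as reported by RasterVect in the thread.
    for dpi in (300, 600):
        w, h = physical_size_cm(5100, 6600, dpi)
        print(f"{dpi} dpi: {w:.1f} cm x {h:.1f} cm")
```

At 300 dpi the page would be 17 x 22 in, twice letter size in each direction, while 600 dpi gives 8.5 x 11 in. That supports the guess that the 300 DPI tag in the files is wrong, though whether the OCR packages actually care about the tag is a separate question.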
