DEC scanned documents for Bitsavers (message for Al Kossow)

24 Apr 2015

...
  On Apr 24, 2015, at 9:48 AM, Noel Chiappa <jnc at mercury.lcs.mit.edu> wrote:
  From: shadoooo
  I'm scanning at 600dpi grayscale, lossless compression.
 I've been scanning a few things too, and I found that 600dpi grayscale
 produced absolutely enormous files (many, many MB per page, for prints), no
 matter what I tried to do, compression-wise.
 600dpi black and white, followed by saving as TIFFs with CCITT Group 4
 compression, produced immensely smaller files (a few hundred KB for the
 same pages), and they are quite readable (even the fine lettering seems to be
 readable - b/6 is quite distinguishable, etc).
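
For scale, the uncompressed sizes behind those numbers can be estimated with a little arithmetic. The sketch below assumes a US Letter page (8.5 x 11 in), which the messages above do not specify, and raw_page_bytes is a name made up for illustration.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page (illustrative helper)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed page size: US Letter, 8.5 x 11 inches, scanned at 600 dpi.
gray = raw_page_bytes(8.5, 11, 600, 8)     # 8 bits per pixel (grayscale)
bilevel = raw_page_bytes(8.5, 11, 600, 1)  # 1 bit per pixel (black and white)

print(gray)     # 33660000 bytes, about 33.7 MB before compression
print(bilevel)  # 4207500 bytes, about 4.2 MB before compression
```

The 8:1 ratio is only the starting point. The rest of the gap reported above comes from compression: CCITT Group 4 is designed for bilevel images, while grayscale scans carry noise that lossless compressors cannot remove.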
If you're looking to scan for human consumption, bitmap works ok. But I've found that OCR
programs seem to want grayscale. Why that is, I don't know; they do seem to convert it
to bitmap at some point. Possibly the threshold logic is more complex.

That brings up thresholds. When scanning, or converting to, bitmap, you have to set the
gray threshold that is the cutoff between white and black. The default would typically be
128 (50%). Depending on the scanner and the condition of the originals, that threshold
may be fine, or it may be far from optimal. A good approach is to scan a number of
representative pages in grayscale, and experiment with different threshold settings to see
which one is the best. Basically, you're looking for the compromise between filled-in
loops and broken thin lines. For printed originals, this is probably not all that
critical; for typewritten material, it is far more so.
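
A minimal sketch of that threshold experiment, in plain Python with no imaging library. The pixel values are synthetic (not from a real scan), the helper names are made up for illustration, and it assumes the usual 8-bit convention where 0 is black and 255 is white.

```python
def to_bilevel(gray, threshold=128):
    """Map rows of 8-bit gray values to 1-bit rows: 1 = black, 0 = white."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def black_fraction(bilevel):
    """Fraction of pixels that came out black (a crude way to compare settings)."""
    total = sum(len(row) for row in bilevel)
    return sum(sum(row) for row in bilevel) / total


# Synthetic patch: a dark stroke (40), a faint thin line (150), paper (230).
patch = [
    [230, 40, 230, 150, 230],
    [230, 40, 230, 150, 230],
]

for t in (100, 128, 160, 200):
    rows = to_bilevel(patch, t)
    print(t, black_fraction(rows), rows[0])
```

At 100 and 128 the faint line drops out (a broken thin line); at 160 and 200 it survives. On a real page, pushing the threshold higher still would start turning gray smudges and the insides of small loops black, which is the other side of the compromise.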
        paul
