On 20 May 2005 at 11:20, Al Kossow wrote:
> Since it turned out that scanning wide-edge first results in straighter
> scans....
In preparation for scanning and PDFing the 200-odd HP manuals in my
possession, I've been experimenting with batch image-processing programs.
For deskewing scans, the Leptonica library at:
http://www.leptonica.com/
...provides a simple solution that works quite well. Deskewing is
literally little more than calling the "pixRead", "pixDeskew", and
"pixWrite" library functions.
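For anyone curious what a deskewer has to work out, here is a minimal Python sketch of a projection-profile skew search. It is an illustration only, not Leptonica's code and not a binding to it: the function names, the 0.5-degree step, and the synthetic "page" of ruled lines are all assumptions made up for the example. The idea is to try a range of small shears and keep the angle at which the row-by-row black pixel counts change most sharply, which is where the text lines are horizontal.

```python
import math

def row_profile(img, shear):
    """Count black pixels per row after shifting column x up by round(x * shear)."""
    height, width = len(img), len(img[0])
    max_shift = int(abs(shear) * width) + 1   # padding so shifted rows stay in range
    counts = [0] * (height + 2 * max_shift)
    for y, row in enumerate(img):
        for x, pixel in enumerate(row):
            if pixel:
                counts[y - round(x * shear) + max_shift] += 1
    return counts

def score(img, angle_deg):
    """Sum of squared differences between adjacent rows; peaks when lines are level."""
    profile = row_profile(img, math.tan(math.radians(angle_deg)))
    return sum((a - b) ** 2 for a, b in zip(profile, profile[1:]))

def estimate_skew(img, max_deg=3.0, step_deg=0.5):
    """Return the candidate angle (degrees) with the sharpest row profile."""
    steps = int(max_deg / step_deg)
    candidates = [i * step_deg for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda a: score(img, a))

def make_skewed_page(width=80, height=64, angle_deg=2.0):
    """Synthetic test page: 2-pixel-thick lines every 8 rows, sheared by angle_deg."""
    t = math.tan(math.radians(angle_deg))
    return [[1 if (y - round(x * t)) % 8 < 2 else 0 for x in range(width)]
            for y in range(height)]

if __name__ == "__main__":
    print(estimate_skew(make_skewed_page(angle_deg=2.0)))
```

A real deskewer would search much finer angles and then rotate the image by the negative of the result; this sketch only covers the angle estimate.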
For pages containing both text and screened photos, I scan twice: once at
600 dpi bilevel for the text, and once at 200 dpi grayscale for the photo,
using the descreening feature of the (horrible) HP imaging software that
came with the scanner. I then manually erase the screened area from the
first image and crop the second to the photo, saving the latter as a JPEG.
I've modified "tumble" to composite images, so the resulting PDF page has
the TIFF G4 text image as the background with the JPEG photo superimposed.
I wrote a simple masking program to clean up the edges of the scanned
images, and another program that takes a directory of image files, parses
the section, chapter, and page number information encoded in the filenames,
and generates a "tumble" control file that builds the PDF with appropriate
bookmarks, page labels, and blank pages. The blank pages allow for easy
duplex printing (I've also modified "tumble" to create blank PDF pages
instead of embedding a blank TIFF page image).
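The filename-parsing step can be sketched roughly as follows, in Python. This is not the actual program: the "s01_c02_p003.tif" naming scheme, the rule that every chapter starts on an odd (right-hand) page, and the plan format are all assumptions for illustration. It deliberately produces a neutral list of steps rather than real "tumble" control-file syntax, which would have to be filled in from the tumble documentation.

```python
import re

# Hypothetical naming scheme: s<section>_c<chapter>_p<page>.tif
NAME_RE = re.compile(r"s(\d+)_c(\d+)_p(\d+)\.tif$")

def build_plan(filenames):
    """Return ordered steps: ('bookmark', title), ('page', name), ('blank', None)."""
    parsed = []
    for name in filenames:
        match = NAME_RE.search(name)
        if match is None:
            raise ValueError("unrecognized filename: " + name)
        section, chapter, page = (int(g) for g in match.groups())
        parsed.append((section, chapter, page, name))
    parsed.sort()  # numeric order by section, chapter, page

    plan = []
    pages_so_far = 0
    current = None
    for section, chapter, page, name in parsed:
        if (section, chapter) != current:
            # Assumed rule: pad with a blank so each chapter starts on an
            # odd page, i.e. on the front of a sheet when printed duplex.
            if pages_so_far % 2 == 1:
                plan.append(("blank", None))
                pages_so_far += 1
            plan.append(("bookmark", f"Section {section}, Chapter {chapter}"))
            current = (section, chapter)
        plan.append(("page", name))
        pages_so_far += 1
    return plan

if __name__ == "__main__":
    files = ["s01_c02_p001.tif", "s01_c01_p002.tif",
             "s01_c01_p001.tif", "s01_c01_p003.tif"]
    for step in build_plan(files):
        print(step)
```

With the four files above, chapter 1 has three pages, so one blank is inserted before the chapter 2 bookmark.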
Finally, I use Ghostscript to linearize the tumbled PDF.
The only significant manual work is rescanning the photos in order to
descreen them. I'd like to find a batch descreener that would take a
bilevel screened image file and produce a grayscale image. The Leptonica
library has such a function, but my first attempts yielded visible Moire
patterns. I need to investigate further.
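To illustrate the kind of thing a batch descreener does, here is a minimal Python sketch that averages each square block of a bilevel image into a single gray value. It is a plain box average, not the Leptonica function, and the function name, the toy 4-pixel screen, and the cell sizes are assumptions for the example. It also shows one common way moire can appear: when the averaging cell does not match the screen period, neighboring cells catch different numbers of dots and a flat tint comes out uneven. I have not confirmed that this is what happened in my own attempts.

```python
def descreen(bilevel, cell):
    """Average each cell x cell block of 0/1 pixels (1 = black) to gray 0..255."""
    height, width = len(bilevel), len(bilevel[0])
    gray = []
    for top in range(0, height - cell + 1, cell):
        row = []
        for left in range(0, width - cell + 1, cell):
            black = sum(bilevel[y][x]
                        for y in range(top, top + cell)
                        for x in range(left, left + cell))
            # 255 = white, 0 = black; scale by the fraction of black pixels
            row.append(255 - (255 * black) // (cell * cell))
        gray.append(row)
    return gray

def make_screen(size=12, period=4, dot=2):
    """Toy halftone: a dot x dot black square in every period x period cell."""
    return [[1 if (x % period < dot and y % period < dot) else 0
             for x in range(size)] for y in range(size)]

if __name__ == "__main__":
    screen = make_screen()
    print(descreen(screen, 4))  # cell matches the screen period: flat gray
    print(descreen(screen, 3))  # mismatched cell: gray values vary
```

A usable descreener would also need to estimate the screen period and probably use a smoother filter than a box average.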
-- Dave