Inventory for handling scanned documents (was: Better indexing on bitsavers)

20 May 2005

We've now named quite a lot of applications and concepts about how to
handle scanned documents. I'd like to get the big picture:
--
Here is the current workflow that I'm using:

A Ricoh IS520 30 page/minute duplex B&W scanner that can handle up to 11 x 17,
with a crappy Windows scan app (under VMware), generates even / odd pages into
two separate directories. Since it turned out that scanning wide-edge first
results in straighter scans, I need to postprocess fronts and backs separately,
because the backs are flipped 180 deg.

The TIFFs are on a file system shared with my Mac. I do all of the cleanup
there with Graphic Converter. I have a bunch of scripts I run over the files
to produce a single directory with the pages brought to size and the borders
cleaned up. I then cull the dups and bad scans, check that I didn't miss any
pages, and save the files with the page number as the file name. Then I squirt
the dir back over to the Linux box (since tumble is little-endian only) and say:

    tumble -b %F * ../file.pdf

I copy the PDF back over to the Mac, add it to the postprocessed scans, and put
the whole thing in a 'to be archived' dir, which gets saved to DVD-R.

If there are pages that can't be scanned on the Ricoh, I do those separately
on a Mustek 11x17 flatbed.

Then I ftp the PDF to bitsavers and update the index files.
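
For anyone wanting to try the "two directories in, one page-numbered directory out" step, here is a rough Python sketch. It is not the scripts mentioned above, just one way it could be done. Everything in it is an assumption for illustration: the function names (interleave, page_names, merge_dirs), the *.tif extension, the four-digit file names, and above all the assumption that the i-th file in the back-side directory belongs to the i-th file in the front-side directory. It does not rotate the backs by 180 deg or clean up borders; that would still happen in the image tool.

```python
from pathlib import Path
import shutil


def interleave(fronts, backs):
    """Merge front-side and back-side scans into reading order.

    Assumes both lists are in scan order and that backs[i] is the
    reverse side of fronts[i] (an assumption, not a scanner guarantee).
    """
    if len(fronts) != len(backs):
        raise ValueError(
            f"{len(fronts)} fronts but {len(backs)} backs; a page was missed"
        )
    pages = []
    for front, back in zip(fronts, backs):
        pages.append(front)
        pages.append(back)
    return pages


def page_names(count, width=4, ext=".tif"):
    """Return zero-padded page-number file names: 0001.tif, 0002.tif, ..."""
    return [f"{n:0{width}d}{ext}" for n in range(1, count + 1)]


def merge_dirs(front_dir, back_dir, out_dir):
    """Copy both directories into out_dir, named by page number.

    Returns the number of pages written. Files are copied, not moved,
    so the raw scans stay untouched.
    """
    fronts = sorted(Path(front_dir).glob("*.tif"))
    backs = sorted(Path(back_dir).glob("*.tif"))
    ordered = interleave(fronts, backs)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src, name in zip(ordered, page_names(len(ordered))):
        shutil.copyfile(src, out / name)
    return len(ordered)
```

Calling merge_dirs("odd", "even", "merged") with hypothetical directory names would give a single directory that sorts in page order. The length check is a cheap way to notice a missed page before culling blank backs and dups by hand.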
