We've now named quite a lot of applications and concepts about how to
handle scanned documents. I'd like to get the big picture:
--
Here is the current workflow that I'm using:
A Ricoh IS520 30 page/minute duplex B&W scanner that can handle up to 11 x 17,
driven by a crappy Windows scan app (under VMware), generates even / odd pages
into two separate directories. It turned out that scanning wide-edge first
results in straighter scans, but the backs then come out flipped 180 deg, so
I need to postprocess fronts and backs separately.
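
For anyone wanting to script the front/back merge, here is a rough sketch of
the bookkeeping only. It is not what my scripts actually do; it assumes both
directories sort into sheet order and that sheet i gives page 2*i + 1 (front)
and page 2*i + 2 (back). The function name and the file names are made up.

```python
def interleave(fronts, backs):
    """Merge front and back scans into page order.

    Assumption: both lists sort into sheet order, and sheet i yields
    page 2*i + 1 (front) and page 2*i + 2 (back).
    Returns a list of (page_number, source_name, rotate_degrees) tuples.
    """
    fronts, backs = sorted(fronts), sorted(backs)
    if len(fronts) != len(backs):
        raise ValueError(f"{len(fronts)} fronts but {len(backs)} backs")
    plan = []
    for sheet, (front, back) in enumerate(zip(fronts, backs)):
        plan.append((2 * sheet + 1, front, 0))
        plan.append((2 * sheet + 2, back, 180))  # backs are flipped 180 deg
    return plan


if __name__ == "__main__":
    # Hypothetical file names, just to show the resulting plan.
    for page, name, rotate in interleave(["f001.tif", "f002.tif"],
                                         ["b001.tif", "b002.tif"]):
        print(f"page {page}: {name} (rotate {rotate})")
```

The rotation itself is left to whatever image tool is handy; the plan just
records which files need it.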
The TIFFs are on a file system shared with my Mac. I do all of the cleanup
there with GraphicConverter. I have a bunch of scripts that I run over the files
to produce a single directory with the pages brought to size and the borders
cleaned up. I then cull the dups/bad scans, check that I didn't miss any pages,
then save the files with the page number as the file name. Then I squirt the dir
back over to the Linux box (since tumble is little-endian only) and say:
tumble -b %F * ../file.pdf
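
The "didn't miss any pages" check and the page-number file names could be
sketched roughly like this. It is only an illustration: it assumes the page
number is the first run of digits in each file name, it does not handle
unnumbered or roman-numeral pages, and the helper names are invented. The
zero padding matters because the shell expands * in lexical order, so
unpadded names would put 10 ahead of 2.

```python
import re


def page_number(name):
    """Return the first run of digits in a file name as an int, else None."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else None


def missing_pages(names):
    """List page numbers absent from 1..max among the given file names."""
    pages = {n for n in map(page_number, names) if n is not None}
    return [p for p in range(1, max(pages, default=0) + 1) if p not in pages]


def padded_name(page, width=4, suffix=".tif"):
    """Zero-pad the page number so a shell glob lists pages in numeric order."""
    return f"{page:0{width}d}{suffix}"


if __name__ == "__main__":
    # Hypothetical directory listing with page 3 missing.
    print(missing_pages(["0001.tif", "0002.tif", "0004.tif"]))
    print(padded_name(7))
```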
I copy it back over to the Mac, add it to the postprocessed scans and put the
whole thing in a 'to be archived' dir, where it is saved to DVD-R.
If there are pages that can't be scanned on the Ricoh, I do those separately
on a Mustek 11x17 flatbed.
Then I ftp the PDF to bitsavers and update the index files.