Jan-Benedict Glaw wrote:
- How do you scan a paper document? Page by page? Two pages at
once (with a sufficiently large scanner)? Do you use a script
or something like that? ...or are there well-working
applications out there that aid in scanning some 100 pages?
Both the previous place I was at and my current employer have
B&W scanners with an autofeeder and scan to PDF. The previous
one would drop a PDF on the desktop, the current one emails
it to me. Both would do double-sided.
Do you directly scan b/w, or first use grayscale/colour and
then degrade that to b/w?
I scan in bitonal (1-bit, B&W). Nowadays I usually post-process
to convert to G4-encoded TIFFs within a PDF.
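
For pages that did start out as grayscale or colour, the "degrade that to b/w" step in the question comes down to a threshold. A minimal sketch, assuming the image is already in memory as rows of (R, G, B) tuples; the function name and the fixed threshold of 128 are illustrative choices, not part of the workflow described above:

```python
def to_bitonal(image, threshold=128):
    """Convert rows of (R, G, B) pixels to 1-bit rows: 1 = black, 0 = white.

    Brightness is estimated with the Rec. 601 luma weights. The fixed
    threshold of 128 is an arbitrary starting point (an assumption),
    not a tuned value; faded photocopies would likely need another.
    """
    return [
        [1 if (299 * r + 587 * g + 114 * b) // 1000 < threshold else 0
         for (r, g, b) in row]
        for row in image
    ]
```

A single global threshold is the simplest option; unevenly lit or faded originals may need something adaptive.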
- How do you work on the scanned images: Do you cut off the
white rim as much as possible?
I leave them as they come. If I scan a booklet, or something
that cannot be non-destructively taken apart (and later
reassembled) then I will scan two pages at a time and
post-process manually. In that case I'll probably end up
cropping at the physical page edges.
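
The two-pages-at-a-time case could in principle be scripted rather than done by hand. A rough sketch, assuming a bitonal image held as a list of rows (1 = black, 0 = white) and assuming the gutter shows up as the emptiest column near the middle; both function names are hypothetical. Note that `crop_to_ink` trims to the extent of the ink, which is not the same thing as cropping at the physical page edge:

```python
def split_spread(bitmap):
    """Split a two-page scan (rows of 0/1 pixels) at the gutter.

    Assumption: the gutter is the column with the least ink in the
    middle third of the image, preferring the one nearest the centre.
    A real scan may instead show a dark shadow at the gutter.
    """
    width = len(bitmap[0])
    lo, hi = width // 3, 2 * width // 3
    ink = [sum(row[x] for row in bitmap) for x in range(width)]
    gutter = min(range(lo, hi), key=lambda x: (ink[x], abs(x - width // 2)))
    left = [row[:gutter] for row in bitmap]
    right = [row[gutter:] for row in bitmap]
    return left, right


def crop_to_ink(bitmap):
    """Trim all-white rows and columns from the edges of a bitmap."""
    rows = [y for y, row in enumerate(bitmap) if any(row)]
    if not rows:
        return []  # blank page: nothing to keep
    cols = [x for x in range(len(bitmap[0])) if any(row[x] for row in bitmap)]
    return [row[cols[0]:cols[-1] + 1] for row in bitmap[rows[0]:rows[-1] + 1]]
```

Usage would be along the lines of `left, right = split_spread(scan)` followed by `crop_to_ink(left)` and `crop_to_ink(right)`.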
How do you deal with images that are a tad rotated? Accept
that? Re-scan to hopefully get a better image? Revert
rotation in software?
If it is bad I will try to rescan, especially if it is only a
few pages. I've never tried to rotate in software.
How do you deal with single black dots in white areas or the
other way around?
Never worried about that.
- What digital format do you like to get when it's all
finished? Plain PDF? PDF with some bookmarks? PDF with all
headings as bookmarks? A new PDF-hyperref based index?
Multiple TIFF/PNG/whatever images? Something like a
web-based slide-show? ...or multiple formats (web-based for
viewing, PDF for printing, ...)?
I produce PDF as the final format. I do not usually bother
adding bookmarks or whatever.
- What do you currently use as your software:
Operating system:
Linux (Debian), Solaris, VMS, DOS and Windows.
PDF viewer:
Acrobat or xpdf.
TIFF viewer:
IrfanView, but for most things I use PDF.
Browser/other viewers you'd love to use:
I'd love to see near-perfect OCR (average error
rate of say one missed/misinterpreted character
per 500 pages on 5th generation photocopies of
technical manuals from the 1960s).
If I get a second wish, I would like something that
can take multicolour text pages (like the
RSX-11M-PLUS manuals), slice and dice them
into multiple layers (blue/pink/red/black
etc.) and put them back together as a PDF.
I still have hundreds (maybe more) of pages waiting
to be processed - each is currently a 24MB (or so)
24-bit colour TIFF (at 600dpi). Help :-)
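
On the layer-separation wish: one naive way to approach it is to assign every pixel to the nearest of a handful of reference ink colours and emit one 1-bit mask per ink. The sketch below is only an illustration under stated assumptions: the reference RGB values in `INKS` are made up (real ones would have to be sampled from the actual pages), the function names are hypothetical, and reassembling the masks into a PDF is not covered.

```python
# Hypothetical reference colours (R, G, B). These values are guesses
# for illustration; real scans would need colours sampled from the pages.
INKS = {
    "paper": (255, 255, 255),
    "black": (0, 0, 0),
    "red": (200, 30, 30),
    "blue": (30, 30, 200),
    "pink": (240, 150, 180),
}


def nearest_ink(pixel, inks=INKS):
    """Return the name of the reference colour closest to an RGB pixel
    (squared Euclidean distance in RGB space)."""
    return min(
        inks,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(pixel, inks[name])),
    )


def split_layers(image, inks=INKS):
    """Turn rows of RGB pixels into one 1-bit mask per ink colour.

    Pixels nearest to "paper" are left out of every mask.
    """
    layers = {
        name: [[0] * len(row) for row in image]
        for name in inks
        if name != "paper"
    }
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            name = nearest_ink(pixel, inks)
            if name != "paper":
                layers[name][y][x] = 1
    return layers
```

Each resulting mask is bitonal, so in principle it could be compressed the same way as an ordinary B&W scan. Plain RGB distance is a crude colour measure and would probably struggle with faded or yellowed pages.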
Antonio
--
---------------
Antonio Carlini arcarlini at iee.org