Scanning question

18 Jul 2019

On 18/07/2019 22:50, Warner Losh via cctalk wrote:
...
 So, I have a bunch of old DEC Rainbow docs that aren't online. I also have
 a snapscan scanner that I use for bills and such.
 There's four kinds of docs, and I'm looking for advice:
 (1) wire-ring bound. What's the best way to scan these? The easiest is to
 just clip the wire binding and drop it in the scanner. But then what? 
With the wire-bound ones I've dealt with, you can hold the doc with the
back cover facing you and turn that cover over so that you are now looking
at the reverse of the last-but-one page. Then you "open out" the wires.
(This is easier if you have one of those comb binding machines that many
offices have). Then rotate the wires and watch the pages gently pop out.
Refitting is the reverse of removal. (Like it says in every Haynes manual).
...
 (2) Folded with staples. These are booklet format, with staples in the
 middle. I could easily remove the staple and scan, but how do I replace the
 staple? 
Slowly and carefully, page by page. (The first hundred are the worst.
The second hundred are the worst too .. :-))
...

 (3) Gum bound. These books are bound with some kind of gum / goo on the
 spine. Some of these are so old I could just remove it and have no real
 degradation of the state. Others have spines that are still in good shape. 
The only bound docs that I've scanned have been ones where I had more
than one copy. There are scanners that scan right up to the edge
specifically so that you can scan books, but I've never had access to one.
...

 (4) Three ring binder. This is easy: remove, scan, replace. Right? 
Compared to all the others, yes, this is easy.
...

 Finally, how do I get the resulting scans into bigkeeper? Any fancy options
 I should enable to make the pdfs maximally useful? 
I usually go for bi-level TIFF at 600 dpi, which can then be G4 encoded.
Then wrap the whole lot in a PDF. For photos, though, I often went for
8-bit (grey levels). For colour I just got confused and went for
300 dpi, and sometimes also separately scanned the manual in B&W.
The office I'm in now has an HP scanner that thinks JPG is the only
reasonable option. JPG's lossy compression is awful for text (especially
if you plan to OCR it later), line drawings and schematics.
I don't know what options there are on PDFs, but avoiding anything
clever is probably a good idea in the long term.
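
To put rough numbers on the scan settings mentioned above, here is a small
sketch that works out the uncompressed size of one page in each mode. The
US Letter page size, the 600 dpi figure for grey scans and the helper name
raw_scan_bytes are assumptions made for illustration; they are not from the
original message, and real file sizes after G4 or other compression will be
much smaller.

```python
import math


def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed bytes for one page, with each row padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


# Assumption: US Letter pages (8.5 x 11 inches); substitute your own size.
PAGE = (8.5, 11.0)

# (dpi, bits per pixel) for each mode; the grey dpi is an assumption.
modes = {
    "bi-level, 600 dpi": (600, 1),
    "8-bit grey, 600 dpi": (600, 8),
    "24-bit colour, 300 dpi": (300, 24),
}

sizes = {
    name: raw_scan_bytes(PAGE[0], PAGE[1], dpi, bpp)
    for name, (dpi, bpp) in modes.items()
}

for name, size in sizes.items():
    print(f"{name:24s} {size / 1e6:6.1f} MB per page before compression")
```

Under those assumptions a bi-level page starts out at about an eighth of
the size of a grey one, before G4 encoding shrinks it further, which is one
reason bi-level suits text-only pages.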
Antonio
--
Antonio Carlini
antonio at acarlini.com
