Richard wrote:
In article <4564AEE4.1040709 at yahoo.co.uk>,
Jules Richardson <julesrichardsonuk at yahoo.co.uk> writes:
For some documents there are very few surviving copies, and it would seem
sensible to preserve those ones now in a form that was as close as reasonably
possible to the original - colour scans where colour exists in the document
(or the paper is non-white or the text non-black I suppose), greyscale for
text rather than bi-level, sufficient resolution for photos and diagrams etc.
I scan this way -- I use color or grayscale where necessary and use
bi-level on text-only B&W pages. Bi-level also works for diagrams
that consist only of line art without fine detail.
It's OK if you have top-quality documentation. But lots of computer docs out
there are old, faded, dirty, creased, well-thumbed etc. and unless someone's
prepared to visually check every scanned page, there's a chance that the
bi-level algorithm in use will corrupt the data and it'll go unnoticed.
If using greyscale then such artifacts can be dealt with at a later date (i.e.
at OCR time) rather than at scan time. Of course, OCR is by its nature slow as
the data has to be messed with at input anyway and checked upon output, so
some extra tweaking there isn't so bad - but it's useful if the initial scan
process can be as quick as possible (it's time-consuming enough as it is!)
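
As a rough illustration of why keeping greyscale lets the bi-level decision wait, here's a small sketch. The pixel values, the fixed threshold of 128 and the midpoint rule are all made up for the example; a real scanner driver or OCR package will use its own (usually smarter) algorithm.

```python
# Hypothetical 8-bit greyscale samples across one line of faded print:
# 0 = black, 255 = white. Faded ink reads about 150, aged paper about 210.
FADED_ROW = [210, 208, 150, 148, 212, 151, 209, 149, 150, 211]


def to_bilevel(pixels, threshold):
    """Return 1 for ink and 0 for paper, using a single global threshold."""
    return [1 if p < threshold else 0 for p in pixels]


def midpoint_threshold(pixels):
    """Pick a threshold halfway between the darkest and lightest sample."""
    return (min(pixels) + max(pixels)) // 2


# Scan-time decision with a fixed mid-grey threshold: every sample is
# lighter than 128, so the faded ink is classed as paper and lost.
scan_time = to_bilevel(FADED_ROW, 128)

# Later decision made from the retained greyscale data: the threshold is
# chosen from the page itself, so the ink is separated from the paper.
later = to_bilevel(FADED_ROW, midpoint_threshold(FADED_ROW))

print("fixed threshold at scan time:", scan_time)
print("threshold chosen afterwards: ", later)
```

Once only the first result has been saved there's nothing left to recover, whereas the greyscale samples can be re-thresholded as many times as needed.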
I suppose it's one of those situations where you end up throwing away
information no matter what (after all, any scan is essentially a digital
representation of analogue data), but there's a danger of throwing away too
much data - and for rare docs you might only get the chance to scan them once.
For rare items I'd rather have maximum quality "just in case", even if it
does mean more storage space.
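
To put a rough number on "more storage space", here's a back-of-envelope sketch of raw (uncompressed) page sizes. The US Letter page size and 300 dpi are assumptions for the example, and real files will be smaller once compressed, with bi-level typically compressing best.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed page: 8.5 x 11 inches scanned at 300 dpi.
PAGE = (8.5, 11, 300)

bilevel = raw_page_bytes(*PAGE, 1)    # 1 bit per pixel
grey = raw_page_bytes(*PAGE, 8)       # 8 bits per pixel
colour = raw_page_bytes(*PAGE, 24)    # 24 bits per pixel (RGB)

for name, size in (("bi-level", bilevel), ("greyscale", grey), ("colour", colour)):
    print(f"{name:10s} {size / 1_000_000:6.1f} MB raw")
```

So before compression greyscale costs 8 times the space of bi-level and colour 24 times.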
cheers
Jules