If you OCR, always archive the bitmaps too - Re: Regarding Manuals
Antonio Carlini
a.carlini at ntlworld.com
Sun Sep 27 10:49:25 CDT 2015
On 27/09/15 15:08, Johnny Billquist wrote:
>
> Errors are always bad. Agreed. That is not something we're discussing
> here.
>
> I don't have problems reading the current scans, as such. But when
> having ten of these open at the same time, and scrolling through them,
> it becomes obvious that the bitmaps are heavy. It can take a while for
> the screen to be updated. Not to mention the problems you sometimes
> hit with searching...
>
I think we are discussing errors. I did try to OCR stuff when I first
started scanning, but I didn't find anything that could do an even
marginally acceptable job. That's perhaps less of an issue for War and
Peace, but it's pretty serious for a technical manual.
I understand that having multiple 100MiB+ documents open at once will be
sluggish, certainly compared to those same documents once they've been
through OCR. However, if I scan something and make the raw scan
available, someone can OCR it later (and re-upload just the OCR version
if they want). If I OCR it and don't make the raw scan available, then
people are potentially stuck with whatever OCR could manage in 2015 (or
earlier), and future-you will rightly curse me ten years down the line
(I'm assuming that OCR is getting better with time, of course...).
Antonio
arcarlini at iee.org