scanning a ton of documentation

geneb geneb at
Wed Sep 22 16:51:04 CDT 2021

On Wed, 22 Sep 2021, Christian Corti via cctalk wrote:

> On Wed, 22 Sep 2021, Jay Jaeger wrote:
>> B/W, CCITT Group 4 tiffs at 400dpi is what I do, but then I also
> 600 DPI should be the absolute minimum today. There is absolutely no reason 
> to go below that for B/W.

>> Bitsavers will post process and create a searchable PDF
> Since when?

No idea, but the IA will process the upload into HTML, plain text, mobi, 
and OCR'd PDF.

If I'm scanning bound books, those end up as individual tiff images (one 
per page).  At the end of processing those, they get stuffed into a zip 
file with the suffix "cbz" (Comic Book Zip) and once uploaded, the derive 
task at the IA handles all the OCR work as well as creating those other 
formats.  I'm pretty sure it will do the same with uploaded PDF files.
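
For anyone wanting to script the packing step, here is a minimal sketch in Python. It is not my actual tooling. The function name pack_cbz, the directory layout and the file names are all made up for illustration. It assumes the page images sit in one directory and that their names sort into page order (zero-padded numbers, for example). It also stores the files uncompressed, on the assumption that Group 4 tiffs are already compressed and gain little from being deflated again.

```python
import zipfile
from pathlib import Path


def pack_cbz(page_dir, cbz_path):
    """Pack every .tif/.tiff file in page_dir into a zip archive at cbz_path.

    Hypothetical helper: pages are added in sorted file-name order, so the
    names are assumed to sort into page order (e.g. page_0001.tif).
    Returns the list of file names that were written.
    """
    pages = sorted(
        p for p in Path(page_dir).iterdir()
        if p.is_file() and p.suffix.lower() in (".tif", ".tiff")
    )
    # A .cbz is an ordinary zip file with a different suffix.
    # ZIP_STORED skips recompression of the already-compressed images.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for page in pages:
            # Store only the bare file name, not the directory path.
            archive.write(page, arcname=page.name)
    return [p.name for p in pages]
```

Something like pack_cbz("scans/some_manual", "some_manual.cbz") would then produce the file to upload. Both paths there are placeholders.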


Proud owner of F-15C 80-0007 - The only one of its kind. - Go Collimated or Go Home.
Some people collect things for a hobby.  Geeks collect hobbies.

ScarletDME - The red hot Data Management Environment
A Multi-Value database for the masses, not the classes. - Get it _today_!
