Scanning Suggestions (Bookmarks & Colour)

æstrid smith Astrid at xrtc.net
Sat Aug 28 03:21:06 CDT 2021


i've had satisfactory results palettizing scans of low-color-depth material using a tool called 'noteshrink':

https://mzucker.github.io/2016/09/20/noteshrink.html
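
for anyone wondering what palettizing amounts to, here is a rough sketch of the general idea. it is not noteshrink's actual code: the function names and the hand-picked ink palette are assumptions made up for illustration, and as far as i understand it the real tool chooses its palette automatically by clustering sampled pixels. the sketch treats the most common colour as the paper background and snaps every pixel to the nearest palette entry.

```python
from collections import Counter


def most_common_colour(pixels):
    """Return the most frequent RGB tuple, taken here as the paper background."""
    return Counter(pixels).most_common(1)[0][0]


def nearest(colour, palette):
    """Index of the palette entry closest to `colour` (squared RGB distance)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(colour, palette[i])),
    )


def palettize(pixels, ink_colours):
    """Map each pixel to an index into [background] + ink_colours."""
    palette = [most_common_colour(pixels)] + list(ink_colours)
    return palette, [nearest(p, palette) for p in pixels]


# Toy "scan": off-white paper, near-black text, a little red terminal input.
scan = [(250, 248, 245)] * 6 + [(20, 22, 25)] * 3 + [(200, 30, 40)] * 2
# Hypothetical ink palette: pure black and a strong red.
palette, indexed = palettize(scan, [(0, 0, 0), (220, 0, 0)])
```

the indexed page needs only a couple of bits per pixel, which is why mostly-black-and-white pages with a little coloured text can shrink so well this way.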

-- 
æstrid smith (she/her)
=<[ c y b e r ]>=
antique telephone collectors association member #4870



On Fri, Aug 27, 2021, at 13:50, Antonio Carlini via cctalk wrote:
> I have a few manuals to scan and I'm looking for suggestions, about how 
> to add bookmarks and how to handle colour.
> 
> Bookmarks should be easier, so let's start with that. I want to add 
> bookmarks (or whatever they are called) so that it is easy to navigate 
> to page "2-48" or "C-17" in a document. Many of the PDFs on bitsavers 
> have that and I've found it very useful so I'd like to do that for my 
> future scans. I've tried with pdftk (the Java port as the original is no 
> longer available on my distro) but that failed. So I tried GhostScript 
> and that also failed, while also rewriting the PDF to be considerably 
> larger. Is there a simple way to achieve this (ideally from the CLI)?
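
one possible approach, sketched under assumptions: pdftk's dump_data output describes each bookmark with BookmarkBegin, BookmarkTitle, BookmarkLevel and BookmarkPageNumber records, and its update_info operation reads the same format back in. the helper name, the titles and the page numbers below are hypothetical; whether this works with the Java port on a given distro would need checking.

```python
def bookmark_records(entries):
    """Build pdftk-style bookmark records from (title, page, level) tuples.

    `page` is the 1-based physical page in the PDF, not the printed label.
    """
    lines = []
    for title, page, level in entries:
        lines += [
            "BookmarkBegin",
            f"BookmarkTitle: {title}",
            f"BookmarkLevel: {level}",
            f"BookmarkPageNumber: {page}",
        ]
    return "\n".join(lines) + "\n"


# Hypothetical entries: printed page labels mapped to physical PDF pages.
entries = [
    ("Chapter 2", 31, 1),
    ("2-48", 78, 2),
    ("Appendix C", 301, 1),
    ("C-17", 317, 2),
]
text = bookmark_records(entries)

# Save `text` as bookmarks.txt, then try something like:
#   pdftk manual.pdf update_info bookmarks.txt output manual_bm.pdf
```

keeping the page list in a small script means the bookmark file can be regenerated whenever pages are inserted or replaced.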
> 
> 
> Now for the scanning itself.
> 
> For manuals that are simple monochrome, I plan to scan at 600dpi bilevel 
> G4 encoded, wrapped in PDF.
> For photographs or shaded areas that don't necessarily come out well 
> under those settings, I plan to use 8-bit greyscale. I'd prefer to use 
> 600dpi but I may have to fall back to 300dpi if the per-page file size 
> shoots up too much.
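
to put rough numbers on the 600 vs 300 dpi trade-off, here is a back-of-the-envelope sketch of uncompressed page sizes. the 8.5 x 11 inch page size and the function name are assumptions for illustration, and real files will be smaller once compressed.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed page size: 8.5 x 11 inches.
grey_600 = raw_page_bytes(8.5, 11, 600, 8)   # 8-bit greyscale at 600 dpi
grey_300 = raw_page_bytes(8.5, 11, 300, 8)   # 8-bit greyscale at 300 dpi
bilevel_600 = raw_page_bytes(8.5, 11, 600, 1)  # 1-bit bilevel at 600 dpi
```

halving the resolution quarters the pixel count, so the 300 dpi greyscale page is a quarter of the raw size of the 600 dpi one, while a 600 dpi bilevel page is an eighth of it before G4 compression is applied.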
> 
> The real issue is colour. I know that various people have looked at the 
> issue of how to efficiently scan pages that are mostly black and white 
> but have some coloured text (RSX-11 manuals and early VMS manuals did 
> this to highlight terminal input, for example). I don't think this is a 
> solved problem and I'm not expecting a solution; what I'm really looking 
> for is to check that what I'm about to produce will have all the 
> information that a future efficient algorithm is likely to need.
> 
> I'm going to start by scanning the whole manual as though it had no 
> colour (so 600 dpi bilevel G4 encoded, except for pages with photos and 
> shading and so on). Then I'm going to go back and rescan the pages that 
> have colour and scan those at 600 dpi and save as a JPG. Then I'll 
> produce a final PDF with the colour pages inserted. I'll also produce a 
> PDF with the B&W pages that were replaced by colour pages (I assume OCR 
> will be better served by non-jaggy scans).
> 
> So the final outputs will be:
> manual.pdf  - the whole manual, including whole pages scanned as colour 
> if any colour is present on them
> manual_BW.pdf  - the G4-encoded bilevel pages that were replaced by 
> colour pages
> 
> Thanks
> 
> 
> Antonio
> 
> 
> -- 
> 
> Antonio Carlini
> antonio at acarlini.com
> 
> 

