On Fri, Aug 11, 2000 at 05:37:50PM -0400, Jim Oaks wrote:
> Hello everyone, id like to announce a new project im starting, The
> ClassicMag project. The goal of this project is to preserve classic
Have you found a good technique for scanning them? It's labor-intensive
flipping the pages by hand. Also, the ongoing problem I've had with
half-tone images is that the dots beat with the sample dots of the scanner
and create a moire pattern. In 1994 I worked with a high-end Arcus
scanner which could correct this beautifully; most likely it was done
in software. But every time I ask supposedly knowledgeable people how
to do this in software I get lame answers like "just scan at a higher
resolution" or "scan at the same resolution as the dot pitch of the
screen" or "use Gaussian blur to smooth the dots together". The first preserves
the screen very well, but that is not the goal; the goal is to recreate
the original unscreened photo as accurately as possible. The second
does not take into account the fact that scanners do not adjust their
optical resolution on demand: a fundamentally 300 DPI CCD told to scan
at 133 DPI is still going to beat with the halftone dots. The third
throws away a lot of information. I suspect the right solution involves
a really high-resolution scan, so that each halftone dot is an image
in itself, followed by some software technique. The maximum recoverable
resolution is probably the same as the dot pitch... or maybe a bit more
with the use of edge detection or something like that.
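
To put rough numbers on the beating: the sketch below folds the screen
frequency and its harmonics back against the sensor's native sampling
rate, which is the standard aliasing arithmetic. The function names and
the choice of three harmonics are my own assumptions for illustration;
133 and 300 are just the figures used above.

```python
def alias_frequency(signal_per_inch, samples_per_inch):
    """Apparent frequency (cycles/inch) after sampling.

    Folds the signal frequency onto the nearest multiple of the
    sampling rate, which is where the beat pattern shows up.
    """
    nearest = round(signal_per_inch / samples_per_inch)
    return abs(signal_per_inch - nearest * samples_per_inch)


def moire_report(screen_lpi, sensor_dpi, harmonics=3):
    """Return (harmonic, alias cycles/inch, beat period in inches) rows.

    Halftone dots have hard edges, so the screen carries energy at
    multiples of its pitch; each multiple is folded separately.
    """
    rows = []
    for k in range(1, harmonics + 1):
        f = alias_frequency(k * screen_lpi, sensor_dpi)
        period = None if f == 0 else 1.0 / f
        rows.append((k, f, period))
    return rows


if __name__ == "__main__":
    # Hypothetical example: 133-line screen on a 300 DPI sensor.
    for k, f, period in moire_report(133, 300):
        print(k, f, period)
```

With a 133-line screen on a 300 DPI sensor, the second harmonic (266 per
inch) folds down to 34 cycles per inch, a pattern repeating roughly every
0.03 inch, which is coarse enough to see easily.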
I doubt Acrobat solves this problem, but alas, it's not free software,
and it runs only on Macs and Windows machines, and costs significant
cash so I've never gotten around to trying it out... no doubt I would
be disappointed anyway. The Arcus scanner driver required you to specify
the screen pitch. I think most likely any software which does not ask
you for this information is not going to do a very good job on screened
images.
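
As an illustration of what a pitch-aware approach could look like at its
very crudest, here is a sketch that averages a high-resolution grayscale
scan over each halftone cell, giving one gray value per dot. It assumes
an axis-aligned screen and a scan resolution that is an exact multiple
of the pitch, neither of which holds for real magazines (screens are
normally rotated). The function name is made up, and I am not claiming
this is what the Arcus driver did.

```python
def descreen_by_cell_average(pixels, scan_dpi, screen_lpi):
    """Collapse a grayscale scan to one value per halftone cell.

    pixels is a list of rows of 0-255 values. Assumption: the screen
    is axis-aligned and scan_dpi is a whole multiple of screen_lpi,
    so every cell is a square of whole pixels.
    """
    if scan_dpi % screen_lpi != 0:
        raise ValueError("this sketch only handles whole-pixel cells")
    cell = scan_dpi // screen_lpi  # pixels along one side of a cell
    rows_out = len(pixels) // cell
    cols_out = len(pixels[0]) // cell
    out = []
    for r in range(rows_out):
        row = []
        for c in range(cols_out):
            total = 0
            # Sum every scan pixel that falls inside this cell.
            for y in range(r * cell, (r + 1) * cell):
                for x in range(c * cell, (c + 1) * cell):
                    total += pixels[y][x]
            row.append(total / (cell * cell))
        out.append(row)
    return out
```

The output has exactly one sample per halftone cell, which matches the
guess above that the recoverable resolution is about the dot pitch.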
Anyway... the ideal scanner would be able to accept a stack of unbound
magazine pages, hundreds at a time, and churn through them without
misfeeding; and there would be image processing software to get rid of
the moire patterns and make smooth photos of any areas which are screened,
always detecting those areas correctly (as opposed to text areas);
while also doing Acrobat's tricks of trying to OCR the text, and
substituting bitmap areas when that's not possible. I haven't started
this project myself because I can't find a method that doesn't suck.
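
One heuristic for telling screened areas from text, offered only as a
sketch: halftone dots repeat at the screen pitch, so a scanline through
a screened area correlates strongly with itself shifted by one dot
period, while text strokes do not. The function names and the 0.5
threshold are assumptions I picked for illustration, and a real detector
would need to work in two dimensions and cope with rotated screens.

```python
def periodicity_score(row, period):
    """Normalized autocorrelation of one scanline at the given lag.

    row is a list of 0-255 values; period is the dot spacing in
    pixels. Values near 1 mean the row repeats at that spacing.
    """
    n = len(row) - period
    if n <= 0:
        raise ValueError("row must be longer than the period")
    mean = sum(row) / len(row)
    num = sum((row[i] - mean) * (row[i + period] - mean) for i in range(n))
    den = sum((v - mean) ** 2 for v in row)
    # A perfectly flat row has no structure at all, so score it as 0.
    return 0.0 if den == 0 else num / den


def looks_screened(row, period, threshold=0.5):
    """Guess whether a scanline crosses a halftone area.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return periodicity_score(row, period) >= threshold
```

Text printed over a screened background would likely still score as
screened by this test.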
Oh... and even if you had all this, some magazines have "sidebars"
which are text printed on top of a screened color background; and those
usually get detected as images by OCR software, so the text ends up
not being searchable. I have tried this using OmniPage, and its
image/text differentiation is usually wrong several times per page.
Meanwhile, there is a little-known image format called DjVu which
I have successfully used; it doesn't try to do any OCR, but it does
manage to store a surprisingly good image in a small amount of space.
The license terms for it are also not perfect but at least there is
a Linux version. You need a free browser plugin to view the images.
More info at
http://www.djvu.att.com/
For example, check out the Radio Shack DX300 shortwave receiver
service manual at
http://gw.kb7pwd.ampr.org/manuals/DX300/
Oh, and in the end the majority of publishers would probably object to
this anyway... especially since they're such big conglomerates, one
solid complaint could destroy the biggest share of your work.
I'm interested in putting my small collection of Heathkit manuals online
too, but those copyrights are owned by a company which sells reprints,
and no doubt pursues copyright violations vigorously, since the reprints
are their main profit center.
Maybe Freenet will enable distribution of this stuff.
--
_______ Shawn T. Rutledge / KB7PWD ecloud(a)bigfoot.com
(_ | |_)
http://www.bigfoot.com/~ecloud kb7pwd(a)kb7pwd.ampr.org
__) | | \________________________________________________________________