Re: OCR --
Trouble I've had is (and this is just pickiness, if the
actual info's all you care about then it's no prob) you invariably
lose the font and other aspects of the original appearance of the
document, which is a bummer. I converted a PDF of Sun Remarketing's
Lisa DIY guide into HTML with images because I wanted search engines
to be able to index the content.
The ability to search the PDF would be nice, but I think the
amount of work required to do the OCR and then do all the formatting
and such would outweigh that benefit, though the OCR'd PDFs tend to
be smaller as well. I'd prefer to keep the original layout, fonts
and all, though.
What comes as a relief, if you have software that supports it, is that
you can have a PDF with two layers: a searchable OCRed layer and a viewable pixel
layer. You view the pixels and search the OCRed text. I have been
told it works nicely.
-Gunther
--
Gunther Schadow, M.D., Ph.D. gschadow(a)regenstrief.org
Medical Information Scientist Regenstrief Institute for Health Care
Adjunct Assistant Professor Indiana University School of Medicine
tel:1(317)630-7960