On Wed, 2005-06-01 at 22:28 -0700, Tom Jennings wrote:
> Please do not waste any time making new PDF documents,
On Wed, 1 Jun 2005, Eric Smith wrote:
Please don't waste any time complaining about PDF documents. In many
cases, you're lucky to get the data in any form at all.
While I do believe PDFs are often abused and partially deserve
their bloated reputation (Adobe pushes them too much), they can be
used properly as containers for multiple-component documents.
It's unfortunate that it's so much work to hand-type or "OCR" old
documents to produce PDFs as lovely as MSC's.
Well, the best that can be hoped for is that OCR technology will
gradually improve (one of the reasons I don't personally scan stuff as
bi-level). Obviously the ultimate goal would be to have text stored as
text (irrespective of surrounding markup - RTF / Word doc / HTML etc.)
but the technology's probably not quite there yet.
For the near future, the question's probably whether PDF, some other
encapsulation format, or a separate scan-per-page approach is the better
choice (and there are the questions of what resolution / bit depth to
scan at, whether any intermediate processing should be done to correct
skewed pages, etc.).
cheers
Jules