From: "der Mouse" <mouse at rodents.montreal.qc.ca>
Sent: Friday, June 03, 2005 3:06 PM
If you have to muck with a PDF then the PDF should never have been
generated.
And that is what I (and others) have been saying all along: please
don't shove scans into a PDF!
PDF is an END format -- it is assumed that the information and
graphics are perfect BEFORE creating it.
It thus assumes there will never be any reason to pull it apart. Even
if the information *is* perfect, that is often false.
If you have bad PDF files that you have a need to manipulate, blame
the person who created the PDF, not the PDF format itself!
That's exactly what we have been doing - or more precisely, we've been
blaming those who chose PDF as the format for distributing
documentation scans.
Since no format will be perfect for all uses, choosing a packaging
format which is designed around the assumption that the package will
never be pulled apart is...broken.
What *is* your need to extract images out of PDF?
Usually, to postprocess the page scan for better results, whatever
"better" means at the moment in question.
For example, a page scan may have a different white point on one side
than the other, and I may want to remove that bias before printing. I
may want to take a greyscale-scanned page and convert it to bilevel for
printing (the printer's conversion, usually by dithering, will not
always be the best for what I want). I may want to throw the scan at
some OCR technology. I may even want to just look at it, or part of
it, on-screen - if you think PDF displayers are always at least as good
at displaying scanned page images as programs designed for image
display, you have either an unrealistically good impression of PDF
displayers or an unrealistically bad impression of image displayers.
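As a rough illustration of the first two postprocessing steps mentioned above (removing a side-to-side white-point bias, then converting greyscale to bilevel), here is a minimal Python sketch. It assumes the page has already been extracted from the PDF as a list of rows of 0-255 grey values, and that the brightest pixel in each column is blank paper; the names flatten_white_point and to_bilevel are made up for this example, and a plain fixed threshold stands in for whatever conversion one would actually prefer over the printer's dithering.

```python
def flatten_white_point(page):
    """Rescale each column so its brightest pixel maps to 255.

    Assumption: the brightest pixel in a column is blank paper, so a
    left-to-right drift in the white point shows up as differing
    column maxima.
    """
    height = len(page)
    width = len(page[0])
    col_white = [max(page[y][x] for y in range(height)) for x in range(width)]
    return [
        [
            # An all-black column has no white reference; leave it black.
            min(255, page[y][x] * 255 // col_white[x]) if col_white[x] else 0
            for x in range(width)
        ]
        for y in range(height)
    ]


def to_bilevel(page, threshold=128):
    """Fixed-threshold conversion: 1 = white, 0 = black."""
    return [[1 if value >= threshold else 0 for value in row] for row in page]


# Tiny hypothetical scan: the left column is much darker than the right.
scan = [
    [120, 240],  # blank paper on both sides
    [30, 60],    # ink on both sides
]
naive = to_bilevel(scan)
corrected = to_bilevel(flatten_white_point(scan))
```

With the raw scan, the fixed threshold turns the blank paper on the darker side black; after flattening the white point, both sides come out white and the ink stays black. None of this is possible without first getting the page image back out of the PDF.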
/~\ The ASCII                           der Mouse
\ / Ribbon Campaign
 X  Against HTML               mouse at rodents.montreal.qc.ca
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Instead of blaming people like myself for doing it "the wrong way", why
don't you do it and let us judge if we like it your way?
If we like what you are doing better then we can all stop and let you do it
all ;-)
I'll set aside a few gigabytes for all the manuals you scan.
Randy