On Thu, 2005-05-19 23:37:18 +0000, Jules Richardson <julesrichardsonuk at
yahoo.co.uk> wrote:
On Thu, 2005-05-19 at 16:02 -0700, Al Kossow wrote:
I tend to 'explode' any PDF files of scans (from whatever source) here
once downloaded into their own directory; I just find it easier to
manipulate via whatever image tool is most suitable for whatever I'm
doing at the time, rather than being stuck with a PDF viewer. I suppose
if I wanted to add metadata to that, I'd include an ASCII text file in
the directory full of images with the relevant info in (I've done that
with ROM and Disk images many a time; not needed to do it with Doc scans
yet*)
That's basically what I did for the TeX+Images -> PDF script, one .TXT
file per image. This way, you can easily distribute work if needed.
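For illustration only, the one-.TXT-per-image layout could be walked with something like the sketch below. It is not the actual TeX+Images -> PDF script; the function names, the set of image suffixes, and the rule that "page001.png" pairs with "page001.txt" are all assumptions made for the example.

```python
from pathlib import Path

# Assumed set of scan formats; adjust to whatever the scans actually use.
IMAGE_SUFFIXES = {".png", ".tif", ".tiff", ".jpg", ".jpeg"}


def collect_pages(scan_dir):
    """Return (image_path, metadata_text) pairs in sorted filename order.

    metadata_text is None when an image has no sidecar .txt file yet.
    Hypothetical helper, assuming the sidecar shares the image's base name.
    """
    pages = []
    for image in sorted(Path(scan_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip the .txt files themselves and anything else
        sidecar = image.with_suffix(".txt")
        if sidecar.is_file():
            text = sidecar.read_text(encoding="utf-8").strip()
        else:
            text = None
        pages.append((image, text))
    return pages


def missing_metadata(scan_dir):
    """List the image file names that still need a sidecar file written."""
    return [image.name for image, text in collect_pages(scan_dir) if text is None]
```

Because every page carries its own text file, missing_metadata() gives a simple to-do list that can be split between several people.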
Maybe I'm atypical in usage :-) I'll rarely want to download scans and
*not* keep a copy on local storage just in case, so I've never used the
"view a PDF file in a web browser" side of things.
No, I consider that as normal, fair and expected usage :)
Just an honest question. There is some kind of warez scene
scanning/OCRing/correcting scans of current best-selling books. How do
they do the job? They've basically got to solve the very same problem.
If something different comes around, the PDF spec is public, and by using
such a small subset it should be simple to translate.
Yep true... plenty of tools already exist to pull PDF files apart. Well,
you'll be converting all your bitsavers content to futurekeep format
soon :-)
Pointers for tools? Even while I'm out of time, I'd like to learn more:)
MfG, JBG
--
Jan-Benedict Glaw jbglaw at lug-owl.de . +49-172-7608481 _ O _
"Eine Freie Meinung in einem Freien Kopf | Gegen Zensur | Gegen Krieg _ _ O
fuer einen Freien Staat voll Freier Bürger" | im Internet! | im Irak! O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));