Better indexing on bitsavers

20 May 2005

On Thu, 2005-05-19 23:37:18 +0000, Jules Richardson <julesrichardsonuk at
yahoo.co.uk> wrote:
...
  On Thu, 2005-05-19 at 16:02 -0700, Al Kossow wrote:
 I tend to 'explode' any PDF files of scans (from whatever source) here
 once downloaded into their own directory; I just find it easier to
 manipulate via whatever image tool is most suitable for whatever I'm
 doing at the time, rather than being stuck with a PDF viewer. I suppose
 if I wanted to add metadata to that, I'd include an ASCII text file in
 the directory full of images with the relevant info in (I've done that
 with ROM and Disk images many a time; not needed to do it with Doc scans
 yet*) 
That's basically what I did for the TeX+Images -> PDF script, one .TXT
file per image. This way, you can easily distribute work if needed.
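
A minimal sketch of that one-.TXT-per-image layout, not the actual script: it walks a directory of page scans and pairs each image with a text file of the same base name, if one exists. The function name collect_pages, the list of image suffixes, and the UTF-8 encoding are assumptions made for illustration only.

```python
from pathlib import Path

# Assumed scan formats; adjust to whatever the scans actually use.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def collect_pages(scan_dir):
    """Return (image_name, metadata_text) pairs in sorted file order.

    Hypothetical layout: page001.tif may have a sidecar page001.txt
    (or page001.TXT) holding free-form metadata for that page.
    """
    entries = sorted(Path(scan_dir).iterdir())
    # Index the text files by base name, ignoring the case of the suffix.
    notes = {p.stem: p for p in entries if p.suffix.lower() == ".txt"}
    pages = []
    for image in entries:
        if image.suffix.lower() in IMAGE_SUFFIXES:
            note = notes.get(image.stem)
            text = note.read_text(encoding="utf-8").strip() if note else ""
            pages.append((image.name, text))
    return pages
```

Because every page carries its own text file, several people can fill in metadata for different pages without touching the same file.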
...
 Maybe I'm atypical in usage :-) I'll rarely want to download scans and
 *not* keep a copy on local storage just in case, so I've never used the
 "view a PDF file in a web browser" side of things.  
No, I consider that as normal, fair and expected usage :)
Just an honest question: there is some kind of warez scene
scanning/OCRing/correcting scans of current best-selling books. How do
they do the job? They've basically got to solve the very same problem.
...
 If something different comes around, the PDF spec is public, and by using
 such a small subset it should be simple to translate. 
 Yep true... plenty of tools already exist to pull PDF files apart. Well,
 you'll be converting all your bitsavers content to futurekeep format
 soon :-) 
Pointers for tools? Even while I'm out of time, I'd like to learn more:)
MfG, JBG
--
Jan-Benedict Glaw       jbglaw at lug-owl.de    . +49-172-7608481             _ O _
"Eine Freie Meinung in  einem Freien Kopf    | Gegen Zensur | Gegen Krieg  _ _ O
 fuer einen Freien Staat voll Freier Bürger" | im Internet! |   im Irak!   O O O
ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA));
