Scanning docs for bitsavers
cctalk at gtaylor.tnetconsulting.net
Mon Dec 2 20:08:38 CST 2019
On 12/2/19 5:34 PM, Guy Dunphy via cctalk wrote:
Interesting comments, Guy.
I'm completely naive when it comes to scanning things for preservation.
Your comments do pass my naive understanding.
> But PDF literally cannot be used as a wrapper for the results,
> since it doesn't incorporate the required image compression formats.
> This is why I use things like html structuring, wrapped as either a zip
> file or RARbook format. Because there is no other option at present.
> There will be eventually. Just not yet. PDF has to be either greatly
> extended, or replaced.
I *HATE* doing anything with PDFs other than reading them. My opinion
is that PDF is where information goes to die. Creating the PDF was the
last time that anything other than a human could use the information as
a unit. From then on, it's all chopped-up lines of text that may
be in a nonsensical order. I believe it will take humans (or something
yet to be created with human-like ability) to make sense of the content
and recreate it in a new form for further consumption.
Have you looked at ePub? My understanding is that an ePub file is a
zip of a directory structure of HTML and associated files. That sounds
quite similar to what you're describing.
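For what it's worth, here is a rough Python sketch of that layout as I
understand it. It is not a complete or valid ePub: a real one also needs
META-INF/container.xml and a package document, which I've left out. The
function names and the "OEBPS" directory are just illustrative choices
on my part.

```python
import io
import zipfile


def build_epub_like_archive(pages):
    """Pack HTML pages into an in-memory zip laid out roughly like an ePub.

    `pages` maps a file name to its HTML text. This is only a sketch of
    the container idea, not a spec-complete ePub writer.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # ePub expects an uncompressed "mimetype" entry first in the archive.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # "OEBPS" is a common convention for the content directory (assumed here).
        for name, html in pages.items():
            zf.writestr("OEBPS/" + name, html,
                        compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_html_members(data):
    """Return the names of the HTML/XHTML files inside a zip held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [n for n in zf.namelist() if n.endswith((".html", ".xhtml"))]


archive = build_epub_like_archive({"page1.xhtml": "<html><body>Scan 1</body></html>"})
html_files = list_html_members(archive)
```

The point is just that the content stays as ordinary files that any zip
tool can pull back out, which is what makes it more reusable than a PDF.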
> And that's why I get upset when people physically destroy rare old
> documents during or after scanning them currently. It happens so
> frequently, that by the time we have a technically adequate document
> coding scheme, a lot of old documents won't have any surviving
> paper copies. They'll be gone forever, with only really crap quality
> scans surviving.
Grant. . . .
unix || die