On 12/2/19 5:34 PM, Guy Dunphy via cctalk wrote:
Interesting comments, Guy.
I'm completely naive when it comes to scanning things for preservation.
Your comments do pass my naive understanding.
> But PDF literally cannot be used as a wrapper for the results, since
> it doesn't incorporate the required image compression formats. This
> is why I use things like html structuring, wrapped as either a zip
> file or RARbook format. Because there is no other option at present.
> There will be eventually. Just not yet. PDF has to be either greatly
> extended, or replaced.
I *HATE* doing anything with PDFs other than reading them. My opinion
is that PDF is where information goes to die. Creating the PDF was the
last time that anything other than a human could use the information as
a unit. From then on, it's all chopped-up lines of text that may be in
a nonsensical order. I believe it will take humans (or something yet to
be created with human-like ability) to make sense of the content and
recreate it in a new form for further consumption.
Have you looked at ePub? My understanding is that an ePub file is a
zip of a directory structure of HTML and associated files. That sounds
quite similar to what you're describing.
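
To make the "zip of HTML" idea concrete, here is a rough sketch in
Python of packing a few HTML pages into an ePub-style archive and
listing what ends up inside. It is only an illustration: the function
names and the content/ directory are made up for the example, and the
result is not a complete ePub (a real one also needs a
META-INF/container.xml and a package document). The one ePub rule it
does follow is that the first entry is an uncompressed file named
"mimetype".

```python
import io
import zipfile


def build_bundle(pages):
    """Pack HTML pages (name -> markup) into an ePub-style zip, returned as bytes.

    Hypothetical helper for illustration; not a full ePub writer.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # ePub expects an uncompressed "mimetype" entry first in the archive.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # "content/" is an arbitrary directory name chosen for this sketch.
        for name, markup in pages.items():
            zf.writestr(f"content/{name}", markup,
                        compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_bundle(data):
    """Return the entry names in the archive, in the order they were stored."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


if __name__ == "__main__":
    bundle = build_bundle({"index.html": "<h1>Manual</h1>",
                           "page1.html": "<p>Scanned page text</p>"})
    for entry in list_bundle(bundle):
        print(entry)
```

Since the container is plain zip and the contents are plain HTML, any
ordinary unzip tool and browser can get at the pieces, which I take to
be the attraction of this kind of wrapper.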
> And that's why I get upset when people physically destroy rare old
> documents during or after scanning them currently. It happens so
> frequently, that by the time we have a technically adequate document
> coding scheme, a lot of old documents won't have any surviving
> paper copies. They'll be gone forever, with only really crap quality
> scans surviving.
Fair enough.
--
Grant. . . .
unix || die