Scanning docs for bitsavers

2 Dec 2019

On 12/2/19 5:34 PM, Guy Dunphy via cctalk wrote:
Interesting comments, Guy.
I'm completely naive when it comes to scanning things for preservation.
Your comments do pass my naive understanding.
...
  But PDF literally cannot be used as a wrapper for the results,
 since it doesn't incorporate the required image compression formats.
 This is why I use things like html structuring, wrapped as either a zip
 file or RARbook format. Because there is no other option at present.
 There will be eventually. Just not yet. PDF has to be either greatly
 extended, or replaced.
I *HATE* doing anything with PDFs other than reading them.  My opinion
is that PDF is where information goes to die.  Creating the PDF was the
last time that anything other than a human could use the information as
a unit.  Now, in the future, it's all chopped-up lines of text that may
be in a nonsensical order.  I believe it will take humans (or something
yet to be created with human-like ability) to make sense of the content
and recreate it in a new form for further consumption.
Have you done any looking at ePub?  My understanding is that they are a
zip of a directory structure of HTML and associated files.  That sounds
quite similar to what you're describing.
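
A minimal sketch of that idea in Python, using only the standard library: HTML pages and already-compressed scan images written into one zip archive. The function names (build_html_bundle, list_members) and the file names in the usage note are hypothetical, chosen for illustration. The "mimetype" entry follows the EPUB convention of an uncompressed first member, but this is not a complete, valid EPUB; a real one also needs META-INF/container.xml and a package document.

```python
import io
import zipfile


def build_html_bundle(pages, images=None):
    """Return the bytes of a zip archive holding HTML pages and image files.

    pages:  dict mapping archive path -> HTML text
    images: dict mapping archive path -> raw image bytes
    (Both names and this layout are illustrative assumptions.)
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # EPUB expects the first entry to be an uncompressed "mimetype" file.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        for name, html in pages.items():
            # Text compresses well, so deflate the HTML.
            zf.writestr(name, html, compress_type=zipfile.ZIP_DEFLATED)
        for name, data in (images or {}).items():
            # Scans are already compressed in their own format; store them
            # byte-for-byte so the wrapper never alters the image data.
            zf.writestr(name, data, compress_type=zipfile.ZIP_STORED)
    return buf.getvalue()


def list_members(archive_bytes):
    """Return the member names of a zip archive, in stored order."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.namelist()
```

For example, build_html_bundle({"index.html": "<html>...</html>"}, {"images/page001.png": png_bytes}) gives an archive whose members can be read back with list_members(). Because the images are stored untouched, the wrapper places no restriction on which image compression format is used.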
...
  And that's why I get upset when people physically destroy rare old
 documents during or after scanning them currently. It happens so
 frequently, that by the time we have a technically adequate document
 coding scheme, a lot of old documents won't have any surviving
 paper copies.  They'll be gone forever, with only really crap quality
 scans surviving.
Fair enough.
--
Grant. . . .
unix || die
