Scanning docs for bitsavers

3 Dec 2019

very nice file
yep, we prefer pdf with ocr back stuff
ed smecc.org

In a message dated 12/2/2019 8:20:36 PM US Mountain Standard Time, cctalk at classiccmp.org writes:
I cannot understand your problems with PDF files.
I've created lots and lots of PDFs, with treated and untreated scanned
material. All of them are very readable and in use for years. Of course,
garbage in, garbage out. I take the utmost care in my scans to have good
enough source files, so I can create great PDFs.
Of course, Guy's comments are very informative and I'll learn more from them.
But I still believe in good preservation using PDF files. FOR ME it is the
best we have in encapsulating info. Forget HTMLs.
Please, take a look at this PDF, and tell me: Isn't that good enough for
preservation/use?
https://drive.google.com/file/d/0B7yahi4JC3juSVVkOEhwRWdUR1E/view
Thanks
Alexandre
---8<---Cut here---8<---
http://www.tabajara-labs.blogspot.com
http://www.tabalabs.com.br
---8<---Cut here---8<---
On Tue, Dec 3, 2019 at 00:08, Grant Taylor via cctalk <cctalk at classiccmp.org> wrote:
...
  On 12/2/19 5:34 PM, Guy Dunphy via cctalk wrote:
 Interesting comments Guy.
 I'm completely naive when it comes to scanning things for preservation.
 Your comments do pass my naive understanding.
  But PDF literally cannot be used as a wrapper for the results,
 since it doesn't incorporate the required image compression formats.
 This is why I use things like html structuring, wrapped as either a zip
 file or RARbook format. Because there is no other option at present.
 There will be eventually. Just not yet. PDF has to be either greatly
 extended, or replaced. 
 I *HATE* doing anything with PDFs other than reading them. My opinion
 is that PDF is where information goes to die. Creating the PDF was the
 last time that anything other than a human could use the information as
 a unit. Now, in the future, it's all chopped up lines of text that may
 be in a nonsensical order. I believe it will take humans (or something
 yet to be created with human-like ability) to make sense of the content
 and recreate it in a new form for further consumption.
 Have you done any looking at ePub? My understanding is that they are a
 zip of a directory structure of HTML and associated files. That sounds
 quite similar to what you're describing.
  And that's why I get upset when people physically destroy rare old
 documents during or after scanning them currently. It happens so
 frequently, that by the time we have a technically adequate document
 coding scheme, a lot of old documents won't have any surviving
 paper copies. They'll be gone forever, with only really crap quality
 scans surviving. 
 Fair enough.
 --
 Grant. . . .
 unix || die
