On Fri, 2005-06-03 at 10:20 -0700, Dwight K. Elvey wrote:
From: "Jules Richardson" <julesrichardsonuk at yahoo.co.uk>
On Thu, 2005-06-02 at 23:15 -0700, Eric Smith wrote:
Does any TIFF file of the nature you describe actually exist? PDFs
with both bitmaps and text are not uncommon.
I'm not sure I've seen one, and I've dealt with a *lot* of TIFF files
over the years.
Other metadata such as the app that created the image etc. is quite
common though, and I have a feeling that Photoshop puts in all sorts of
extra tags (I haven't got a copy here with which to check).
Whether any such tags are useful to preserve is another matter.
Personally I like the accountability; I'd like to know who scanned a
document, when they scanned it, what software they used to do the scan.
Mainly because it may help at some future OCR stage in identifying ways
of improving the process or runs of documents that are likely to cause
trouble during the OCR phase. Plus of course it's nice to know who was
responsible for the hard work!
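To make the "who, when, what software" point concrete, here is a minimal sketch of pulling those fields out of a TIFF using only the Python standard library. It assumes a classic (non-BigTIFF) file and looks only at the first IFD. The tag numbers are the baseline TIFF 6.0 ones; the function names read_provenance and build_sample_tiff, and the sample values, are made up for illustration. The sample file deliberately holds no image data at all, so it is not a baseline-conformant TIFF, just enough structure to show the tags.

```python
import struct

# Baseline TIFF 6.0 ASCII tags that record provenance.
PROVENANCE_TAGS = {
    270: "ImageDescription",
    305: "Software",
    306: "DateTime",
    315: "Artist",
    316: "HostComputer",
}
ASCII_TYPE = 2  # TIFF field type for NUL-terminated ASCII strings


def read_provenance(data: bytes) -> dict:
    """Return provenance tags found in the first IFD of a TIFF byte string."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file: bad byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file: bad magic number")

    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    found = {}
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, type, count, value-or-offset.
        entry_pos = ifd_offset + 2 + 12 * i
        tag, ftype, count = struct.unpack_from(order + "HHI", data, entry_pos)
        if tag not in PROVENANCE_TAGS or ftype != ASCII_TYPE:
            continue
        if count <= 4:
            # Short values are stored inline in the entry itself.
            raw = data[entry_pos + 8 : entry_pos + 8 + count]
        else:
            (value_offset,) = struct.unpack_from(order + "I", data, entry_pos + 8)
            raw = data[value_offset : value_offset + count]
        found[PROVENANCE_TAGS[tag]] = raw.rstrip(b"\x00").decode("ascii", "replace")
    return found


def build_sample_tiff(tags: dict) -> bytes:
    """Hypothetical helper: a tiny little-endian TIFF with ASCII tags only."""
    entries = sorted(tags.items())  # IFD entries must be in ascending tag order
    ifd_offset = 8
    data_offset = ifd_offset + 2 + 12 * len(entries) + 4
    ifd = struct.pack("<H", len(entries))
    blob = b""
    for tag, text in entries:
        value = text.encode("ascii") + b"\x00"
        if len(value) <= 4:
            field = value.ljust(4, b"\x00")
        else:
            field = struct.pack("<I", data_offset + len(blob))
            blob += value
            if len(blob) % 2:
                blob += b"\x00"  # keep later values word-aligned
        ifd += struct.pack("<HHI", tag, ASCII_TYPE, len(value)) + field
    ifd += struct.pack("<I", 0)  # offset 0 means there is no further IFD
    return b"II" + struct.pack("<HI", 42, ifd_offset) + ifd + blob


if __name__ == "__main__":
    sample = build_sample_tiff({
        305: "ExampleScan 1.0",       # made-up software name
        306: "2005:06:03 10:20:00",   # TIFF DateTime layout
        315: "A. Volunteer",          # made-up operator
    })
    for name, value in read_provenance(sample).items():
        print(f"{name}: {value}")
```

Running the same reader over a batch of scans before any conversion would give a quick idea of how much of this information a given archive actually carries.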
cheers
Jules
Hi
The biggest problem I think TIFF has is that it is really just
a container and not truly an image format. From what I know,
one can put just about any data stream into a TIFF.
Pretty much - it was geared around storing image data, but it's flexible
enough to be able to store other things alongside the primary image
content (and I suppose store *no* image content if someone wanted to do
that). Much the same way that most document containers can store things
(charts, images etc.) other than text, I suppose.
When I said that I thought that a TIFF format was better for
archiving I intended it to mean a non-compressed scanned image
that is in a form that has little encoding.
Oh, so did I. I don't think I'd advocate that usage of TIFF either; I'd
foresee people using it for page scans only, and hopefully settling on
sensible settings that TIFF readers could handle. My observation was
that conversion to PDF may well be throwing useful info away at present,
as scanning apps / bitmap packages will typically put useful metadata-
type info into a TIFF image that will get lost at the conversion point.
Just because something is capable of doing all sorts of things doesn't
mean that all of them should be used.
At the end of the day it may be that PDF's the best choice (or best of a
bad lot) - I've just not seen a discussion that compares all the options
out there and weighs up the pros and cons (in the narrower context of
paper documentation scanning I suppose, rather than the wider field of
storing electronic documentation, which may typically have different
goals).
The thing not to do I guess is get bitten a few years down the line and
find that you have terabytes of data that need to change format!
cheers
Jules