I *hate* HTML for any document type stuff, but in this
case it seems like it *might* be the best option,
considering the alternatives :-(
HTML is just text and, if push comes to shove, is easy to
discard with a simple matter of programming. For anything
else, (TIFF, PDF, whatever) more programming effort is
required to come up with searchable text.
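
To give a rough idea of how small that "simple matter of programming"
can be, here is a minimal sketch in Python using only the standard
library's html.parser module. The names TextExtractor and html_to_text
are made up for illustration, and it assumes all you want is plain,
searchable text with the markup thrown away:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, dropping tags and script/style bodies."""

    SKIP = {"script", "style"}  # elements whose contents are not document text

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turn &amp; etc. into characters
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the pieces with spaces and collapse runs of whitespace.
        # Caveat: a tag in the middle of a word will split that word.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Hypothetical helper: return the searchable text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Feeding it something like "<p>Hello <b>world</b></p>" should hand back
just the words. It makes no attempt to keep layout or pictures, which
is rather the point.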
If it's simple markup without all sorts of extra
HTML crap then *maybe* it'd be OK. I honestly
don't know :-)
The main advantage of HTML (as I see it) is that you can
keep the flow of text and pictures roughly as it was
meant to be (rather than having text full of "see figure 1").
Of course then there's the question - do you preserve an old
document in the format which the author intended (complete with
typos, any bad layout etc.), or is the intention to just preserve
textual/graphical data - possibly losing font and layout
information in the process?
Well, the British Library are spending a good deal on
digitising all their old stuff (Magna Carta, that kind
of thing). They are (apparently) taking high resolution
photographs (at resolutions sufficiently high that you
would not dare say "600 dpi" in the same room) and
making them available on the net (eventually).
I don't know if they are using digital cameras (not
your average high street jobbies here!) or using
high quality film cameras and then digitising.
Either way, if they are going to all that trouble
for their stuff, surely we can do the same for
stuff that really matters :-)
Antonio
--
---------------
Antonio Carlini arcarlini(a)iee.org