On Sun, 2004-06-27 at 20:41, der Mouse wrote:
>> If you've OCRed the data, HTML is probably fine for pure text.
> Ugh! Please, no! For pure text, use - pure text!
Ahh, my point was that this would be text + images (well, typically).
I *hate* HTML for any document-type stuff, but in this case it seems
like it *might* be the best option, considering the alternatives :-(
If it's simple markup without all sorts of extra HTML crap then *maybe*
it'd be OK. I honestly don't know :-)
Of course then there's the question - do you preserve an old document in
the format which the author intended (complete with typos, any bad
layout, etc.), or is the intention to just preserve textual/graphical
data - possibly losing font and layout information in the process?