TIFF file (big)
The general consensus seems to be bi-level scanning
at a resolution of at least 300dpi, preferably 400dpi
(although I tend to use 600dpi). G4-encoded TIFF is pretty
good space-wise (though obviously lousy compared to text).
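
For a rough feel of what those resolutions mean in raw size, here is
a back-of-envelope sketch in Python. The US Letter page size
(8.5 x 11 inches) and the function name are assumptions of mine for
illustration. It only gives the uncompressed 1-bit bitmap size; how
much G4 shrinks that depends entirely on the page content.

```python
def raw_bilevel_bytes(width_in, height_in, dpi):
    """Uncompressed size in bytes of a 1-bit-per-pixel scan.

    Each row is padded up to a whole number of bytes, as is usual
    for packed bi-level bitmaps.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = (width_px + 7) // 8
    return bytes_per_row * height_px


if __name__ == "__main__":
    # Assumed page size: US Letter, 8.5 x 11 inches.
    for dpi in (300, 400, 600):
        size = raw_bilevel_bytes(8.5, 11.0, dpi)
        print(f"{dpi} dpi: {size / 1_000_000:.2f} MB uncompressed")
```

Going from 300dpi to 600dpi quadruples the pixel count, so the raw
bitmap grows roughly fourfold before compression.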
OCRed ASCII text (ugly)
OCR is (almost) certain to introduce errors. You'll need a
significant investment in proof-reading to fix this!
compressed PostScript of OCRed text (depending on OCR,
could be nice).
If you can OCR, then any format that can represent that text in
whatever fonts and layout the original document used (and that is
efficient and openly documented) should do. Most of my
text documents are PDF. You can turn PDF into text (or HTML, I guess)
where appropriate.
But you cannot OCR (or at least, I bet you cannot OCR without
introducing errors).
Antonio
--
---------------
Antonio Carlini arcarlini(a)iee.org