Jim Battle wrote about DjVu:
> So for OCR purposes, I don't think this type of compression really
> hurts -- it replaces one plausible "e" image with another one.
No, that's exactly the kind of BS you DO NOT WANT for a file that you
plan to OCR. What if you've got a mathematical formula that has some
Latin "e" letters and some Greek epsilons in it? Or perhaps normal
and italic "e" letters? DjVu may well think they are "close enough",
while a good OCR program might be able to tell them apart accurately.
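
To make that concrete, here is a toy sketch in Python. It is NOT DjVu's
actual encoder, just a minimal illustration of the general idea of lossy
symbol matching; the 5x5 bitmaps, the pixel-count `threshold`, and all of
the function names are made up for the example.

```python
# Toy sketch of lossy symbol matching (an assumption-laden illustration,
# not the real DjVu algorithm). Glyphs are 5x5 bitmaps stored as strings.

def hamming(a, b):
    """Count differing pixels between two same-sized bitmaps."""
    return sum(x != y for x, y in zip(a, b))

def encode(glyphs, threshold):
    """Map each glyph to the first dictionary entry that differs by at most
    `threshold` pixels; if none is close enough, add it as a new entry."""
    dictionary, indices = [], []
    for g in glyphs:
        for i, d in enumerate(dictionary):
            if len(d) == len(g) and hamming(d, g) <= threshold:
                indices.append(i)  # "close enough": reuse the stored shape
                break
        else:
            dictionary.append(g)
            indices.append(len(dictionary) - 1)
    return dictionary, indices

def decode(dictionary, indices):
    """Rebuild the page from the dictionary and the index list."""
    return [dictionary[i] for i in indices]

# Hypothetical bitmaps for a Latin "e" and a Greek epsilon.
LATIN_E = (
    ".###."
    "#...#"
    "#####"
    "#...."
    ".###."
)
EPSILON = (
    ".###."
    "#...."
    ".##.."
    "#...."
    ".###."
)

page = [LATIN_E, EPSILON, LATIN_E]

# The two shapes differ in 4 pixels, so a threshold of 4 merges them.
print(decode(*encode(page, 4)) == page)  # False: the epsilon is gone
print(decode(*encode(page, 0)) == page)  # True: exact matching is lossless
```

With the loose threshold the decoded page contains three Latin "e"
shapes, and no OCR program, however good, can get the epsilon back.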
The point of wanting lossless compression is that even if a good
OCR program today can't tell them apart accurately, a good OCR program
ten years from now might.
But if you use lossy compression now, you are likely discarding
information that the OCR program will need.
Eric