On Mon, 8 Apr 2002, Douglas H. Quebbeman wrote:
Looking at it another way, DjVu condenses any image fed into it
into a mathematical expression that, when evaluated, yields as
its result the image of the original document.
So, it's nothing like OCR. If the original image were a page full
of little apples, the program will decide which apple is the best
one, and when it reconstitutes the original image, will put as
many copies of the one apple on the page as the original had. If
there are subtle differences between the apples that the eye
won't readily see, then the reconstituted image won't have those
subtle differences.
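
To make the apple example concrete, here is a toy sketch of the idea in Python. It is not DjVu's actual encoder: the names (hamming, build_dictionary, reconstruct), the tiny 4x4 bitmaps and the max_diff tolerance are all made up for illustration, and it simply keeps the first shape it sees as the representative rather than picking the "best" one.

```python
def hamming(a, b):
    """Count differing pixels between two equal-sized bitmaps (tuples of strings)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def build_dictionary(glyphs, max_diff=1):
    """Assign each glyph to a shared prototype if it is within max_diff pixels."""
    prototypes = []  # one representative bitmap per class of similar shapes
    indices = []     # for each input glyph, the index of its prototype
    for glyph in glyphs:
        for i, proto in enumerate(prototypes):
            if hamming(glyph, proto) <= max_diff:
                indices.append(i)
                break
        else:
            # No close match: this glyph becomes a new prototype (assumption:
            # first-seen wins; a real encoder may choose differently).
            prototypes.append(glyph)
            indices.append(len(prototypes) - 1)
    return prototypes, indices


def reconstruct(prototypes, indices):
    """Rebuild the page by stamping a copy of the prototype at every position."""
    return [prototypes[i] for i in indices]


apple_a = ("0110", "1111", "1111", "0110")
apple_b = ("0110", "1111", "1101", "0110")  # one pixel differs from apple_a
pear = ("0100", "0110", "1111", "1111")

page = [apple_a, apple_b, pear, apple_a]
prototypes, indices = build_dictionary(page, max_diff=1)
restored = reconstruct(prototypes, indices)
```

In this sketch the second apple comes back as an exact copy of the first, so its one-pixel difference is gone, which is the lossy behaviour described above.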
It goes beyond this too; it separates the text and calls that
foreground, and everything that's not text is background. The
background is compressed with a different family of wavelets
than is used for the foreground.
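
The foreground/background split can be pictured with an equally rough sketch. Everything here is an assumption for illustration: the separate function, the fixed threshold of 64 and the mean-fill of the holes are stand-ins, and a plain threshold is certainly much cruder than whatever segmentation DjVu really uses.

```python
def separate(image, threshold=64):
    """Split a grayscale page (rows of 0-255 values) into a text mask and a background.

    Assumption: any pixel darker than `threshold` counts as text.
    """
    # 1 where the pixel is treated as text (foreground), 0 elsewhere.
    mask = [[1 if p < threshold else 0 for p in row] for row in image]

    # Fill the holes left by the text with the mean of the remaining pixels,
    # so the background layer is smooth and easier to compress.
    bg_pixels = [p for row in image for p in row if p >= threshold]
    fill = sum(bg_pixels) // len(bg_pixels) if bg_pixels else 255
    background = [[fill if p < threshold else p for p in row] for row in image]
    return mask, background


page = [
    [200, 200, 10, 200],
    [200, 10, 10, 200],
    [180, 180, 180, 180],
]
mask, background = separate(page)
```

The two layers can then be handed to different compressors, which is the point of separating them in the first place.
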
Nice. This is exactly what I had been looking for. Now if I can only come
up with the funds to buy the system I wanted to use for processing large
volumes of documents :)
-Toth