On 2015-09-27 2:33 PM, Fred Cisin wrote:
On Sun, 27 Sep 2015, Pontus Pihlgren wrote:
It seems to me that a better tool could solve the issue: one that
could display only the OCR'ed content, and the scanned content
only when desired, for instance when you suspect an error.
Is there such a reader? Is the content organised to make it
possible?
I haven't seen one.
I did start trying to write a heuristic probabilistic OCR one 25 years
ago. The idea was to overlay the OCR'd text (displayed in matching
fonts) on the scanned content. ...
DJVU compression is somewhat analogous to this process, ...
There was a somewhat scary case study on the web a few years ago (not
sure if it's still out there; I haven't been able to find it).
Here it is.
The compression method was apparently JBIG2, but it could also affect DJVU.
--Toby
... The risks are obvious(*).
--Toby
* - Hat tip to PGN. comp.risks digest.