So, you're fortunate if you can get 80% accuracy with an OCR engine.
But what if you use multiple OCR programs?  Say, three different OCR
programs, and then process the results taking a majority vote on each
resulting character?  Even if all three mangle 20%, it won't always
be the *same* 20% across all three, right?  (And if I did my math
right, using three 80% accurate programs reduces the error rate to
just 0.8%, or a 99.2% accuracy rate.)
 
That figure is right *if* those 20% errors are independent and randomly
distributed.  I doubt that either part of that is so - they will all
tend to err on the dubious characters (FWVO "dubious"), if nothing
else.
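
For anyone who wants to check the arithmetic, here is a rough sketch
under exactly those assumptions (independent errors, the same 20%
per-character error rate for every engine).  The function name is made
up for illustration.  Note that 0.8% is the chance that all three
engines miss the same character (0.2^3); a plain majority vote can
also go wrong whenever two of the three miss, which under the same
assumptions is about 10.4%, and only if the two wrong engines happen
to agree or the tie is broken badly.

```python
from math import comb

def prob_at_least_k_wrong(n: int, k: int, p_err: float) -> float:
    """Chance that at least k of n engines misread a character,
    assuming independent errors with the same rate p_err (an
    idealization, not a property of real OCR engines)."""
    return sum(
        comb(n, i) * p_err**i * (1 - p_err) ** (n - i)
        for i in range(k, n + 1)
    )

# Three engines, each assumed wrong on 20% of characters.
all_three_wrong = prob_at_least_k_wrong(3, 3, 0.2)
two_or_more_wrong = prob_at_least_k_wrong(3, 2, 0.2)

print(f"all three wrong:   {all_three_wrong:.3%}")
print(f"two or more wrong: {two_or_more_wrong:.3%}")
```

If the engines' errors are correlated, as they likely are on the
dubious characters, both numbers will be worse than this.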
/~\ The ASCII                           der Mouse
\ / Ribbon Campaign
 X  Against HTML               mouse at rodents.montreal.qc.ca
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B