HP developed an OCR engine called Tesseract that is supposed to be pretty
good. They released it to the open-source world, and Google has picked it
up and started working on it.
classiccmp list member James Markevitch has been working on an OCR program
as well, optimized for column-formatted input, like listings.
I was just talking to Doron Swade (the person responsible for the Difference
Engine at the British Science Museum), and he is interested in OCR of
mathematical tables (also column-oriented, like listings).