In article <38547.195.212.29.92.1138178440.squirrel at mail.gjcp.net>,
gordonjcp at gjcp.net writes:
OCR is the hard part and I've yet to hear of anything that is
even close to remotely acceptable. At say 1000 words/page
a success rate of 99.9% still leaves you with one fix up
per page. That's a good chunk of work for even a small manual
(say 200 pages). It's a lot of work for an RT-11 manual set
or similar!
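
As a rough check of the arithmetic above, here is a minimal sketch. The function name and defaults are illustrative only; it assumes errors are counted per word and spread evenly across pages.

```python
def expected_fixups(pages: int, words_per_page: int = 1000,
                    accuracy: float = 0.999) -> float:
    """Expected number of misrecognized words in a document.

    Assumes every word has the same independent chance of being wrong.
    """
    return pages * words_per_page * (1.0 - accuracy)


# One fix-up per page at 1000 words/page and 99.9% accuracy.
per_page = expected_fixups(1)

# A small 200-page manual at the same rate.
small_manual = expected_fixups(200)

print(round(per_page), round(small_manual))
```

At these figures a 200-page manual works out to roughly 200 corrections, which is the "good chunk of work" being described.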
Sounds like an ideal thing for a distributed project.
Give everyone who registers a few pages to proofread, combine into
finished work. If you wanted cross-checking you'd just make sure that
different people got different batches at different times, and diff the
results.
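
The cross-checking idea could be sketched roughly as follows. This is only an illustration: `assign_batches` and `disagreements` are made-up names, the round-robin assignment is one possible scheme, and the "different times" part of the suggestion is not modelled. The diff itself uses Python's standard `difflib`.

```python
import difflib


def assign_batches(pages, readers, copies=2):
    """Give each page to `copies` distinct readers, round-robin."""
    if copies > len(readers):
        raise ValueError("need at least as many readers as copies")
    assignments = {}
    for i, page in enumerate(pages):
        # Consecutive offsets modulo len(readers) are always distinct.
        assignments[page] = [readers[(i + k) % len(readers)]
                             for k in range(copies)]
    return assignments


def disagreements(text_a, text_b):
    """Return unified-diff lines where two proofread copies differ."""
    return list(difflib.unified_diff(
        text_a.splitlines(), text_b.splitlines(),
        fromfile="reader_a", tofile="reader_b", lineterm=""))


if __name__ == "__main__":
    plan = assign_batches([1, 2, 3], ["alice", "bob", "carol"])
    print(plan)
    # An empty diff means both readers agree; anything else needs a
    # third look before the page goes into the finished work.
    for line in disagreements("RT-11 manual", "RT-ll manual"):
        print(line)
```

Pages whose two copies diff cleanly can be merged as-is; only the pages with disagreements need another pass.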
Wouldn't this require that everyone have a copy of Acrobat Capture?
--
"The Direct3D Graphics Pipeline"-- code samples, sample chapter, FAQ:
<http://www.xmission.com/~legalize/book/>
Pilgrimage: Utah's annual demoparty
<http://pilgrimage.scene.org>