Needs a lot of work

Posted by: Anonymous Coward on October 01, 2006 05:19 PM
Tesseract in its present form is unusable. It needs to handle formatting correctly, and the source looks like it needs some serious maintenance to sort out 64-bit compatibility issues. It might be better just to cherry-pick the good parts, clean them up, and integrate them with gocr rather than try to make this usable.

