Posted by: Anonymous Coward on October 05, 2006 12:34 PM
You didn't name your default OCR "engine", but since this is the best option for FOSS apps, I'm willing to bet that your OCR app isn't FOSS. That means it costs a couple hundred bucks at a minimum, and maybe thousands.

As others have pointed out, Tesseract handles quite a few formats once they are converted. Thanks to the ease of the command line, of writing scripts, and of GNU/Linux in general, it may take an extra step or two, maybe not even that, to convert the target file, process it, and dump the result into a new file. With bash, perl, and a few other tricks, it's probably the same number of steps or fewer compared to your proprietary app, and much faster.
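For anyone who wants to see what that convert-then-process chain might look like, here is a rough sketch in Python. It is only an illustration under a few assumptions: that ImageMagick's `convert` is installed and does the format conversion (my choice of tool, nothing more), and that `tesseract` is called in its usual `tesseract <image> <output base>` form, which writes `<output base>.txt`. The function names `build_ocr_commands` and `run_ocr` are made up for this example.

```python
import subprocess
from pathlib import Path


def build_ocr_commands(source, workdir="."):
    """Build the two commands: convert the input to TIFF, then OCR the TIFF.

    Assumes ImageMagick's `convert` and the `tesseract` CLI are on PATH.
    """
    source = Path(source)
    tiff = Path(workdir) / (source.stem + ".tif")
    out_base = Path(workdir) / source.stem  # tesseract appends ".txt" itself
    convert_cmd = ["convert", str(source), str(tiff)]
    ocr_cmd = ["tesseract", str(tiff), str(out_base)]
    return convert_cmd, ocr_cmd


def run_ocr(source, workdir="."):
    """Run both steps and return the path of the resulting text file."""
    convert_cmd, ocr_cmd = build_ocr_commands(source, workdir)
    subprocess.run(convert_cmd, check=True)  # step 1: convert to TIFF
    subprocess.run(ocr_cmd, check=True)      # step 2: OCR into <base>.txt
    return Path(ocr_cmd[2] + ".txt")
```

Calling `run_ocr("scan.png")` would, under those assumptions, leave the recognized text in `scan.txt`. Wrap that in a loop over a directory and you have batch OCR in a handful of lines.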

A final point, especially for you naysayers. When an app turns out to be useful, when it can replace an expensive proprietary app, and when it can be folded into a project like KDE, whose developers like to incorporate everything useful for desktop computing, it generally receives, like many other apps, enough attention that it very quickly becomes much better and much more efficient than the proprietary apps that perform similar tasks.

You go on using your proprietary OCR app. And paying for the privilege. And paying for updates. And paying for bug fixes. And bothering with registration keys. And bothering with dongles and other prove-your-innocence tactics. And forced upgrades.

After all, without you, where would the proprietary industry be?

