Linux.com

Catdoc / antiword for search engine

Posted by: Anonymous Coward on August 09, 2006 10:39 PM
I've just been testing antiword and catdoc for Windows. They are used by <a href="http://www.phpdig.net/" title="phpdig.net">phpdig</a>, a search engine (*).

The latest version of antiword outputs [pic] instead of garbage when it finds an image in a Word file. The Win32 port of catdoc still keeps the garbage.

When dealing with enormous files, this improves performance a lot.

I've not tried the Linux binary.

(*) <a href="http://www.htdig.org/" title="htdig.org">ht://dig</a> also uses catdoc when digging Winword files.
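
For anyone wiring one of these converters into their own indexer, here is a rough Python sketch of the idea. It is not how phpdig or ht://dig actually do it. It assumes the converter (antiword or catdoc) is on the PATH and prints the converted text to standard output when given a file name. The function names `extract_text` and `strip_placeholders` are made up for illustration.

```python
import subprocess

# Marker that antiword prints in place of an embedded image (per the comment above).
PIC_PLACEHOLDER = "[pic]"


def strip_placeholders(text, placeholder=PIC_PLACEHOLDER):
    """Drop image placeholders and collapse whitespace so they are not indexed."""
    return " ".join(text.replace(placeholder, " ").split())


def extract_text(path, tool="antiword"):
    """Run the converter on a Word file and return whatever it writes to stdout.

    Assumption: `tool` is installed, on the PATH, and accepts the file name
    as its only argument.
    """
    result = subprocess.run([tool, path], capture_output=True, check=True)
    # Decode leniently: the output encoding depends on the tool's configuration.
    return result.stdout.decode("utf-8", errors="replace")
```

Typical use would be `strip_placeholders(extract_text("report.doc"))`, where `report.doc` is a hypothetical file name. Collapsing whitespace throws away the line layout, which is usually fine for a search index but not for display.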
