Catdoc / antiword for search engine

Posted by: Anonymous Coward on August 09, 2006 10:39 PM
I've just been testing antiword and catdoc for Windows. They are used by <a href="" title="">phpdig</a>, a search engine (*).

The latest version of antiword outputs [pic] instead of garbage when it finds an image in a Word file. The Win32 port of catdoc still outputs the garbage.

When dealing with enormous files, this improves performance a lot.

I haven't tried the Linux binary.

(*) <a href="" title="">ht://dig</a> also uses catdoc when digging Winword files.
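
For anyone wiring these tools into their own indexer, here is a rough sketch of the general idea: shell out to whichever converter is installed and read the text from its standard output. This is not how phpdig or ht://dig actually do it. The function name doc_to_text and the converter order are my own assumptions, and decoding the output as UTF-8 is a guess that depends on how the converter is configured.

```python
import shutil
import subprocess

# Assumed preference order: antiword first, catdoc as a fallback.
CONVERTERS = ("antiword", "catdoc")


def doc_to_text(path, converters=CONVERTERS, which=shutil.which, run=subprocess.run):
    """Return the text of a Word file from the first converter that works.

    Hypothetical helper for illustration. Both tools are invoked simply as
    `<tool> <file>` and are expected to write plain text to stdout.
    Returns None if no converter is installed or all of them fail.
    """
    for name in converters:
        exe = which(name)
        if exe is None:
            continue  # converter not on PATH, try the next one
        result = run([exe, path], capture_output=True)
        if result.returncode == 0:
            # Assumption: output is UTF-8; undecodable bytes are replaced.
            return result.stdout.decode("utf-8", errors="replace")
    return None
```

Calling doc_to_text("report.doc") would then give the indexer a string to tokenize, or None when neither tool is available. The which and run parameters only exist so the function can be exercised without the binaries installed.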

