Linux.com

Extracting PDF information

Posted by: Anonymous [ip: 192.89.6.224] on August 06, 2008 12:25 PM
Hi Martin,

Just a few comments on the tracker-extractor/beagle-extract-contents comparison (from a regular Tracker contributor):
1) The only useful metadata in the PDF is the page count and the title, and both programs extract them.
2) Tracker extracts the text contents of the PDF with a filter in a separate call (Beagle does it in the same program, but in the end we get the same information).
3) Tracker always calls the extractor with the mime-type. It should probably be a mandatory parameter, to avoid confusing users who run the extractor by hand.
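
To illustrate point 3, here is a rough sketch of how a caller could always supply the mime-type instead of leaving the extractor to guess. The executable name `tracker-extract`, the argument order (file first, then mime-type) and the helper `build_extract_command` are assumptions made for the example, not the documented interface; only Python's standard `mimetypes` module is used, and the command is built but not run.

```python
import mimetypes


def build_extract_command(path, extractor="tracker-extract"):
    """Build an argv list that always includes the file's mime-type.

    Assumption: the extractor accepts "<file> <mime-type>" as positional
    arguments. Check the man page of your installed version first.
    """
    # Guess the mime-type from the file extension only (no file access).
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        # Refuse to call the extractor without a mime-type.
        raise ValueError("cannot determine mime-type for %r" % path)
    return [extractor, path, mime_type]


if __name__ == "__main__":
    print(build_extract_command("report.pdf"))
```

The resulting list can be handed to `subprocess.run` once the real command-line syntax has been confirmed.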

Return to Desktop search comparison: Beagle vs. Tracker