Extracting PDF information

Posted by: Anonymous on August 06, 2008 12:25 PM
Hi Martin,

Just a few comments on the tracker-extractor/beagle-extract-contents comparison (from a regular Tracker contributor):
1) The only useful information in the PDF is the page count and the title, and both programs extract them.
2) Tracker extracts the contents of the PDF using a filter in a separate call (Beagle does it in the same program, but in the end we have the same information).
3) Tracker always calls the extractor with the mime-type. It should probably be a mandatory parameter, to avoid confusing users.
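
To illustrate point 3, here is a rough sketch of a caller that never invokes the extractor without a mime-type. It is only an illustration: the helper name build_extractor_command and the argument order (file path first, then mime-type) are assumptions made for this example, not the extractor's documented interface, and Tracker itself does not detect the type this way.

```python
import mimetypes


def build_extractor_command(path, extractor="tracker-extract"):
    """Return an argv list that always carries an explicit mime-type.

    Hypothetical helper: the argument order (path, then mime-type)
    is an assumption for illustration only.
    """
    # Guess the mime-type from the file extension using the standard library.
    mime_type, _encoding = mimetypes.guess_type(path)
    if mime_type is None:
        # Treat the mime-type as mandatory: fail instead of calling without it.
        raise ValueError(f"cannot determine mime-type for {path!r}")
    return [extractor, path, mime_type]


if __name__ == "__main__":
    # Only prints the command; it does not run the extractor.
    print(build_extractor_command("report.pdf"))
```

Failing early when the type is unknown is the design choice here: it makes the mime-type effectively mandatory on the caller's side, which is the behaviour suggested above.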


