Recoll can handle plain text, HTML, OpenOffice.org documents, Mozilla Thunderbird and Evolution email messages, and LyX and Scribus files. In addition to these native formats, Recoll can work with other file types through external helper applications. For example, Xpdf provides support for PDF files, while Word, PowerPoint, and Excel documents are handled by Antiword and catdoc. To enable support for document types that require external helpers, you have to install the helpers separately using your distro's package manager (a list of the required external helpers is available at Recoll's Web site).
Recoll stores all internal data in Unicode UTF-8 format, but it can index files with different character sets, encodings, and languages into the same index.
Since Recoll's Web site provides binary packages for most major Linux distributions -- such as Fedora, SUSE, Ubuntu, and Debian -- you can install it easily using your distro's package manager. You can then launch Recoll by choosing Recoll from the Applications -> Accessories menu (in Ubuntu) or running the recoll command in a terminal window.
During the first run, you will be prompted to create a default set of configuration files that contain all of Recoll's settings. Recoll doesn't provide a GUI configuration tool, so you have to edit the configuration files manually. Fortunately, Recoll's user manual describes the available configuration options in detail. Since the default settings cover the basics, you might not need to edit them at all.
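To give a feel for what editing the configuration by hand involves, here is a minimal sketch that renders the text of a small recoll.conf. The helper name render_recoll_conf and the example directories are hypothetical; topdirs and skippedNames are parameter names as described in Recoll's user manual, and the file typically lives at ~/.recoll/recoll.conf, but check the manual for your version before relying on either.

```python
def render_recoll_conf(topdirs, skipped_names=()):
    """Return the text of a minimal recoll.conf (illustrative only)."""
    lines = [
        "# Directories to index (space-separated)",
        "topdirs = " + " ".join(topdirs),
    ]
    if skipped_names:
        lines += [
            "# File name patterns the indexer should skip",
            "skippedNames = " + " ".join(skipped_names),
        ]
    return "\n".join(lines) + "\n"


# Example values are assumptions, not Recoll defaults.
conf_text = render_recoll_conf(["~/Documents", "~/Mail"], ["*.tmp", "*~"])
print(conf_text)
```

The printed text can be pasted into the configuration file with any editor; the sketch deliberately does not write to disk, so it cannot overwrite an existing configuration.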
Like any desktop search engine, Recoll must index documents before it can search them. By default, Recoll indexes the files in your home directory, but you can specify a different location or add further ones. During the first run, Recoll performs a full indexing pass, which can take some time. Once Recoll has built an index, you can update it manually using the recollindex command. You can also run recollindex as a cron job. Alternatively, you can run the recollindex -m command, which starts a daemon that indexes modified files in real time.
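The three ways of keeping the index current can be sketched as a small wrapper script. This is a minimal illustration, not part of Recoll: the function names and the nightly 03:00 schedule are assumptions, and only the recollindex command and its -m option come from the text above.

```python
import shutil
import subprocess


def indexer_command(monitor=False):
    """Build the recollindex command line; -m requests real-time monitoring."""
    cmd = ["recollindex"]
    if monitor:
        cmd.append("-m")
    return cmd


def cron_line(hour, minute=0):
    """Return a crontab entry that runs a plain index update once a day."""
    return f"{minute} {hour} * * * recollindex"


def run_indexer(monitor=False):
    """Run the indexer if it is installed; return False when it is missing."""
    cmd = indexer_command(monitor)
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


# Hypothetical schedule: update the index every night at 03:00.
print(cron_line(3))
```

Calling run_indexer() performs a one-off update, while run_indexer(monitor=True) starts the monitoring daemon. The line printed by cron_line can be added to your crontab with crontab -e.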
[Figure: Recoll results]
The query from:"tristram shandy" linux AND openoffice -windows finds documents that contain the phrase "tristram shandy" in the from field (useful when searching email messages) and the words "linux" and "openoffice", but not the word "windows".
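The meaning of that query can be modeled with a few lines of code. The sketch below is a toy illustration of the matching rules only; it is not how Recoll evaluates queries internally, and the Message class, the matches function, and the sample messages are all made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Message:
    name: str
    sender: str
    body: str


def matches(msg, from_phrase, required, excluded):
    """Toy model of: from:"<phrase>" term AND term -term (case-insensitive)."""
    words = set(re.findall(r"\w+", msg.body.lower()))
    return (
        from_phrase.lower() in msg.sender.lower()        # phrase in the from field
        and all(t.lower() in words for t in required)    # every required word present
        and not any(t.lower() in words for t in excluded)  # no excluded word present
    )


messages = [
    Message("a.eml", "Tristram Shandy <ts@example.org>", "Running OpenOffice on Linux."),
    Message("b.eml", "Tristram Shandy <ts@example.org>", "OpenOffice on Linux and Windows."),
    Message("c.eml", "Walter Shandy <ws@example.org>", "Linux and OpenOffice tips."),
]

hits = [m.name for m in messages
        if matches(m, "tristram shandy", ["linux", "openoffice"], ["windows"])]
print(hits)
```

Only the first message qualifies: the second mentions "windows", and the third comes from a different sender.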
The Advanced Search feature lets you build more complex queries. The default fields (called Clauses) allow you to specify a wide range of criteria, such as proximity, any number of search terms (you can add extra fields by pressing the Add clause button), excluded words, and wildcards. You can also narrow your search to specific file types or a specific directory.
When you perform a search, Recoll displays the results in the main window. Each search result contains a file type icon, relevance in %, and context surrounding the search term. There are also two links: the Preview link allows you to quickly preview the document in a separate window, while the Edit link opens the file for editing in an appropriate application.
Finally, Recoll also features a Term Explorer tool (Tools -> Term Explorer) that can come in handy when you don't remember the exact spelling of a particular search term. Basically, it acts as a mini search engine that searches the index. This allows you to see all the derivatives of the entered search terms and select the one you need.
Recoll looks deceptively simple, but it is a powerful desktop search engine. To get the most out of it, make sure to read Recoll's user manual, paying particular attention to the tips and tricks section.
Dmitri Popov is a freelance writer whose articles have appeared in Russian, British, US, German, and Danish computer magazines.
addenda
Posted by: Anonymous Coward on April 24, 2007 02:45 AM
Recoll is included in ALT Linux: no need to download anything by hand, just apt-get install recoll
--
Michael Shigorin