Chapter 4. Search

Everything you've learned so far about creating indexes is pretty useless if you don't know how to use those indexes to find what you are looking for. After all, that's what Ferret is for. This chapter covers everything you need to know about searching in Ferret. We'll start with the basic search classes, followed by the various types of query. We'll then talk about the query parser and Ferret's own query language, FQL. We'll then cover some more advanced topics such as sorting, filtering, and highlighting.

Overview of Searching Classes

Ferret's search API is about as simple as its indexing API. In fact, if you are using the Index class, all you have to know is the search_each() method and a little bit of Ferret's query language and you are set. However, if you take the time to learn the rest of the search API, you'll discover a wealth of opportunities you didn't even know existed.

The search API consists of the following classes:

IndexSearcher
Query
QueryParser
Filter
Sort

IndexSearcher

IndexSearcher, as the name would suggest, is used to search indexes. You can also use it to highlight and explain query results and read documents from the index (as you would with IndexReader). To create an IndexSearcher, you need to supply it with an IndexReader:

# Open a reader on the index, then wrap it in a searcher
reader = IndexReader.new("path/to/index")
searcher = Searcher.new(reader)

As usual, you can shortcut this by supplying it with a Directory or a filesystem path to the index:

searcher = Searcher.new("path/to/index")

Ferret's MultiSearcher Class

Ferret also offers a MultiSearcher; however, its use is not recommended because you can do everything the MultiSearcher can do with an IndexSearcher by supplying it with a MultiReader:

# Open one reader per index
readers = []
readers << IndexReader.new("path/to/index1")
readers << IndexReader.new("path/to/index2")
readers << IndexReader.new("path/to/index3")
# Passing an array of readers gives a single reader over all three indexes
multi_reader = IndexReader.new(readers)
searcher = Searcher.new(multi_reader)

Query

Ferret contains more than 15 different types of query, each of which you'll learn about later in this chapter. Basically, queries are built and combined to specify what exactly it is you are looking for. You can then pass them to the IndexSearcher so it will retrieve your result set. Queries are the fundamental building block of the search API.
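To make the idea of building and combining queries concrete, here is a small plain-Ruby toy model. It is only a sketch: ToyTermQuery and ToyBooleanQuery are hypothetical stand-ins invented for this illustration, not Ferret's query classes, and the documents are made-up hashes rather than an index.

```ruby
# Toy model only: these classes are hypothetical stand-ins, not Ferret's query classes.
class ToyTermQuery
  def initialize(field, term)
    @field = field
    @term = term
  end

  # True if the field's text contains the term as a whole word (case-insensitive).
  def match?(doc)
    doc.fetch(@field, "").downcase.split(/\W+/).include?(@term)
  end
end

class ToyBooleanQuery
  def initialize
    @must = []
    @should = []
  end

  # occur is :must or :should; returns self so calls can be chained.
  def add(query, occur)
    (occur == :must ? @must : @should) << query
    self
  end

  # Every :must clause has to match. With no :must clauses,
  # at least one :should clause has to match instead.
  def match?(doc)
    return false unless @must.all? { |q| q.match?(doc) }
    @must.any? || @should.any? { |q| q.match?(doc) }
  end
end

# Made-up documents standing in for indexed blog articles.
docs = [
  { title: "Ruby and Ferret", content: "Full-text search in Ruby." },
  { title: "Ruby news", content: "Nothing about search engines." },
]

# Combine two term queries: both words must appear in the title.
in_title = ToyBooleanQuery.new
in_title.add(ToyTermQuery.new(:title, "ruby"), :must)
in_title.add(ToyTermQuery.new(:title, "ferret"), :must)

matches = docs.select { |doc| in_title.match?(doc) }
```

Only the first document ends up in matches, because the second title lacks one of the required words. A real searcher works against an index and scores its hits, but the way small queries nest inside larger ones is the same idea.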

QueryParser

With more than 15 different types of query (each with its own API), it can get quite tedious to build them by hand. Succinct as Ruby code is, it is much easier to build queries using a simple query language, not to mention the fact that you wouldn't want users to have to type Ruby code into your search box. For example, let's say we wanted to search for all articles in a blog that have the words "ruby" and "ferret" in either the title field or the content field. You could use the QueryParser:

query = query_parser.parse("title|content:(ruby AND ferret)")

Or you could build the query yourself. The QueryParser is the magic behind the Index class that makes it so easy to use.

Filter

You can already specify exactly which documents you want to find using the various Query classes, so you might be wondering what you need a Filter class for. Filters have a few purposes. First, Filters actually cache their results, so if you have a particular query that is run over and over again, you might want to convert it to a Filter to improve performance. Second, Filters can be used to apply common constraints to Queries. For example, to restrict a user's search to only published articles, you would use a Filter. Or you might let users filter their own searches with a drop-down menu of common filters, perhaps a filter that restricts search results to the last month or the last seven days. This is particularly useful because most users won't know the range query syntax. You'll learn more about when to use a filter and how to create your own filter later in this chapter.
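The caching behavior described above can be sketched in plain Ruby. This is a toy model under stated assumptions, not Ferret's Filter implementation: CachingFilter, the search helper, and the sample documents are all hypothetical names made up for the illustration. The point is that the set of allowed documents is computed once and then reused by every search that applies the filter.

```ruby
require 'set'

# Toy in-memory "index": document id => stored fields (made-up data).
docs = {
  0 => { title: "Ruby search with Ferret", published: true },
  1 => { title: "Draft notes on Ferret", published: false },
  2 => { title: "Ruby on the web", published: true },
}

# Hypothetical filter: computes the set of allowed ids once, then reuses it.
class CachingFilter
  attr_reader :computations

  def initialize(&predicate)
    @predicate = predicate
    @cached_ids = nil
    @computations = 0
  end

  def allowed_ids(docs)
    @cached_ids ||= begin
      @computations += 1
      docs.select { |_id, doc| @predicate.call(doc) }.keys.to_set
    end
  end
end

# Toy search: case-insensitive whole-word match on the title, optionally filtered.
def search(docs, word, filter = nil)
  hits = docs.select { |_id, doc| doc[:title].downcase.split.include?(word.downcase) }.keys
  filter ? hits.select { |id| filter.allowed_ids(docs).include?(id) } : hits
end

published_only = CachingFilter.new { |doc| doc[:published] }

# Two different searches share the same filter and its cached id set.
ferret_hits = search(docs, "ferret", published_only)
ruby_hits = search(docs, "ruby", published_only)
```

After both searches, published_only.computations is still 1, since the second search reused the cached set. The toy cache never expires, so it would go stale if the documents changed; a real filter has to tie its cache to the state of the index it was computed against.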

Sort

By default, the IndexSearcher returns your query results in order of relevance. If you want to sort the results in any other way, you are going to have to use the Sort class. As discussed earlier in Chapter 2, you can currently sort by Integer, Float, and String. We'll cover this in more detail in the "Sorting Search Results" section later in this chapter.
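As a rough illustration of the difference between relevance order and field order, the following plain-Ruby sketch sorts a handful of made-up hits. It does not use Ferret's Sort class; the Hit struct, the scores, and the by_relevance and by_field helpers are assumptions invented for this example.

```ruby
# Hypothetical hits: each carries a relevance score plus two stored fields.
Hit = Struct.new(:id, :score, :title, :year)

hits = [
  Hit.new(0, 0.42, "Ferret basics", 2007),
  Hit.new(1, 0.91, "Advanced search", 2005),
  Hit.new(2, 0.67, "Custom filters", 2008),
]

# Default ordering: most relevant (highest score) first.
def by_relevance(hits)
  hits.sort_by { |hit| -hit.score }
end

# Ordering by a stored field: numbers compare numerically, strings lexically.
def by_field(hits, field, reverse: false)
  sorted = hits.sort_by { |hit| hit[field] }
  reverse ? sorted.reverse : sorted
end

relevance_order = by_relevance(hits).map(&:id)               # ids, best score first
oldest_first    = by_field(hits, :year).map(&:id)            # integer sort
alphabetical    = by_field(hits, :title).map(&:id)           # string sort
newest_first    = by_field(hits, :year, reverse: true).map(&:id)
```

Sorting by a field ignores the scores entirely, which is why the type of the field matters: the same values sorted as strings and as integers can come out in different orders.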


Ferret by David Balmain