Chapter 4. Search

Everything you’ve learned so far about creating indexes is of little use if you don’t know how to use those indexes to find what you are looking for. After all, that’s what Ferret is for. This chapter covers everything you need to know about searching in Ferret. We’ll start with the basic search classes, followed by the various types of query. Next comes the query parser and Ferret’s own query language, FQL. Finally, we’ll cover some more advanced topics such as sorting, filtering, and highlighting.

Overview of Searching Classes

Ferret’s search API is about as simple as its indexing API. In fact, if you are using the Index class, all you have to know is the search_each() method and a little bit of Ferret’s query language and you are set. However, if you take the time to learn the rest of the search API, you’ll discover a wealth of opportunities you didn’t even know existed.

The search API consists of the following classes:

  • IndexSearcher

  • Query

  • QueryParser

  • Filter

  • Sort

IndexSearcher

IndexSearcher, as the name suggests, is used to search indexes. You can also use it to highlight and explain query results and to read documents from the index (as you would with IndexReader). In code, the class is named Searcher. To create one, you need to supply it with an IndexReader:

reader = IndexReader.new("path/to/index")
searcher = Searcher.new(reader)

As usual, you can shortcut this by supplying it with a Directory or a filesystem path to the index:

searcher = Searcher.new("path/to/index")

Query

Ferret contains more than 15 different types of query, each of which you’ll learn about later in this chapter. Basically, queries are built and combined to specify exactly what you are looking for. You can then pass them to the IndexSearcher, which retrieves your result set. Queries are the fundamental building blocks of the search API.

QueryParser

With more than 15 different types of query (each with its own API), it can get quite tedious to build them by hand. Succinct as Ruby code is, it is much easier to build queries using a simple query language, not to mention that you wouldn’t want users to have to type Ruby code into your search box. For example, let’s say you want to search for all articles in a blog that have the words “ruby” and “ferret” in either the title field or the content field. You could use the QueryParser:

query = query_parser.parse("title|content:(ruby AND ferret)")

Or you could build the same query yourself from the individual Query classes, which takes more code. The QueryParser is the magic behind the Index class that makes it so easy to use.
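To give a feel for what building such a query by hand involves, here is a minimal sketch in plain Ruby. It does not use Ferret's query classes: ToyTermQuery and ToyBooleanQuery are hypothetical stand-ins invented for this illustration, and the grouping shown (each word must appear in the title or in the content) is an assumed reading of the query string above, not a statement of how the QueryParser expands it.

```ruby
# Illustrative stand-ins only; these are NOT Ferret's query classes.
ToyTermQuery = Struct.new(:field, :term) do
  # True when the given field of the document contains the term as a word.
  def matches?(doc)
    doc.fetch(field, "").downcase.split(/\W+/).include?(term)
  end
end

ToyBooleanQuery = Struct.new(:operator, :clauses) do
  # Combines sub-queries with :and (all must match) or :or (any may match).
  def matches?(doc)
    case operator
    when :and then clauses.all? { |q| q.matches?(doc) }
    when :or  then clauses.any? { |q| q.matches?(doc) }
    end
  end
end

# Assumed reading: a word may occur in either the title or the content.
def word_in_either_field(word)
  ToyBooleanQuery.new(:or, [
    ToyTermQuery.new(:title, word),
    ToyTermQuery.new(:content, word)
  ])
end

# Both words are required, so the two sub-queries are joined with :and.
query = ToyBooleanQuery.new(:and, [
  word_in_either_field("ruby"),
  word_in_either_field("ferret")
])

docs = [
  { title: "Ruby search", content: "Indexing with Ferret" },
  { title: "Ruby basics", content: "Blocks and iterators" }
]

matches = docs.select { |doc| query.matches?(doc) }
puts matches.map { |doc| doc[:title] }.inspect
```

Even in this toy form, the hand-built tree needs six objects to say what the query string says in one line, which is the tedium a query parser removes.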

Filter

You can already specify exactly which documents you want to find using the various Query classes, so you might be wondering what you need a Filter class for. Filters have a few purposes. First, Filters actually cache their results, so if you have a particular query that is run over and over again, you might want to convert it to a Filter to improve performance. Second, Filters can be used to apply common constraints to Queries. For example, to restrict a user’s search to only published articles, you would use a Filter. Or you might let users filter their own searches with a drop-down menu of common filters, perhaps a filter that restricts search results to the last month or the last seven days. This is particularly useful because most users won’t know the range query syntax. You’ll learn more about when to use a filter and how to create your own filter later in this chapter.
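The caching idea can be sketched in a few lines of plain Ruby. This is only an illustration of the concept under simple assumptions: ToyCachingFilter is a hypothetical class, the "index" is an ordinary array of hashes, and nothing here reflects how Ferret's Filter class is implemented.

```ruby
# Hypothetical stand-in for illustration; NOT Ferret's Filter class.
class ToyCachingFilter
  attr_reader :computations

  def initialize(&predicate)
    @predicate = predicate
    # Cache keyed by the identity of the document collection.
    @cache = {}.compare_by_identity
    @computations = 0
  end

  # Returns the positions of documents that pass the filter.
  # The scan runs once per collection; later calls reuse the result.
  def allowed_ids(docs)
    @cache[docs] ||= begin
      @computations += 1
      docs.each_index.select { |i| @predicate.call(docs[i]) }
    end
  end
end

articles = [
  { title: "Ferret intro", state: "published" },
  { title: "Draft notes",  state: "draft" },
  { title: "Ferret tips",  state: "published" }
]

# A common constraint: only published articles are searchable.
published = ToyCachingFilter.new { |doc| doc[:state] == "published" }

# Two different searches share the same cached filter result.
hits_a = published.allowed_ids(articles).select { |i| articles[i][:title].include?("Ferret") }
hits_b = published.allowed_ids(articles).select { |i| articles[i][:title].include?("tips") }

puts hits_a.inspect           # => [0, 2]
puts hits_b.inspect           # => [2]
puts published.computations   # => 1
```

The filter scans the collection once, however many searches are run against it afterward, which is why a constraint that rarely changes is a good candidate for a filter.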

Sort

By default, the IndexSearcher returns your query results in order of relevance. If you want to sort the results in any other way, you need to use the Sort class. As discussed earlier in Chapter 2, you can currently sort by Integer, Float, and String. We’ll cover this in more detail in the “Sorting Search Results” section later in this chapter.
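The difference between relevance order and field order can be shown with a plain Ruby sketch. ToyHit and its score, year, and title values are made up for illustration, and the sorting uses Ruby's own sort_by rather than Ferret's Sort class.

```ruby
# Hypothetical result records; the scores and fields are invented.
ToyHit = Struct.new(:title, :score, :year)

hits = [
  ToyHit.new("Ferret in Action", 0.42, 2004),
  ToyHit.new("Advanced Ruby",    0.91, 2008),
  ToyHit.new("Search Basics",    0.67, 2006)
]

# Default behavior: the highest relevance score comes first.
by_relevance = hits.sort_by { |hit| -hit.score }

# Sorting on an Integer field and on a String field instead.
by_year  = hits.sort_by(&:year)
by_title = hits.sort_by(&:title)

puts by_relevance.map(&:title).inspect
puts by_year.map(&:title).inspect
puts by_title.map(&:title).inspect
```

Each ordering puts a different title first, which is why the choice of sort matters as much to users as the query itself.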
