Setting Up the Index

We now know that the index is made up of documents, which, in turn, are made up of fields. What happens to the fields when they are added to the index? Are they all treated equally? By default, the answer is yes: each field is created with the same properties. But this is not always desirable. For example, Ferret stores all the text in all the fields unmodified by default. If you are indexing data from a database, this may not be necessary. Since you are already storing the data in the database, it is often pointless to store it in the Ferret index as well.[2]

FieldInfo

Each field in a Ferret index has its properties defined in a Ferret::Index::FieldInfo object. FieldInfo objects are immutable and have the following properties:

  • name
  • boost
  • stored?
  • compressed?
  • indexed?
  • tokenized?
  • omit_norms?
  • store_term_vectors?
  • store_positions?
  • store_offsets?

The FieldInfo#name property is a symbol used to match the FieldInfo object with a field in a document. FieldInfo#boost is the default boost that is given to each instance of the field when it is added to the index. This is where, for example, you would boost the :title field if you wanted it to have more weight in the search results than the :content field. The default value for #boost is 1.0.
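To make the name-matching and default-boost behavior concrete, here is a minimal plain-Ruby sketch. It does not use Ferret at all: SketchFieldInfo, FIELD_INFOS, and boosts_for are hypothetical names invented for this illustration, and the boost of 10.0 is an arbitrary example value. It only models the idea that a symbol name ties a field to its immutable settings and that the boost falls back to 1.0.

```ruby
# Hypothetical stand-in for illustration; this is NOT Ferret's FieldInfo class.
SketchFieldInfo = Struct.new(:name, :boost) do
  def initialize(name, boost = 1.0) # 1.0 mirrors the default boost described above
    super(name.to_sym, Float(boost))
    freeze                          # immutable: setters raise after construction
  end
end

# One entry per field, keyed by the symbol that names the field.
FIELD_INFOS = {
  title:   SketchFieldInfo.new(:title, 10.0), # arbitrary example boost
  content: SketchFieldInfo.new(:content)      # falls back to the default boost
}.freeze

# Look up the boost that applies to each field of a document by its name.
def boosts_for(document)
  document.keys.map { |field| [field, FIELD_INFOS.fetch(field).boost] }.to_h
end

document = { title: "Ferret", content: "A search library for Ruby" }
p boosts_for(document)
```

In this sketch the :title field carries ten times the weight of :content, which is the kind of relationship you would set up if title matches should rank higher in search results.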

The rest of these properties can be divided into three groups: store, index, and term_vector. These group names are also the remaining parameters you can pass when instantiating a new FieldInfo object. For example:

field_info = FieldInfo.new(:title,                 
                           :default_boost ...
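One way to picture the three groups mentioned above is as a lookup table from group name to the properties it covers. The grouping below is an assumption inferred from the property names in the list, not something taken from Ferret's source, and PROPERTY_GROUPS and group_of are hypothetical names used only for this sketch.

```ruby
# Assumed grouping, inferred from the property names; not taken from Ferret's source.
PROPERTY_GROUPS = {
  store:       %i[stored? compressed?],
  index:       %i[indexed? tokenized? omit_norms?],
  term_vector: %i[store_term_vectors? store_positions? store_offsets?]
}.freeze

# Reverse lookup: which group does a given property belong to?
# Returns nil for properties outside the three groups (such as :name or :boost).
def group_of(property)
  PROPERTY_GROUPS.find { |_group, props| props.include?(property) }&.first
end

p group_of(:tokenized?)
```

Together the three groups account for the eight properties that remain once name and boost are set aside.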
