Chapter 3. Basic Indexing

The preceding chapter should have given you an idea of how Sphinx works in general, how you install it, and how you create simple indexes. But there’s much more to indexing and searching. This chapter covers “basic” indexing concepts and techniques that you need to know and use on a daily basis (those days when you’re actually working with Sphinx, of course).

Indexing SQL Data

Fetching data to index usually takes more than a single SELECT * kind of query, and Sphinx has a number of features to support that complexity. In real-world environments, you likely need to perform certain maintenance SQL actions at different indexing stages. For performance reasons, on databases that seem to be growing by orders of magnitude these days, you will also want to avoid selecting everything in one go and instead divide and conquer. Sphinx SQL sources provide the following kinds of queries to let you do that:

  • Main data-fetching query (the only one you are required to have)

  • Pre-queries (run before the main query)

  • Post-queries (run after the main query)

  • Post-index queries (run on indexing completion)

  • Ranged queries (a mechanism to run multiple parameterized main queries)
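
To make the divide-and-conquer idea behind ranged queries concrete, here is a minimal Python sketch that runs one parameterized main query per window of document IDs against an in-memory SQLite database. It illustrates the concept only and is not how Sphinx itself is implemented. The ranged_fetch helper, the documents table, and the step size of 4 are all hypothetical names and values chosen for this example; the $start and $end placeholders mirror the macros Sphinx substitutes into a ranged main query.

```python
import sqlite3


def ranged_fetch(conn, range_query, main_query, step):
    """Yield rows by running main_query once per [start, end] window of IDs.

    Hypothetical helper: a conceptual stand-in for ranged fetching,
    not Sphinx's actual code.
    """
    # The range query returns the smallest and largest document IDs.
    min_id, max_id = conn.execute(range_query).fetchone()
    if min_id is None:  # empty table: nothing to fetch
        return
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        # Substitute the window bounds for the $start and $end placeholders.
        sql = main_query.replace("$start", str(start)).replace("$end", str(end))
        yield from conn.execute(sql)
        start = end + 1


# Hypothetical sample data: ten documents with IDs 1 through 10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)",
    [(i, f"doc {i}") for i in range(1, 11)],
)

rows = list(
    ranged_fetch(
        conn,
        range_query="SELECT MIN(id), MAX(id) FROM documents",
        main_query="SELECT id, title FROM documents WHERE id BETWEEN $start AND $end",
        step=4,
    )
)
print(len(rows))
```

With a step of 4, the sketch issues three smaller queries (IDs 1 to 4, 5 to 8, and 9 to 10) instead of one query that selects everything at once, which is the point of splitting the fetch into ranges.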

Main Fetch Query

Every SQL data source should be associated with an sql_query directive, which runs the main data-fetching query and indexes the database rows it returns. The first column in the query is always interpreted as a document ID, and other columns are interpreted ...
