Basic Search System Anatomy

Common search systems come in two variants. In the simpler model, users express their information need as a query that they enter in a search interface. They may do so using natural language or a specialized search language (e.g., Boolean operators such as AND, OR, or NOT). They may also use query builders, such as thesauri or stemming tools, to enhance or expand their queries.
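As a rough illustration of how a query builder might expand a query, the sketch below applies a crude stemming step and then adds synonyms from a small thesaurus. The thesaurus contents, the suffix-stripping rule, and the function names (`naive_stem`, `expand_query`) are all assumptions made for this example; they are not taken from any particular search product.

```python
# Hypothetical toy thesaurus; a real one would be curated for the site's domain.
THESAURUS = {
    "car": {"automobile", "vehicle"},
    "film": {"movie"},
}


def naive_stem(term: str) -> str:
    """Strip a trailing plural 's' (a crude stand-in for a real stemmer)."""
    if len(term) > 3 and term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    return term


def expand_query(query: str) -> set[str]:
    """Return the stemmed query terms plus any thesaurus synonyms."""
    expanded = set()
    for raw in query.lower().split():
        term = naive_stem(raw)
        expanded.add(term)
        # Add synonyms, if the thesaurus has an entry for the stemmed term.
        expanded |= THESAURUS.get(term, set())
    return expanded
```

With these assumptions, a query for "Cars" would also match documents that mention "automobile" or "vehicle", which is the kind of enhancement the paragraph above describes.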

Queries are then “matched” against an index that represents the site’s content. Most indexes consist of lists of all terms found in all of the site’s documents, while others may list document titles, authors, categories, and associated information. Indexes associate each term with its locations within the site’s documents. When the query’s terms are compared with the index’s terms, a set of matching documents is identified. The results are then sorted for presentation in some way that, ideally, ensures that the most relevant documents are ranked first.
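The matching step described above can be sketched with a small inverted index. This is a minimal sketch under stated assumptions: the sample documents are invented, queries are treated as an implicit AND of their terms, and "relevance" is approximated by counting how often the query terms occur, which is far simpler than what production search engines do.

```python
from collections import defaultdict

# Hypothetical documents standing in for a site's pages.
DOCS = {
    "doc1": "search systems match queries against an index",
    "doc2": "an index lists terms and their locations",
    "doc3": "users enter queries in a search interface",
}


def build_index(docs):
    """Map each term to {doc_id: [word positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search(index, query):
    """Return ids of documents containing every query term (implicit AND),
    ordered by total occurrences of the query terms, most first."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set(index.get(terms[0], {}))
    for term in terms[1:]:
        matches &= set(index.get(term, {}))

    def score(doc_id):
        return sum(len(index[t][doc_id]) for t in terms)

    # Ties are broken by document id so the ordering is deterministic.
    return sorted(matches, key=lambda d: (-score(d), d))
```

Here `build_index(DOCS)` plays the role of the index that associates each term with its locations, and `search` compares the query's terms with the index's terms to identify and sort the matching documents.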

The second variant, shown in Figure 8-1, is increasingly common: records containing metadata are created to represent each document. Both records and documents can be stored in databases, such as content management systems. The records include descriptive, administrative, and other metadata that may explain what the document is about, who is responsible for it, and how it should be maintained. Because queries are executed against indexes drawn from these records’ fields rather than from the full text alone, the results are more likely to be useful.
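To make the record-based variant concrete, the following sketch builds one small index per metadata field and answers queries that name the field to search. The record layout, field names (`title`, `author`, `subject`), and sample values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical metadata records; field names and values are illustrative only.
RECORDS = [
    {"id": "doc1", "title": "Annual Report", "author": "Lee", "subject": "finance"},
    {"id": "doc2", "title": "Style Guide", "author": "Kim", "subject": "editorial"},
    {"id": "doc3", "title": "Budget Report", "author": "Kim", "subject": "finance"},
]


def build_field_indexes(records, fields):
    """Build one index per field, mapping each term to the ids of matching records."""
    indexes = {field: defaultdict(set) for field in fields}
    for record in records:
        for field in fields:
            for term in record[field].lower().split():
                indexes[field][term].add(record["id"])
    return indexes


def fielded_search(indexes, **criteria):
    """Return sorted ids of records that satisfy every field=term criterion."""
    result = None
    for field, term in criteria.items():
        ids = indexes[field].get(term.lower(), set())
        result = ids if result is None else result & ids
    return sorted(result or [])
```

For example, after `indexes = build_field_indexes(RECORDS, ["title", "author", "subject"])`, a call such as `fielded_search(indexes, author="Kim", subject="finance")` restricts each term to the field it belongs to, which is one plausible reason results drawn from metadata fields tend to be more precise than a match anywhere in the text.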

Figure 8-1. The ...

