13.4. Thesaurus File Configuration

Consider a company that produces a widget. The company Web site refers to the widget only by its trademarked name, DooDad. Because the rest of the world thinks of the product as a widget, that is the term visitors will search for, and a search for "widget" returns no results. The company wants to map the search term widget to the trademarked term DooDad. This is the purpose of the thesaurus file.

13.4.1. Thesaurus Files

Thesaurus files are used by the search engine to tailor the query for specific languages. The files are associated with the SSP Application ID in the following folder:

C:\Program Files\Microsoft Office Servers\12.0\Data\Office Server\Applications\<Application ID (GUID)>\Config

Thesaurus files have a specific naming convention and XML format. The naming format is Ts<lang id>.xml. The U.S. English file is Tsenu.xml. The neutral English thesaurus file is Tsneu.xml. This file is a good choice if you are not creating a multilingual site, as it has a global impact on all English-language queries.

The default neutral English thesaurus file is shown in Example 13-7.

Example 13-7. Default neutral English thesaurus file
<xml id="Microsoft Search Thesaurus">
  <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
    <expansion>
      <sub>Internet Explorer</sub>
      <sub>IE</sub>
      <sub>IE5</sub>
    </expansion>
    <replacement>
      <pat>NT5</pat>
      <pat>W2K</pat>
      <sub>Windows 2000</sub>
    </replacement>
    <expansion>
      <sub>run</sub>
      ...
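
Applied to the DooDad scenario from the start of this section, the same format might look like the following. This is a minimal sketch rather than a file shipped with the product: it assumes the entry is placed in the neutral English file (Tsneu.xml) and that a replacement set, modeled on the NT5/W2K entry in the default file, is the desired behavior. The widget and DooDad terms come from the scenario; everything else is copied from the structure of the default file.

```xml
<xml id="Microsoft Search Thesaurus">
  <thesaurus xmlns="x-schema:tsSchema.xml">
    <diacritics_sensitive>0</diacritics_sensitive>
    <!-- Assumed entry for the scenario: a query for the generic term
         is replaced with the trademarked term used on the site. -->
    <replacement>
      <pat>widget</pat>
      <sub>DooDad</sub>
    </replacement>
  </thesaurus>
</xml>
```

If queries for either term should match content containing either term, an expansion set listing both widget and DooDad as sub entries, like the Internet Explorer entry in the default file, may be the better fit.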
