21.4. Noise Words

Many thousands of words are in use across different languages (Full-Text supports more than just U.S. English!). Most languages have certain words that appear over and over again with little intrinsic meaning. In English, for example, pronouns (you, she, he, etc.), articles (the, a, an), and conjunctions (and, but, or) are just a few examples of words that appear in a great many sentences but are not integral to the meaning of those sentences.

If SQL Server paid attention to those words, and we ran searches based on them, we would drown in the results that SQL Server gave us; quite often, every single row in the table would be returned! The solution comes in the form of what is called a noise word list. This is a list of words that SQL Server ignores when considering matches.
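To make the idea concrete, here is a minimal Python sketch of what ignoring noise words means for a search phrase. It is not how SQL Server implements Full-Text search internally; the NOISE_WORDS set and the strip_noise_words function are hypothetical names invented for this illustration, and the word list is a tiny sample rather than the real one.

```python
# Hypothetical, deliberately tiny noise word list for illustration only.
# The list that ships with SQL Server is much longer.
NOISE_WORDS = frozenset({"the", "a", "an", "and", "but", "or", "he", "she", "you"})


def strip_noise_words(query, noise_words=NOISE_WORDS):
    """Return the terms of `query`, lowercased, with noise words dropped."""
    terms = query.lower().split()
    return [term for term in terms if term not in noise_words]


# Only the meaningful terms are left to match against.
print(strip_noise_words("the tractor and the trailer"))  # ['tractor', 'trailer']
```

A phrase made up entirely of noise words comes back empty, which is the situation the list is meant to catch: such a search has nothing meaningful to match on.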

The noise word list is, by default, stored in a text file in the path:

Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\FTData

For U.S. English, the name of the file is noiseENU.txt. Other noise files can be found in the same subdirectory to support several other languages.
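If you want to inspect a noise file from a script, something along the following lines may help. This is a sketch under assumptions: it assumes the file is plain text with words separated by spaces or line breaks, and parse_noise_words is a hypothetical helper name, not part of any SQL Server tooling. The sample text is made up for the example; check the actual contents and encoding of the file on your own installation before relying on it.

```python
def parse_noise_words(text):
    """Split whitespace-separated text into a set of lowercased words.

    Assumes words are separated by spaces and/or line breaks.
    """
    return {word.lower() for word in text.split()}


# Made-up sample standing in for the contents of a noise file.
sample = "about\nafter all\nalso an and\n"
words = parse_noise_words(sample)
print(sorted(words))  # ['about', 'after', 'all', 'also', 'an', 'and']
```

To use it on a real file, read the file's text first (for example with open(path).read(), passing whatever encoding the file actually uses) and hand the result to parse_noise_words.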

You can add and delete words from this list to suit the particular needs of your application. For example, if you are in the business of selling tractor-trailer rigs, then you might want to add words like hauling to your noise word list. More than likely, a huge percentage of your customers have that word in their name, so it is relatively unhelpful in searches. To do this, you ...
