III.2.4. Client Statistics: Helping the Optimizer Do Its Job

The Optimizer uses statistics to help it create estimated query plans. Statistics within SQL Server work the same way as they do in other applications.

For example, consider an orange buyer purchasing a truckload of oranges. The price of oranges varies depending on how sweet or how acidic the oranges are, so the first thing she needs to do is determine how sweet the oranges are. She can do this in one of two ways:

  • Take a sample from every orange on the truck and measure the sweetness and acidity.

  • Remove a statistical sampling of oranges and measure the sweetness and acidity of the sample.

Bet that the latter choice is picked. By calculating the average and deviation of the sample of oranges, accurate predictions of the entire truckload of oranges can be made.

The Optimizer uses statistics when trying to determine the usefulness of available indexes. When determining the usefulness of an index, the Optimizer must determine two things:

  • What's the density of this index for this query? A low density is desired.

  • What's the selectivity of this index for this query? A high selectivity is desired.

III.2.4.1. Understanding the density of an index

Density refers to the number of duplicate rows in a column.

Consider a business that does business in Virginia Beach, Virginia. More than 90 percent of the customers live in Virginia Beach. The other 10 percent of the customers live in nearby cities or are tourists from around the country. ...

Get Microsoft® SQL Server™ 2008 All-In-One Desk Reference For Dummies® now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.