Chapter 8. Estimation

The Estimation Game

Let’s play a game. I’ll think of a distribution, and you have to guess what it is. We’ll start out easy and work our way up.

I’m thinking of a distribution. I’ll give you two hints. It’s a normal distribution, and here’s a random sample drawn from it:

{−0.441, 1.774, −0.101, −1.138, 2.975, −2.138}

What do you think is the mean parameter, μ, of this distribution?

One choice is to use the sample mean to estimate μ. Up until now, we have used the symbol μ for both the sample mean and the mean parameter, but now to distinguish them I will use x̄ for the sample mean. In this example, x̄ is 0.155, so it would be reasonable to guess μ = 0.155.
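
As a quick check on that arithmetic, here is a minimal sketch that computes x̄ for the sample above. The names sample, sample_mean, and xbar are my own choices for illustration and do not come from the text.

```python
# The six values from the sample shown above.
sample = [-0.441, 1.774, -0.101, -1.138, 2.975, -2.138]

def sample_mean(xs):
    """Return the arithmetic mean of a sequence of numbers."""
    return sum(xs) / len(xs)

# xbar is the sample mean, used here as an estimate of mu.
xbar = sample_mean(sample)
print(round(xbar, 3))  # prints 0.155
```

The values sum to 0.931, and dividing by the sample size of 6 gives roughly 0.155, which matches the estimate above.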

This process is called estimation, and the statistic we used (the sample mean) is called an estimator.

Using the sample mean to estimate μ is so obvious that it is hard to imagine a reasonable alternative. But suppose we change the game by introducing outliers.

I’m thinking of a distribution. It’s a normal distribution, and here’s a sample that was collected by an unreliable surveyor who occasionally puts the decimal point in the wrong place.

{−0.441, 1.774, −0.101, −1.138, 2.975, −213.8}

Now what’s your estimate of μ? If you use the sample mean, your guess is −35.12. Is that the best choice? What are the alternatives?

One option is to identify and discard outliers, then compute the sample mean of the rest. Another option is to use the median as an estimator.
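
To make the comparison concrete, here is a rough sketch that computes all three estimates for the sample with the misplaced decimal point. The function mean_without_outliers and its cutoff of 10 are assumptions I made up for this example; the text does not say how outliers should be identified, and a fixed cutoff is only one of many possible rules.

```python
import statistics

# The sample containing the value with the misplaced decimal point.
sample = [-0.441, 1.774, -0.101, -1.138, 2.975, -213.8]

def mean_without_outliers(xs, cutoff=10.0):
    """Mean of the values whose magnitude is below a cutoff.

    The cutoff is a hypothetical choice for this example only.
    """
    kept = [x for x in xs if abs(x) < cutoff]
    return sum(kept) / len(kept)

raw_mean = statistics.mean(sample)       # dragged far from zero by -213.8
trimmed = mean_without_outliers(sample)  # mean of the five remaining values
med = statistics.median(sample)          # average of the two middle values

print(round(raw_mean, 2))  # prints -35.12
print(round(trimmed, 3))   # prints 0.614
print(round(med, 3))       # prints -0.271
```

With this sample, the single bad value drags the sample mean to −35.12, while the other two estimates stay within one unit of zero.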

Which estimator is the best depends on the circumstances (for ...
