Uncertainty arises in estimating the true, or population, values of statistics such as the mean, standard deviation, proportions, and percentiles because the estimate is typically computed from a relatively small data sample and then presumed to apply to the entire data population. Data samples are usually small because of the cost of collecting the samples and having them analyzed in a laboratory. For example, suppose 10 soil samples randomly collected from a contaminated site are sent to a laboratory to be analyzed for benzene, a carcinogenic chemical. The laboratory returns 10 benzene measurements, one per soil sample. These measurements constitute our benzene *data sample*, from which we obtain, say, 18.5 mg/kg as the arithmetic mean benzene concentration. Suppose two other sets of 10 randomly collected soil samples from the same site produce mean benzene concentrations of 5.2 and 25.9 mg/kg, respectively. Which of these three results, if any, represents the *true* mean benzene concentration for the entire benzene *data population* at the site?
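The spread among sample means can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is hypothetical (10,000 right-skewed, lognormally distributed concentrations whose parameters were chosen arbitrarily), and neither the numbers nor the helper name `sample_means` come from the site described above.

```python
import random
import statistics


def sample_means(population, n, draws, seed=0):
    """Return the arithmetic means of `draws` random samples of size `n`.

    Each sample is drawn without replacement from `population`.
    """
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]


# Hypothetical data population: benzene concentrations (mg/kg) at 10,000
# possible sampling locations. The lognormal parameters are assumptions
# made for illustration only, not values from the text.
_rng = random.Random(42)
population = [_rng.lognormvariate(2.0, 1.0) for _ in range(10_000)]

# In practice the population mean is unknown; here it can be computed
# because the whole population was simulated.
true_mean = statistics.mean(population)

# Three independent data samples of 10 measurements each.
means = sample_means(population, n=10, draws=3, seed=1)

print(f"population mean: {true_mean:.1f} mg/kg")
for i, m in enumerate(means, start=1):
    print(f"sample {i} mean:   {m:.1f} mg/kg")
```

Each run of three samples gives three different means, none of which has to equal the population mean. Raising `n` in the call to `sample_means` narrows the spread, which is why small data samples leave so much uncertainty about the true value.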

To further complicate the situation, the screening level (i.e., allowable concentration) for soil benzene at commercial sites is 5.4 mg/kg. This means that if the true mean benzene concentration at the site is 5.2 mg/kg, there would ...
