55 Brief on Simulation and Resampling

Simulations such as the ones we have been doing can be put to practical use in much the same way we have been using them. They are particularly useful when the assumptions required by formula-based methods are not met. Theoreticians who study statistical phenomena and the detailed workings of statistical methods also often use simulation; they typically use more powerful tools than spreadsheet software, such as the open-source statistical toolkit R.

Resampling is analogous to the simulations we have been doing, but instead of repeatedly sampling random numbers in accordance with a preselected distribution, we would repeatedly (re)sample randomly selected values from a real data sample itself. Let's see how such an approach might work. The following example illustrates a popular approach to resampling called bootstrapping.

Say that we have a very unruly random sample of 500 incomes. We want to determine a 95% confidence interval for the population mean. We could do the following 1000 times: (1) randomly select (resample) 500 incomes from the overall sample, allowing any income to be selected any number of times, and (2) calculate the mean of that resample and save it on a list. Then, from the list of 1000 resample means, use the 25th smallest resample mean and the 25th largest resample mean as the boundaries for a 95% confidence interval; 25 is 2.5% of 1000, so this cuts off roughly 2.5% of the resample means in each tail. Alternatively, or in addition, the standard deviation of the 1000 resample means can be used as an estimate for standard error (see Appendix C). ...
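The bootstrapping steps just described can be sketched in a few lines of code. What follows is only a minimal illustration in Python, not the method or tool used in this book: the function name `bootstrap_mean_ci`, its parameters, and the synthetic `incomes` list (a skewed stand-in for the 500 real incomes) are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=1000, tail_count=25, seed=None):
    """Bootstrap a confidence interval and standard error for the mean.

    Hypothetical helper: with 1000 resamples and tail_count=25, the
    interval runs from the 25th smallest to the 25th largest resample
    mean, as in the income example above.
    """
    rng = random.Random(seed)
    n = len(sample)
    resample_means = []
    for _ in range(n_resamples):
        # Step 1: resample n values WITH replacement, so any value
        # can be selected any number of times.
        resample = rng.choices(sample, k=n)
        # Step 2: calculate the resample mean and save it on the list.
        resample_means.append(statistics.fmean(resample))

    resample_means.sort()
    lower = resample_means[tail_count - 1]  # 25th smallest resample mean
    upper = resample_means[-tail_count]     # 25th largest resample mean
    # The standard deviation of the resample means estimates the standard error.
    std_error = statistics.stdev(resample_means)
    return lower, upper, std_error


# Synthetic stand-in for the 500 "unruly" incomes (assumed data, not real).
data_rng = random.Random(0)
incomes = [data_rng.lognormvariate(10.5, 0.9) for _ in range(500)]

lower, upper, std_error = bootstrap_mean_ci(incomes, seed=1)
print(f"95% CI for the mean: {lower:,.0f} to {upper:,.0f}")
print(f"Estimated standard error: {std_error:,.0f}")
```

Passing a `seed` makes the run repeatable; without it, each run gives slightly different boundaries, which is expected since the resamples are random.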
