Part IV

Density Estimation and Smoothing

There are three concepts that connect the remaining chapters in this book. First, the methods are generally nonparametric. The lack of a formal statistical model introduces computing tasks beyond straightforward parameter estimation.

Second, the methods are generally intended for description rather than formal inference. We may wish to describe the probability distribution of a random variable or estimate the relationship between several random variables.
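As a small illustration of describing a distribution without a parametric model, the sketch below builds a histogram density estimate with NumPy. The simulated normal sample, the seed, and the choice of 20 bins are assumptions made only for this example; they are not taken from the text.

```python
import numpy as np

# Hypothetical data: 500 draws from a standard normal distribution.
rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=500)

# density=True rescales the bin counts so the bars integrate to one,
# which makes the histogram a crude estimate of the density function.
density, edges = np.histogram(sample, bins=20, density=True)

# Check that the estimate is a valid density: nonnegative, total area one.
widths = np.diff(edges)
total_area = float(np.sum(density * widths))
print(f"area under the histogram: {total_area:.6f}")
```

The result is purely descriptive: it summarizes where the observed values concentrate without fitting any parameters of an assumed distribution.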

The most interesting questions in statistics ask how one thing depends on another. The paragon of all statistical strategies for addressing this question is the concept of regression (with all its forms, generalizations, and analogs), which describes how the conditional distribution of some variables depends on the value of other variables.

The standard regression approach is parametric: One assumes an explicit, parameterized functional relationship between variables and then estimates the parameters using the data. This philosophy embraces the rigid assumption of a prespecified form for the regression function and in exchange enjoys the potential benefits of simplicity. Typically, all the data contribute to parameter estimation and hence to the global fit. The opposite trade-off is possible, however. We can reject the parametric assumptions in order to express the relationship more flexibly, but the estimated relationship can be more complex.
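The trade-off can be made concrete with a minimal sketch that fits the same simulated data two ways. The sine-shaped truth, the noise level, the helper name `running_mean`, and the neighborhood size `k=21` are all assumptions chosen for illustration. The running mean stands in for flexible smoothers generally and is not presented as the book's method.

```python
import numpy as np

# Hypothetical data: a curved relationship observed with noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 2.0 * np.pi, 200))
truth = np.sin(x)
y = truth + rng.normal(0.0, 0.2, x.size)

# Parametric approach: assume y = b0 + b1 * x and estimate the two
# parameters by least squares. Every observation affects the global fit.
b1, b0 = np.polyfit(x, y, deg=1)
linear_fit = b0 + b1 * x

def running_mean(x, y, k=21):
    """Average the responses of the k points nearest to each x value."""
    fitted = np.empty_like(y)
    for i, x0 in enumerate(x):
        nearest = np.argsort(np.abs(x - x0))[:k]
        fitted[i] = y[nearest].mean()
    return fitted

# Nonparametric approach: no functional form is assumed, and each
# fitted value depends only on nearby observations.
smooth_fit = running_mean(x, y)

# Compare each fit with the known truth (available only in simulation).
mse_linear = np.mean((linear_fit - truth) ** 2)
mse_smooth = np.mean((smooth_fit - truth) ** 2)
print(f"linear fit MSE: {mse_linear:.4f}, running mean MSE: {mse_smooth:.4f}")
```

The straight line is summarized by two numbers but cannot follow the curvature. The running mean tracks the curve, yet it has no compact formula and is described only by its fitted values.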

Generally, we will call these approaches ...
