12.3 Kernel density models

One problem with empirical distributions is that they are always discrete. If it is known that the true distribution is continuous, the empirical distribution may be viewed as a poor approximation. In this section, a method of obtaining a smooth, empirical-like distribution is introduced. Recall from Definition 11.4 that the idea is to replace each discrete piece of probability by a continuous random variable. While not necessary, it is customary that the continuous variable have a mean equal to the value of the point it replaces, ensuring that the kernel estimate has the same mean as the empirical estimate. One way to think about such a model is that it produces the final observed value in two steps. The first step is to draw a value at random from the empirical distribution. The second step is to draw a value at random from a continuous distribution whose mean is equal to the value drawn at the first step. The selected continuous distribution is called the kernel.
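The two-step description above can be sketched in code. This is only an illustration under stated assumptions: the kernel is taken to be uniform on [y - b, y + b] (which has mean y), and the function name `sample_kernel_model`, the half-width `bandwidth`, and the sample data are hypothetical choices, not taken from the text.

```python
import numpy as np

def sample_kernel_model(data, bandwidth, size, rng=None):
    """Draw `size` values from a kernel-smoothed empirical distribution.

    Assumption: the kernel is uniform on [y - bandwidth, y + bandwidth],
    so its mean equals the value y drawn in the first step.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    # Step 1: draw from the empirical distribution
    # (each observation is equally likely, repeats included).
    centers = rng.choice(data, size=size, replace=True)
    # Step 2: draw from a continuous distribution whose mean is the step-1 value.
    return centers + rng.uniform(-bandwidth, bandwidth, size=size)

# Hypothetical data, for illustration only.
draws = sample_kernel_model([1.0, 2.0, 5.0], bandwidth=0.5, size=10_000)
```

Because the kernel is centered at the value drawn in the first step, the average of a large number of such draws should be close to the sample mean of the data.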

For notation, let p(y_j) be the probability assigned to the value y_j (j = 1, …, k) by the empirical distribution. Let K_y(x) be a distribution function for a continuous distribution with mean y, and let k_y(x) be the corresponding density function.

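Definition 12.2 A kernel density estimator of a distribution function is

$$\hat{F}(x) = \sum_{j=1}^{k} p(y_j) K_{y_j}(x),$$

and the estimator of the density function is

$$\hat{f}(x) = \sum_{j=1}^{k} p(y_j) k_{y_j}(x).$$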

The function k_y(x
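As a rough numerical illustration of Definition 12.2, the sketch below evaluates both estimators as probability-weighted sums of a kernel centered at each distinct data value y_j, using the notation p(y_j), K_y(x), and k_y(x) introduced above. The uniform kernel, the function name `kernel_estimates`, the `bandwidth` parameter, and the data are assumptions made here for illustration; other kernels with mean y would fit the definition equally well.

```python
import numpy as np

def kernel_estimates(x, data, bandwidth):
    """Return (F_hat, f_hat) evaluated at the points x.

    Assumption: uniform kernel on [y_j - bandwidth, y_j + bandwidth].
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Distinct values y_j and their empirical probabilities p(y_j).
    y, counts = np.unique(np.asarray(data, dtype=float), return_counts=True)
    p = counts / counts.sum()
    # z[i, j] = x_i - y_j, the offset of each evaluation point from each center.
    z = x[:, None] - y[None, :]
    # Uniform kernel density k_{y_j}(x) and distribution function K_{y_j}(x).
    k = (np.abs(z) <= bandwidth) / (2.0 * bandwidth)
    K = np.clip((z + bandwidth) / (2.0 * bandwidth), 0.0, 1.0)
    # Weighted sums over j give the two estimators.
    return K @ p, k @ p

# Hypothetical data, for illustration only.
F_hat, f_hat = kernel_estimates([1.5, 2.0], data=[1, 2, 2, 5], bandwidth=1.0)
```

With these hypothetical data, p = (0.25, 0.5, 0.25) on the values 1, 2, 5, and at x = 2 the sketch gives a distribution function estimate of 0.5 and a density estimate of 0.375.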


Loss Models: From Data to Decisions, 4th Edition by
