7.1 Parametric distribution estimation
7.1.1 Maximum likelihood estimation
We consider a family of probability distributions on R^m, indexed by a vector x ∈ R^n, with densities p_x(·). When considered as a function of x, for fixed y ∈ R^m, the function p_x(y) is called the likelihood function. It is more convenient to work with its logarithm, which is called the log-likelihood function, and denoted l:

    l(x) = log p_x(y).
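As a small illustration of the log-likelihood as a function of the parameter x for a fixed observation y, the following sketch uses a scalar Gaussian family. The choice of family, the parameterization x = (mu, sigma), the function names, and the numerical values are all assumptions made for this example, not part of the text. Returning -inf when sigma <= 0 is one possible convention for parameters outside the domain of the density.

```python
import math

def gaussian_density(y, x):
    """Density p_x(y) of a scalar Gaussian, with parameter x = (mu, sigma), sigma > 0."""
    mu, sigma = x
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood(x, y):
    """Log-likelihood l(x) = log p_x(y), viewed as a function of x for fixed y."""
    mu, sigma = x
    if sigma <= 0:
        # Assumed convention for this sketch: parameters outside the domain get -inf.
        return -math.inf
    # Computed directly in the log domain rather than as log(exp(...)).
    return -0.5 * ((y - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)

# A fixed observation and a few hypothetical candidate parameter values.
y = 1.3
candidates = [(0.0, 1.0), (1.0, 1.0), (1.3, 1.0)]

# Among the candidates, pick the one with the largest log-likelihood.
best = max(candidates, key=lambda x: log_likelihood(x, y))
```

Here `best` is the candidate whose mean coincides with the observation, since all three candidates share the same sigma and the log-likelihood then depends only on the squared distance between y and mu.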
There are often constraints on the values of the parameter x, which can represent prior knowledge about x, or the domain of the likelihood function. These constraints can be explicitly given, or incorporated into ...