Principles of the EM algorithm

Because the latent variables are not observed, the likelihood of such a model is a marginal likelihood, obtained by summing out (or integrating out) the hidden variables. This marginalization couples the parameters together: the resulting likelihood no longer factorizes, which makes direct maximization hard.
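Concretely, if we write x for the observed variables, z for the hidden variables, and θ for the parameters (notation assumed here for illustration), the quantity to maximize is the marginal log-likelihood:

$$\log p(\mathbf{x} \mid \theta) = \log \sum_{\mathbf{z}} p(\mathbf{x}, \mathbf{z} \mid \theta)$$

The log of a sum does not decompose term by term the way the complete-data likelihood p(x, z | θ) would, which is the source of the difficulty.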

The EM algorithm deals with this problem by, in essence, filling in the missing data with their expected values under the current model. Iterating this process converges to a local maximum of the likelihood. The filling-in is achieved by computing the posterior distribution of the hidden variables given the current parameters and the observed variables; this is the E-step (E for Expectation). The M-step (M for Maximization) then re-estimates the parameters by maximizing the expected complete-data log-likelihood computed in the E-step, and the two steps are alternated until convergence.
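To make the two steps concrete, here is a minimal sketch of EM for a two-component univariate Gaussian mixture in R. This is a standard textbook example, not code from this chapter; all variable names are illustrative.

```r
# EM for a two-component Gaussian mixture: a minimal sketch.
# Hidden variable: the (unobserved) component label of each point.

set.seed(42)
# Simulated data drawn from a two-component mixture
x <- c(rnorm(150, mean = -2, sd = 1), rnorm(100, mean = 3, sd = 1.5))

# Initial parameter guesses
pi1 <- 0.5          # mixing weight of component 1
mu  <- c(-1, 1)     # component means
s   <- c(1, 1)      # component standard deviations

for (iter in 1:100) {
  ## E-step: posterior responsibility of component 1 for each point,
  ## given the current parameters -- the "filling-in" of the hidden labels
  d1 <- pi1 * dnorm(x, mu[1], s[1])
  d2 <- (1 - pi1) * dnorm(x, mu[2], s[2])
  r1 <- d1 / (d1 + d2)

  ## M-step: re-estimate the parameters by maximizing the expected
  ## complete-data log-likelihood (responsibility-weighted MLEs)
  pi1  <- mean(r1)
  mu[1] <- sum(r1 * x) / sum(r1)
  mu[2] <- sum((1 - r1) * x) / sum(1 - r1)
  s[1] <- sqrt(sum(r1 * (x - mu[1])^2) / sum(r1))
  s[2] <- sqrt(sum((1 - r1) * (x - mu[2])^2) / sum(1 - r1))
}

# Observed-data log-likelihood at the final parameters
loglik <- sum(log(pi1 * dnorm(x, mu[1], s[1]) +
                  (1 - pi1) * dnorm(x, mu[2], s[2])))
```

Each iteration provably does not decrease the observed-data log-likelihood, which is why tracking `loglik` across iterations is a common convergence check.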
