Metropolis-Hastings sampling

We have seen that the full joint probability distribution of a Bayesian network P(x1, x2, x3, ..., xN) can become intractable when the number of variables is large. The problem can become even harder when it's needed to marginalize it in order to obtain, for example, P(xi), because it's necessary to integrate a very complex function. The same problem happens when applying the Bayes' theorem in simple cases. Let's suppose we have the expression p(A|B) = K · P(B|A)P(A). I've expressly inserted the normalizing constant K, because if we know it, we can immediately obtain the posterior probability; however, finding it normally requires integrating P(B|A)P(A), and this operation can be impossible in closed form.

The ...

Get Mastering Machine Learning Algorithms now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.