## With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

No credit card required

## Lions and tigers and bears

I’ll start with a simplified version of the problem where we know that there are exactly three species. Let’s call them lions, tigers and bears. Suppose we visit a wild animal preserve and see 3 lions, 2 tigers and one bear.

If we have an equal chance of observing any animal in the preserve, the number of each species we see is governed by the multinomial distribution. If the prevalence of lions and tigers and bears is `p_lion` and `p_tiger` and `p_bear`, the likelihood of seeing 3 lions, 2 tigers and one bear is proportional to

`p_lion**3 * p_tiger**2 * p_bear**1`
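As a minimal sketch of how this expression can be used, the following plain-Python snippet evaluates it for two candidate sets of prevalences. The function names and the two hypotheses (`hypo_a`, `hypo_b`) are illustrative assumptions, not part of `thinkbayes`. The expression leaves out the multinomial coefficient, which depends only on the counts and so is the same for every hypothesis; it is computed here only to show what is being dropped.

```python
from math import factorial


def multinomial_likelihood(counts, prevalences):
    """Product of p_i ** n_i, without the multinomial coefficient."""
    like = 1.0
    for n, p in zip(counts, prevalences):
        like *= p ** n
    return like


def multinomial_coefficient(counts):
    """Number of orderings of the observations: N! / (n_1! * n_2! * ...)."""
    coef = factorial(sum(counts))
    for n in counts:
        coef //= factorial(n)
    return coef


counts = [3, 2, 1]  # lions, tigers, bears

# Two hypothetical sets of prevalences (each sums to 1).
hypo_a = [1.0 / 2, 1.0 / 3, 1.0 / 6]   # matches the observed rates
hypo_b = [1.0 / 3, 1.0 / 3, 1.0 / 3]   # all species equally common

like_a = multinomial_likelihood(counts, hypo_a)
like_b = multinomial_likelihood(counts, hypo_b)

# The coefficient is shared by both hypotheses, so it cancels in the ratio.
coef = multinomial_coefficient(counts)
ratio = like_a / like_b
```

Because only the ratio of likelihoods matters when comparing hypotheses, the shared coefficient can be ignored, which is why the simpler product is enough here.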

An approach that is tempting, but not correct, is to use beta distributions, as in “The beta distribution,” to describe the prevalence of each species separately. For example, we saw 3 lions and 3 non-lions; if we think of that as 3 “heads” and 3 “tails,” then the posterior distribution of `p_lion` is:

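```
import thinkbayes

beta = thinkbayes.Beta()  # default prior is uniform, Beta(1, 1)
beta.Update((3, 3))       # 3 lions ("heads"), 3 non-lions ("tails")
print(beta.MaximumLikelihood())
```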

The maximum likelihood estimate for `p_lion` is the observed rate, 50%. Similarly, the MLEs for `p_tiger` and `p_bear` are about 33% and 17%.
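These three numbers can be checked without `thinkbayes`. The sketch below assumes the usual conjugate update, where a uniform Beta(1, 1) prior becomes Beta(1 + heads, 1 + tails), and uses the standard formula for the mode of a beta density. The helper `beta_mode` and the dictionary names are hypothetical, introduced only for this illustration.

```python
def beta_mode(alpha, beta):
    """Mode of a Beta(alpha, beta) density; valid when alpha > 1 and beta > 1."""
    return (alpha - 1.0) / (alpha + beta - 2.0)


observed = {'lion': 3, 'tiger': 2, 'bear': 1}
total = sum(observed.values())

mles = {}
for species, seen in observed.items():
    # Treat each species separately: "heads" = this species, "tails" = any other.
    alpha = 1 + seen             # uniform prior plus "heads"
    beta = 1 + (total - seen)    # uniform prior plus "tails"
    mles[species] = beta_mode(alpha, beta)
```

With a uniform prior the mode of each posterior is just the observed fraction, so `mles` holds 3/6, 2/6 and 1/6 for lions, tigers and bears.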

But there are two problems:

1. We have implicitly used a prior for each species that is uniform from 0 to 1, but since we know that there are three species, that prior is not correct. The right prior should have a mean of 1/3, and there should be zero likelihood that any species has a prevalence of 100%.

2. The distributions for each species are not independent, because the prevalences have to add up to 1. To capture this ...
