The effect of different priors. The blue solid line is the uniform prior we used in the last post, the red dashed line is a beta distribution with alpha and beta both equal to 0.5, and the green dotted line is a beta distribution with alpha and beta both equal to 100. The two alternative priors reflect slightly different assumptions in the conditioning information I. The three numbers at the bottom right are the number of heads, the number of tails, and the total number of tosses. For comparison purposes, the curves are scaled vertically so that the maximum value is 1.

The figure above shows three different priors; the red and green lines are beta distributions with different parameters. (The beta distribution is the conjugate prior of the binomial distribution, which makes the computation easier, since the posterior is then also a beta distribution.)
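Conjugacy means the posterior can be written down in closed form: a Beta(alpha, beta) prior combined with H observed heads and T observed tails gives a Beta(alpha + H, beta + T) posterior. A minimal sketch of this update, assuming SciPy is available (the `coin_posterior` helper name is mine, not from the original script):

```python
from scipy.stats import beta

def coin_posterior(a, b, heads, tails):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta(a + H, b + T)."""
    return beta(a + heads, b + tails)

# Example: the Beta(0.5, 0.5) prior after observing 7 heads and 3 tails.
post = coin_posterior(0.5, 0.5, 7, 3)
# Posterior mean is (a + H) / (a + b + H + T) = 7.5 / 11.
print(post.mean())
```

No numerical integration is needed; updating the posterior is just adding the head and tail counts to the prior's parameters.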

We can see that the three posterior pdfs are quite different initially and then gradually converge to the same answer. This makes sense: when we have little data, our initial beliefs have a large effect on the posterior pdfs. As more and more data come in, the posterior becomes dominated by the likelihood function and increasingly insensitive to the prior we chose. We can also see that the blue and red curves converge very quickly, while it takes much more observed data for the green curve to shift to the same result. The reason lies in the prior we chose: in the first subplot, the green curve has a relatively narrow peak around 0.5, which means we start with a strong belief that the coin is fair, so it takes many more observations to change that belief. That is why the green curve takes much longer to move to the true answer.
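The different convergence speeds can be checked numerically by tracking the posterior mean under each prior as the sample grows. A sketch under assumed values (a hypothetical coin that lands heads 70% of the time; the counts are deterministic for clarity, not from the original experiment):

```python
# Three priors: uniform Beta(1, 1), Beta(0.5, 0.5), and the strong Beta(100, 100).
priors = {"uniform": (1, 1), "beta(0.5, 0.5)": (0.5, 0.5), "beta(100, 100)": (100, 100)}

for n in (10, 100, 1000):
    heads = int(0.7 * n)          # hypothetical coin with 70% heads
    tails = n - heads
    for name, (a, b) in priors.items():
        # Posterior mean of Beta(a + H, b + T).
        mean = (a + heads) / (a + b + n)
        print(f"n={n:4d}  {name:15s}  posterior mean = {mean:.3f}")
```

With only 10 tosses the Beta(100, 100) posterior mean is still pinned near 0.5, while the weak priors are already close to the observed frequency; even after 1000 tosses the strong prior still pulls the estimate noticeably toward 0.5.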

You can download the Python script from my GitHub: Qingkai's github
