Rationale for Using the Beta Distribution

At each location, z, we have specified a prior probability, p(z), that a sample taken there will exceed the release criterion. Suppose now that a sample is taken at location z. This is called a Bernoulli trial. Let y(z) = 1 if a measurement at location z exceeds the release criterion, and let y(z) = 0 if it does not. Such a function is called an indicator function, and we have that

$$P(y(z) = x) = p(z)^{x}\,[1 - p(z)]^{1 - x}, \quad x = 0 \text{ or } 1.$$

This is simply a binomial probability distribution with n = 1. For n trials (samples), we have

$$P(X = x) = \binom{n}{x}\,p^{x}(1 - p)^{n - x}, \quad x = 0, \ldots, n.$$
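As a concrete check, the two probability mass functions above can be computed directly; this is a minimal Python sketch (the function names are mine, not part of the source analysis):

```python
from math import comb

def bernoulli_pmf(x, p):
    """P(y(z) = x) for a single Bernoulli trial: p^x (1-p)^(1-x), with x = 0 or 1."""
    return p ** x * (1 - p) ** (1 - x)

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials: C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# The Bernoulli trial is the n = 1 case of the binomial distribution,
# and the binomial pmf sums to 1 over x = 0, ..., n.
```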

Now suppose that we consider p(z) not as a fixed constant, but as a random variable itself. There will be a probability distribution describing how likely it is that p(z) takes on each value between zero and one, say f(p). One possible candidate for f(p) is the Beta distribution,

$$f(p) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,p^{\alpha - 1}(1 - p)^{\beta - 1}, \quad 0 \le p \le 1.$$
For various values of α and β, the Beta distribution can produce a wide variety of shapes (i.e., it is a rich family of distributions to choose from). Notice that this distribution has a form very similar to that of the binomial probability distribution. In fact, the information implied by the specification of a particular Beta distribution with parameters α and β is the same as that which would be obtained from a series of Bernoulli trials with α - 1 samples above the release criterion and β - 1 samples below the release criterion. α = β = 1 corresponds to no information, and it also corresponds to a prior distribution which is uniform for p between zero and one.
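To illustrate the density and its pseudo-count interpretation, here is a small Python sketch using only the standard library (the name `beta_pdf` and the example parameter values are my own):

```python
from math import gamma

def beta_pdf(p, a, b):
    """Beta density: Gamma(a+b) / (Gamma(a) Gamma(b)) * p^(a-1) (1-p)^(b-1)."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * p ** (a - 1) * (1 - p) ** (b - 1)

# a = b = 1: the uniform prior -- density 1 at every p in (0, 1), i.e. no information.
# a = 3, b = 2: as if 2 prior samples exceeded and 1 fell below the release criterion.
```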

The most important reason for using the Beta distribution becomes apparent when we apply Bayes' theorem to update the prior distribution using the hard data we obtain.

Bayes' theorem states that the updated probability distribution for p, given the result y of the sample, is

$$f(p \mid y) = \frac{f(y \mid p)\,f(p)}{\int_0^1 f(y \mid p)\,f(p)\,dp}.$$

Substituting in

$$f(y \mid p) = p^{y}(1 - p)^{1 - y}$$

and

$$f(p) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\,p^{\alpha - 1}(1 - p)^{\beta - 1},$$

we find that

$$f(p \mid y) = \frac{\Gamma(\alpha + \beta + 1)}{\Gamma(\alpha + y)\,\Gamma(\beta + 1 - y)}\,p^{\alpha + y - 1}(1 - p)^{\beta - y}.$$
That is, the updated posterior probability distribution for p is also a Beta distribution. The Beta distribution is called a conjugate prior for the binomial likelihood.

In the case of n independent samples, the outcome that x of these exceed the release criterion results in the prior distribution Beta(α, β) being updated to the posterior distribution Beta(α + x, β + n - x). Unfortunately, this analysis assumes that the result at each point z is independent of the result at any other point, and that the probability of exceeding the release criterion is the same at all possible sampling points.
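The conjugate update reduces to simple parameter arithmetic; a minimal Python sketch under the Beta(α, β) parameterization used in this section (function names are mine):

```python
def update_beta(a, b, x, n):
    """Beta(a, b) prior plus x exceedances in n samples -> Beta(a + x, b + n - x)."""
    return a + x, b + n - x

def beta_mean(a, b):
    """Mean of Beta(a, b): a / (a + b), the expected probability of exceedance."""
    return a / (a + b)

# Uniform prior (a = b = 1), 3 of 10 samples exceed the criterion:
a_post, b_post = update_beta(1, 1, 3, 10)   # posterior is Beta(4, 8)
```

Note that the posterior mean (α + x)/(α + β + n) shrinks the observed fraction x/n toward the prior mean α/(α + β), with the prior acting as α + β - 2 pseudo-samples.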