There are several ways to compute how good an estimate this is. First,
we can put an error bar on the value. The standard way of computing
the size of the error of an experimental measurement is to use what is
called the standard error in the sample mean, which is the standard
deviation of the mean. This is the measured standard deviation divided
by the square root of the number of samples. Since this is a binomial
process (either the outcome occurs or it doesn't), the formula for the
standard deviation of a binomial gives the formula for the standard
error, which is

$$\sigma_{\hat{p}} = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{N}}.$$

Here $\hat{p}$ is the estimated probability and $N$ is the number of
points in the sample. Thus, as the size of the
sample increases, the uncertainty of the estimate decreases like one
over the square root of the size of the sample. In the example above,
our estimate for the probability that a flipped thumbtack lands pointy
side up would be .
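As a minimal sketch of the error-bar calculation described above, the following Python computes the estimate and its standard error from a count of outcomes. The function name `standard_error` and the counts used (30 occurrences in 100 trials) are hypothetical and chosen only for illustration; they are not the figures from the thumbtack example.

```python
import math

def standard_error(successes, n):
    """Return (estimate, standard error) for a binomial sample.

    successes: number of trials in which the outcome occurred
    n:         total number of trials
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    # Standard deviation of a single binomial trial, divided by sqrt(n).
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, se

# Hypothetical counts, not taken from the text.
p_hat, se = standard_error(30, 100)
print(f"{p_hat:.3f} +/- {se:.3f}")
```

With these made-up counts the estimate would be reported as about 0.30 plus or minus 0.05.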
Another approach is to use what are called confidence
intervals. Given the estimated value $\hat{p}$ of the probability $p$, a
confidence interval is a range of values in which the true value of $p$
is likely to lie. By ``likely'' one often means that the probability that
the true value falls in the interval is 95%. This is called the
95% confidence interval.
You might know that for a normal distribution the data are expected to
fall within one standard deviation of the mean about 68% of the time,
and within two standard deviations about 95% of the time. One says that
one has 95% confidence that the true value is between the estimate minus
two standard errors and the estimate plus two standard errors. Now, the
distribution of the measured estimate is not normal; it is binomial.
However, a binomial distribution can be approximated by a normal
distribution if the value of $p$ is not too close to 0 or 1. Alternatively,
one can use a binomial table. The figure below shows a graph of the 95%
confidence intervals.
Figure: Confidence intervals for a binomial variable. $\hat{p}$ is the measured value; $p$ is the true value. The curves are labelled by the number of sample points.
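The two routes to a 95% interval mentioned above can be sketched as follows. The first function uses the normal approximation (estimate plus or minus two standard errors). The second stands in for a binomial table by solving for the exact binomial tail probabilities numerically, which is the construction usually called the Clopper-Pearson interval; that choice, the function names, and the counts in the usage lines are assumptions made for illustration, not taken from the text.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_interval(k, n, z=2.0):
    """Estimate +/- z standard errors, clipped to [0, 1]."""
    p_hat = k / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - z * se), min(1.0, p_hat + z * se)

def _solve(f, target, increasing, iters=60):
    """Bisection on [0, 1] for f(p) = target, with f monotonic in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(k, n, alpha=0.05):
    """Exact binomial (Clopper-Pearson) interval for k occurrences in n trials."""
    # Lower end: smallest p for which seeing k or more is not too unlikely.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper end: largest p for which seeing k or fewer is not too unlikely.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper

# Hypothetical counts, not taken from the text.
print("normal approximation:", normal_interval(30, 100))
print("exact binomial:      ", exact_interval(30, 100))
```

For estimates near the middle of the range and a reasonably large sample the two intervals should nearly agree; they differ most when the estimate is close to 0 or 1, which is where the normal approximation is least trustworthy.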
In the example above, we would be 95% confident that the true value
for the probability of a flipped thumbtack landing pointy side up is
between and . If a more accurate estimate is desired, a
larger number of experiments is required. The width of the interval
decreases like one over the square root of the number of experiments.
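To make the scaling concrete, here is a small sketch that inverts the two-standard-error rule to estimate how many experiments are needed for a desired interval half-width. The function name `required_samples`, the guessed probability 0.3, and the target half-widths are hypothetical; the formula assumes the normal approximation holds.

```python
import math

def required_samples(p_guess, half_width, z=2.0):
    """Approximate number of trials so that z standard errors <= half_width.

    Solves z * sqrt(p(1-p)/N) = half_width for N, using a prior guess for p.
    """
    if not 0.0 < p_guess < 1.0:
        raise ValueError("p_guess must lie strictly between 0 and 1")
    if half_width <= 0.0:
        raise ValueError("half_width must be positive")
    return math.ceil(z**2 * p_guess * (1.0 - p_guess) / half_width**2)

# Hypothetical targets: halving the half-width roughly quadruples the sample.
for w in (0.10, 0.05, 0.025):
    print(f"half-width {w}: about {required_samples(0.3, w)} experiments")
```

Because the width falls only as one over the square root of the sample size, each halving of the interval costs roughly four times as many experiments.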
Jon Shapiro
1999-09-23