Construct and interpret a 95% confidence interval for the population proportion

When the characteristic being measured is categorical, the usual goal is to estimate the proportion (or percentage) of the population that falls into a category of interest. Examples of categorical characteristics include opinion on an issue (support, oppose, or neutral), gender, political party, and type of behavior (do or don't wear a seat belt while driving).

For example, consider the percentage of people in favor of a four-day work week, the percentage of Republicans who voted in the last election, or the proportion of drivers who don't wear seat belts. In each of these cases, the objective is to estimate a population proportion, p, using a sample proportion, p̂ (read "p-hat"), plus or minus a margin of error. The result is called a confidence interval (CI) for the population proportion, p.

The formula for a CI for a population proportion is

$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

where p̂ is the sample proportion, n is the sample size, and z* is the appropriate value from the standard normal distribution for your desired confidence level. The following table shows values of z* for certain confidence levels.

z*-values for Various Confidence Levels

Confidence Level    z*-value
80%                 1.28
90%                 1.645 (by convention)
95%                 1.96
98%                 2.33
99%                 2.58

To calculate a CI for a population proportion:
  1. Determine the confidence level and find the appropriate z*-value.

    Refer to the above table for z*-values.

  2. Find the sample proportion, p̂, by dividing the number of people in the sample having the characteristic of interest by the sample size (n).

    Note: This result should be a decimal value between 0 and 1.

  3. Multiply p̂ by (1 - p̂) and then divide that amount by n.

  4. Take the square root of the result from Step 3.

  5. Multiply your answer by z*.

    This step gives you the margin of error.

  6. Take p̂ plus or minus the margin of error to obtain the CI; the lower end of the CI is p̂ minus the margin of error, and the upper end of the CI is p̂ plus the margin of error.
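The six steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `proportion_ci`, the lookup dictionary `Z_STAR`, and the sample of 40 successes out of 200 are hypothetical choices made for this sketch. The z*-values are the ones listed in the table.

```python
from math import sqrt

# z*-values taken from the table of confidence levels
Z_STAR = {0.80: 1.28, 0.90: 1.645, 0.95: 1.96, 0.98: 2.33, 0.99: 2.58}

def proportion_ci(successes, n, confidence=0.95):
    """Return (lower, upper, margin) for a large-sample CI for a proportion."""
    z_star = Z_STAR[confidence]            # Step 1: look up z*
    p_hat = successes / n                  # Step 2: sample proportion
    variance = p_hat * (1 - p_hat) / n     # Step 3: p_hat(1 - p_hat)/n
    std_error = sqrt(variance)             # Step 4: square root
    margin = z_star * std_error            # Step 5: margin of error
    return p_hat - margin, p_hat + margin, margin   # Step 6: interval ends

# Hypothetical sample: 40 of 200 people have the characteristic of interest.
lower, upper, margin = proportion_ci(40, 200)
print(f"{lower:.4f} to {upper:.4f} (margin of error {margin:.4f})")
```

Keeping each step on its own line makes it easy to compare intermediate values with a hand calculation.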

This formula for a CI for p applies when the sample size is large enough for the Central Limit Theorem to justify using a z*-value, as is typically the case when you estimate proportions from large-scale surveys. Confidence intervals for proportions based on small samples are typically beyond the scope of an intro statistics course.

For example, suppose you want to estimate the percentage of the time (with 95% confidence) you’re expected to get a red light at a certain intersection. Suppose you take a random sample of 100 different trips through this intersection and you find that a red light was hit 53 times.
  1. Because you want a 95 percent confidence interval, your z*-value is 1.96.

  2. The red light was hit 53 out of 100 times. So p̂ = 53/100 = 0.53.

  3. Find p̂(1 - p̂)/n = 0.53(1 - 0.53)/100 = 0.2491/100 = 0.002491.

  4. Take the square root to get 0.0499.

    The margin of error is, therefore, plus or minus 1.96 ∗ 0.0499 = 0.0978, or 9.78%.

  5. Your 95 percent confidence interval for the percentage of times you will hit a red light at that particular intersection is 0.53 (or 53 percent), plus or minus 0.0978 (rounded to 0.10, or 10 percent).

    (The lower end of the interval is 0.53 – 0.10 = 0.43 or 43 percent; the upper end is 0.53 + 0.10 = 0.63 or 63 percent.)

    To interpret these results within the context of the problem, you can say that with 95 percent confidence the percentage of the times you should expect to hit a red light at this intersection is somewhere between 43 percent and 63 percent, based on your sample. You might want to try a different route!

    Confidence intervals can be used to estimate several population parameters. One type of parameter that can be estimated using inferential statistics is a population proportion. For example, we may want to know the percentage of the U.S. population who supports a particular piece of legislation. For this type of question, we need to find a confidence interval.

    In this article, we will see how to construct a confidence interval for a population proportion, and examine some of the theory behind this.

    Overall Framework

    We begin by looking at the big picture before we get into the specifics. The type of confidence interval that we will consider is of the following form:

    Estimate +/- Margin of Error

    This means that there are two numbers that we will need to determine. These values are an estimate for the desired parameter, along with the margin of error.

    Conditions

    Before conducting any statistical test or procedure, it is important to make sure that all of the conditions are met. For a confidence interval for a population proportion, we need to make sure that the following hold:

    • We have a simple random sample of size n from a large population
    • Our individuals have been chosen independently of one another.
    • There are at least 15 successes and 15 failures in our sample.

    If the last item is not satisfied, then it may be possible to adjust our sample slightly and to use a plus-four confidence interval. In what follows, we will assume that all of the above conditions have been met.
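As a rough sketch of these checks, the Python below tests the 15-successes-and-15-failures condition and, when it fails, computes a plus-four interval. The text does not spell out the plus-four method, so this assumes the common textbook version (add two successes and two failures, then apply the usual formula). The function names and the sample of 8 successes in 40 trials are hypothetical.

```python
from math import sqrt

def meets_large_sample_condition(successes, n, minimum=15):
    """True when the sample has at least `minimum` successes and failures."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

def plus_four_ci(successes, n, z_star=1.96):
    """Assumed plus-four interval: add 2 successes and 2 failures first."""
    p_tilde = (successes + 2) / (n + 4)
    margin = z_star * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - margin, p_tilde + margin

# Hypothetical sample: 8 successes in 40 trials (fewer than 15 successes).
if not meets_large_sample_condition(8, 40):
    low, high = plus_four_ci(8, 40)
    print(f"plus-four 95% interval: {low:.4f} to {high:.4f}")
```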

    Sample and Population Proportions

    We start with the estimate for our population proportion. Just as we use a sample mean to estimate a population mean, we use a sample proportion to estimate a population proportion. The population proportion is an unknown parameter. The sample proportion is a statistic. This statistic is found by counting the number of successes in our sample and then dividing by the total number of individuals in the sample.

    The population proportion is denoted by p and is self-explanatory. The notation for the sample proportion is a little more involved. We denote a sample proportion as p̂, and we read this symbol as "p-hat" because it looks like the letter p with a hat on top.

    This becomes the first part of our confidence interval. The estimate of p is p̂.

    Sampling Distribution of Sample Proportion

    To determine the formula for the margin of error, we need to think about the sampling distribution of p̂. We will need to know the mean, the standard deviation, and the particular distribution that we are working with.

    The number of successes in the sample follows a binomial distribution with n trials and probability of success p. Dividing that count by n gives p̂, which therefore has a mean of p and a standard deviation of $\sqrt{p(1 - p)/n}$. There are two problems with this.

    The first problem is that a binomial distribution can be tricky to work with: the factorials in its formula can produce very large numbers. This is where the conditions help us. As long as our conditions are met, we can approximate the binomial distribution with a normal distribution.
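To see how close the approximation can be, the sketch below compares an exact binomial probability with its normal approximation. The values n = 100, p = 0.64, and k = 70 are hypothetical, and the continuity correction of 0.5 is a standard refinement that the text does not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a 0.5 continuity correction."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)

# Hypothetical values chosen only for illustration.
n, p, k = 100, 0.64, 70
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

With at least 15 successes and 15 failures expected, the two probabilities should be close, which is what makes the z*-based interval reasonable.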

    The second problem is that the standard deviation of p̂ uses p in its definition. We would be using the unknown parameter p to compute the margin of error for an estimate of p itself. This circular reasoning is a problem that needs to be fixed.

    The way out of this conundrum is to replace the standard deviation with its standard error. Standard errors are based upon statistics, not parameters. A standard error is used to estimate a standard deviation. What makes this strategy worthwhile is that we no longer need to know the value of the parameter p.

    Formula

    To use the standard error, we replace the unknown parameter p with the statistic p̂. The result is the following formula for a confidence interval for a population proportion:

    $$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

    Here the value of z* is determined by our level of confidence C. For the standard normal distribution, exactly C percent of the standard normal distribution is between -z* and z*. Common values for z* include 1.645 for 90% confidence and 1.96 for 95% confidence.
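The relationship between the confidence level C and z* can be sketched with Python's standard library. The helper name `z_star` is hypothetical; `NormalDist().inv_cdf` is the standard normal quantile function in the `statistics` module.

```python
from statistics import NormalDist

def z_star(confidence):
    """Value z* such that `confidence` of the standard normal lies in (-z*, z*)."""
    tail = (1 - confidence) / 2          # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
```

Rounded to two or three decimals, these agree with the commonly tabulated values such as 1.645 and 1.96.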

    Example

    Let's see how this method works with an example. Suppose that we wish to know with 95% confidence the percent of the electorate in a county that identifies itself as Democratic. We conduct a simple random sample of 100 people in this county and find that 64 of them identify as a Democrat.

    We see that all of the conditions are met. The estimate of our population proportion is 64/100 = 0.64. This is the value of the sample proportion p̂, and it is the center of our confidence interval.

    The margin of error consists of two pieces. The first is z*. As we said, for 95% confidence, the value of z* is 1.96.

    The other part of the margin of error is the standard error, $\sqrt{\hat{p}(1 - \hat{p})/n}$. We set p̂ = 0.64 and calculate the standard error to be $\sqrt{0.64(0.36)/100} = 0.048$.

    We multiply these two numbers together and obtain a margin of error of 0.09408. The end result is:

    0.64 +/- 0.09408,

    or we can rewrite this as 54.592% to 73.408%. Thus we are 95% confident that the true population proportion of Democrats is somewhere in the range of these percentages. This means that in the long run, our technique and formula will capture the population proportion 95% of the time.
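The long-run claim can be illustrated with a small simulation. This is only a sketch: it assumes a true proportion of 0.64 (in practice p is unknown), and the function name `coverage`, the seed, and the number of trials are arbitrary choices.

```python
import random
from math import sqrt

def coverage(p, n, trials, z_star=1.96, seed=0):
    """Fraction of simulated samples whose interval contains the true p."""
    rng = random.Random(seed)            # fixed seed for repeatability
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            hits += 1
    return hits / trials

# Assumed true proportion 0.64 with samples of 100, repeated 2000 times.
rate = coverage(p=0.64, n=100, trials=2000)
print(f"Intervals containing p: {rate:.1%}")
```

The printed rate should land near 95%, though it will not match exactly because the normal approximation is itself approximate and the simulation is random.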

    Related Ideas

    There are a number of ideas and topics that are connected to this type of confidence interval. For instance, we could conduct a hypothesis test pertaining to the value of the population proportion. We could also compare two proportions from two different populations.

    Cite this Article

    Taylor, Courtney. "How to Construct a Confidence Interval for a Population Proportion." ThoughtCo. //www.thoughtco.com/confidence-interval-for-a-population-proportion-4045770 (accessed December 30, 2022).

    How to construct a 95% confidence interval for the population proportion?

    To calculate the confidence interval, we must find p′ and q′. Here p′ = 0.842 is the sample proportion; this is the point estimate of the population proportion. Since the requested confidence level is CL = 0.95, α = 1 – CL = 1 – 0.95 = 0.05 and α/2 = 0.025.

    How would you interpret a 95% confidence interval for the mean?

    A 95% confidence interval (CI) of the mean is a range with an upper and lower number calculated from a sample. Because the true population mean is unknown, this range describes possible values that the mean could be.
