How to Find Confidence Intervals for a Single Proportion

In statistics, sample data are often used to estimate population parameters. Common parameters that experimenters try to estimate include population means, standard deviations, and proportions. Confidence intervals are interval estimates of these parameters.

 

What Is a Confidence Interval?

Sample statistics (or point estimates), such as the mean, standard deviation, or proportion, are used to make inferences about a population based on a random sample from that population. A point estimate is unlikely to equal the population parameter exactly, but it should be close. A confidence interval is a range around the point estimate, built by a procedure that captures the population parameter with a specified probability, typically 0.95 for a 95% confidence interval. It is more informative than the point estimate alone because it indicates the range in which the population parameter plausibly lies.
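To make the idea of a confidence level concrete, here is a minimal simulation sketch in Python. It is not part of STATISTICA; the function names, the true proportion of 0.15, the sample size of 500, and the random seed are all assumptions chosen for illustration. It draws many random samples, builds the normal-approximation interval for a proportion (the formula discussed later in this article) from each one, and reports how often the interval contains the true proportion.

```python
import math
import random


def wald_interval(successes, n, z=1.96):
    """Normal-approximation interval for a proportion (z = 1.96 for about 95%)."""
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(true_p=0.15, n=500, reps=2000, seed=1):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Each subject is a "success" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_interval(successes, n)
        hits += lower < true_p < upper
    return hits / reps


print(coverage())
```

The printed fraction should land near 0.95, which is what "95% confidence" refers to: the long-run success rate of the procedure, not a statement about any one interval.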

Confidence Intervals for Single Means and Standard Deviations in STATISTICA

In STATISTICA, you can use the Descriptive Statistics analysis available via the Basic Statistics module to find confidence intervals for a single mean or single standard deviation. To access this analysis, first open a data file, and then select the Statistics tab. In the Base group, click Basic Statistics.

In the Basic Statistics and Tables Startup Panel, select Descriptive Statistics and click OK to display the Descriptive Statistics dialog box. The options for the confidence intervals for the mean and standard deviation are on the Advanced tab. You can specify the confidence level for each via the respective Interval edit box.

[Screenshot: STATISTICA Interval edit boxes]

Then click the Summary button to produce the requested statistics, which include these confidence intervals.

Using STATISTICA to Find a Confidence Interval for a Single Proportion

The Descriptive Statistics analysis is useful for summarizing continuous data. Proportions, however, are based on counts rather than continuous measurements. Tools such as Frequency Tables and Tables and Banners can find proportions, and you can find a confidence interval for a single proportion using the Power Analysis module. This module is often used to calculate statistical power for a given analysis or the sample size required to attain a certain power level, but it can also calculate, for a given analysis type, specialized confidence intervals that are not generally available in general-purpose statistical packages.

Confidence Interval for a Single Proportion Example

In this example, researchers took a sample of 500 randomly selected subjects who completed four years of college. They found that 75 of them smoked on a regular basis. Thus, the sample proportion (often designated as p̂) of people who smoked and had a four-year college education is 75/500 = 0.15 (or 15%). If we wanted an estimate of the true proportion (usually designated as p) of people with a four-year education who smoke, we could construct a confidence interval for the proportion.

The simplest and most commonly used formula for this type of confidence interval relies on approximating the binomial distribution with a normal distribution (the number of smokers in the sample is binomial because each person sampled either smoked or did not smoke). The formula is:

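\[
\hat{p} - z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;<\; p \;<\; \hat{p} + z_{1-\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
\]

where p̂ is the sample proportion, n is the sample size, and $z_{1-\alpha/2}$ is the 1-α/2 quantile of the standard normal distribution; α is the Type I error rate and is the complement of the confidence level. Thus, for a 95% confidence level, α is 5% or 0.05.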

This z-score can be calculated within STATISTICA. On the Statistics tab in the Base group, click Basic Statistics to display the Basic Statistics and Tables Startup Panel. Select Probability calculator.

[Screenshot: STATISTICA Probability calculator option]

Click OK to display the Probability Distribution Calculator.

In the Distribution field, select Z (Normal). Select the Inverse, Two-tailed, and (1-cumulative p) check boxes. We are using α = 0.05, so enter this value for p. Click the Compute button to calculate the z critical value (which is given in the X edit field). It is found to be 1.959964, which is commonly rounded to 1.96.

[Screenshot: STATISTICA Probability Distribution Calculator]
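For readers without STATISTICA at hand, the same critical value can be cross-checked with a short Python sketch. It uses only the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # complement of the 95% confidence level

# Two-tailed critical value: the 1 - alpha/2 quantile of the standard normal.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(round(z_crit, 6))  # 1.959964
```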

Thus, the confidence interval for the true proportion is

0.15 - 1.96*sqrt[(0.15)(0.85)/500] < p < 0.15 + 1.96*sqrt[(0.15)(0.85)/500]

which gives 0.11870131 < p < 0.18129869.
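The hand calculation can be reproduced with a small Python sketch. The function name `wald_interval` and its parameters are hypothetical, chosen for this illustration; it implements only the normal-approximation formula given earlier.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, conf_level=0.95, z=None):
    """Normal-approximation confidence interval for a single proportion."""
    p_hat = successes / n
    if z is None:
        # Use the unrounded critical value unless one is supplied.
        z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Rounded critical value, as in the hand calculation.
print(wald_interval(75, 500, z=1.96))

# Unrounded critical value.
print(wald_interval(75, 500))
```

The first call should reproduce the hand-calculated limits up to floating-point rounding; the second differs from it only slightly because it does not round the critical value to 1.96.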

Finding the Confidence Interval in STATISTICA

As previously mentioned, we can find this same confidence interval for a single proportion using the Power Analysis module in STATISTICA.

With any data file opened, select the Statistics tab. In the Advanced/Multivariate group, click Power Analysis. In the Power Analysis and Interval Estimation Startup Panel, select Interval Estimation as the analysis category, and then select One Proportion, Z, Chi-Square Test as the analysis type.

[Screenshot: STATISTICA Power Analysis and Interval Estimation Startup Panel]

Click OK.

In the Single Proportion: Interval Estimation dialog box, enter 0.15 for Observed Proportion p, 500 for Sample Size (N), and 0.95 for Conf. Level.

[Screenshot: STATISTICA Single Proportion: Interval Estimation dialog box]

Click Compute to calculate the confidence interval.

[Screenshot: STATISTICA Interval Estimation results spreadsheet]

The Pi (Crude) results should match what was calculated earlier by hand, as these are the estimates that use the normal approximation to the binomial distribution. The hand calculations may be off slightly because the z critical value was rounded to 1.96; STATISTICA carries more decimal places for better accuracy, and in this example the difference appears only around the sixth decimal place.

The results in the Interval Estimation spreadsheet also include two other confidence intervals for the proportion: Pi (Exact), the "exact" Clopper-Pearson interval, and Pi (Approximate), which uses a score method with a continuity correction. For more information on how these two methods are computed, see methods 4 and 5 in Robert Newcombe's paper, "Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods" (1998, Statistics in Medicine, 17, 857-872).
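As a rough illustration of what an "exact" Clopper-Pearson interval involves, the following Python sketch finds the two limits by bisection on binomial tail probabilities. It is an assumption-laden sketch and not STATISTICA's own algorithm; the function names `binom_cdf` and `clopper_pearson` are hypothetical, and only the standard library is used.

```python
import math


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), with terms computed in log space."""
    if x < 0:
        return 0.0
    if x >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(x + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


def clopper_pearson(x, n, conf_level=0.95):
    """Clopper-Pearson limits: each tail probability equals alpha/2."""
    tail = (1 - conf_level) / 2

    def solve(f, increasing):
        # Bisection on [0, 1] for the p at which f(p) equals the tail area.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if (f(mid) > tail) == increasing:
                hi = mid
            else:
                lo = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= x | p) = alpha/2, which increases with p.
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p), True)
    # Upper limit: P(X <= x | p) = alpha/2, which decreases with p.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), False)
    return lower, upper


print(clopper_pearson(75, 500))
```

Unlike the normal-approximation interval, this interval is generally not symmetric around the sample proportion.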

Conclusion

Sometimes a researcher wants to estimate the true proportion of a population of interest by finding the confidence interval for that proportion. In STATISTICA, the Power Analysis module provides the means to find this estimate.

About statsoftsa

StatSoft, Inc. was founded in 1984 and is now one of the largest global providers of analytic software worldwide. StatSoft is also the largest manufacturer of enterprise-wide quality control and improvement software systems in the world, and the only company capable of supporting its QC products worldwide, with wholly owned subsidiaries in all major markets (StatSoft has 23 full-service offices, on all continents), and its software is available in more than 10 languages.

Posted on October 16, 2012.
