Sampling error is a statistical error resulting from estimating a parameter (e.g., the mean of a variable of interest) from a sample rather than the whole population. Different samples produce different estimates of the unknown parameter; the difference between an estimate and the true value is the sampling error. This terminology applies to parameter estimation in the usual (frequentist) paradigm, which assumes the existence of a true, underlying parameter value.
In general, the true parameter value, and thus the magnitude of the sampling error, is unknown. However, if the observed data represent a random sample from the population, the sampling error associated with a particular estimate (e.g., a mean, a proportion, or a difference between means) can be predicted from the relevant theoretical sampling distribution, using the observed sample standard deviation and the sample size.
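As a minimal sketch of that prediction, the standard error of the mean (sample standard deviation divided by the square root of the sample size) estimates the typical magnitude of the sampling error from a single sample. The measurements below are made-up illustrative values, not real data:

```python
import math

# Hypothetical sample of n measurements (illustrative values, not real data)
sample = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.7, 5.0, 4.9]
n = len(sample)

mean = sum(sample) / n
# Sample standard deviation (n - 1 in the denominator)
sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))

# Standard error of the mean: the predicted typical size of the
# sampling error for this estimate of the population mean
se = sd / math.sqrt(n)
print(round(mean, 3), round(se, 3))  # 5.0 0.058
```

Even without knowing the true population mean, the standard error tells us roughly how far from it this sample mean is likely to fall.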
Holliday, E. (2014). Sampling Error. In: Michalos, A.C. (ed.) Encyclopedia of Quality of Life and Well-Being Research. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-0753-5_2554

Sampling error is the difference between a sample statistic and the population parameter it estimates. It is a crucial consideration in inferential statistics, where you use a sample to estimate the properties of an entire population. For example, suppose you gather a random sample of adult women in the United States, measure their heights, and obtain an average of 5' 4" (1.63 m). The sample mean (x̄) estimates the population mean (μ). However, it's virtually guaranteed that the sample mean doesn't equal the population parameter exactly. That difference is sampling error.

The preceding example illustrates sampling error when a sample estimates the mean, but the same principles apply to other types of estimates, such as proportions, effect sizes, correlations, and regression coefficients. In this post, learn what constitutes an acceptable sampling error, the factors that contribute to it, how to minimize it, and the statistical tools that evaluate it.

Related post: Parameters vs. Statistics

It's Unavoidable, but Knowledge Helps Minimize It

There are tremendous benefits to working with samples. For one thing, it's usually impossible to measure an entire population because populations tend to be extremely large. Consequently, samples are the only way for most research even to proceed. Samples allow you to obtain a practical dataset at reasonable cost in a realistic timeframe.

Unfortunately, sampling error is an inherent consideration when using samples. Even when researchers conduct their study perfectly, they can't avoid some degree of sampling error. Why not?
Randomness alone guarantees that your sample cannot be 100% representative of the population. Chance inevitably causes some error because the probability of drawing a sample that exactly matches the population value is practically zero. Additionally, a sample can never provide a perfect depiction of the population with all its nuances because it is not the entire population; samples are typically a tiny percentage of the whole.

The only way to prevent sampling error is to measure the entire population. Barring that approach, researchers can take steps to understand and minimize it. Given the inevitability of some sampling error in most studies, the question becomes: how close are the sample estimates likely to be to the correct population values? The best studies tend to have low amounts of sampling error, while subpar studies have more. Let's start by breaking down the properties of acceptable sampling error. Then we'll move on to managing its sources.

Related post: Sample Statistics are Always Wrong (to Some Extent)!

Properties of Acceptable Sampling Error

In inferential statistics, the goal is to obtain a random sample from a population and use it to estimate the attributes of that population. Sample statistics are estimates of the relationships and effects in the population. Sampling error always occurs, so we have to live with it. But what do statisticians consider acceptable? In a nutshell, sampling error should be unbiased and small. Let's explore these characteristics using sampling distributions.

A key concept of inferential statistics is that the sample a researcher draws is only one of an infinite number of samples they could have drawn. Imagine we repeat a study many times: we collect many random samples from the same population and calculate each sample's estimate. We then graph the distribution of those estimates. Statisticians refer to this as a sampling distribution.
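The repeated-study thought experiment is easy to simulate. The sketch below assumes a hypothetical population of heights that is normal with mean 64 inches and standard deviation 3 inches (made-up parameters), draws many random samples, and collects each sample's mean:

```python
import random
import statistics

random.seed(42)

# Hypothetical population: assume heights are normal, mean 64 in, SD 3 in
POP_MEAN, POP_SD = 64.0, 3.0

def sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))

# "Repeat the study" 5,000 times to approximate the sampling distribution
estimates = [sample_mean(25) for _ in range(5000)]

# Each estimate differs from the true mean; that difference is sampling error
print(round(statistics.fmean(estimates), 2))  # close to 64 (centered correctly)
print(round(statistics.stdev(estimates), 2))  # close to 3 / sqrt(25) = 0.6
```

The spread of `estimates` is a picture of the sampling distribution: its center tells us about bias, and its width tells us about precision, which the next sections take up in turn.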
The concepts of bias and precision for sampling error relate to the tendency of multiple samples to center on the proper value and to cluster tightly around it. Learn more about Sampling Distributions.

Related posts: Inferential vs. Descriptive Statistics and Populations, Parameters, and Samples in Inferential Statistics

Unbiased Sampling Error

Unbiased sampling error tends to be right on target. While the sample estimates won't be exactly right, they should not be systematically too high or too low. The average, or expected value, of multiple attempts should equal the population value. Statisticians refer to this property of being correct on average as unbiased.

In the graph below, the population value is the target that the distribution should center on to be unbiased. The curve on the right centers on a value that is too high. The methodology behind that study tends to overestimate the population parameter, which is a positive bias; it is not correct on average. Statisticians refer to this problem as sampling bias. The left-hand curve, however, centers on the correct value. That study's procedures yield sample statistics that are correct on average, so it's unbiased: the expected value is the real population value.

Please note that larger sample sizes do not reduce bias. When a methodology produces biased results, a larger sample size simply produces a greater number of biased values. Learn about Sampling Bias.

Sampling Error and Precision

Recognizing that sample statistics are rarely exactly correct, you want to minimize the difference between the estimate and the population parameter. Large differences are bad! Precision in statistics assesses how close you can expect your estimate to be to the correct population value. When your study has low sampling error, it produces precise estimates that you can confidently expect to be close to the population value. That's a better position than having a high amount of error, which produces imprecise estimates.
In that scenario, you know your estimate is likely to be wrong by a significant amount! Sampling distributions represent sampling error and precision through the width of the curves. Tighter distributions represent lower error and more precise estimates because the estimates cluster more closely around the population value. Conversely, broad distributions indicate lower precision because estimates tend to fall further from the correct value.

In the graph, both curves center on the correct population value, indicating that both are unbiased. That's good. However, the red curve is broader than the blue curve because it has more sampling error. Its estimates tend to fall further from the population value than the blue curve's. That's not good. We want our estimates to be close to the actual population value. Relatively precise estimates cluster more tightly around the parameter value, which you can see in the blue curve. Unlike with biased results, increasing the sample size reduces the amount of sampling error and increases precision. We'll come back to that!

Sources of Sampling Error

As you saw above, you can understand sampling error through the bias and precision of sample statistics. Some sources of sampling error tend to produce bias, while others affect precision.

Sources of Bias

Biases in sampling error frequently occur when the sample or the measurements do not accurately represent the population. These problems cause the sample statistics to be systematically higher or lower than the correct population values. The leading causes of bias relate to the study's procedures. There are no statistical measures that assess bias: a sample's properties cannot tell you whether the sample itself is biased. Instead, you must look at the study's methods and procedures to determine whether they are likely to introduce bias. Below are some of the top causes of sampling error bias:
Factors that Affect Precision

Random sampling error refers to chance differences between a random sample and the population. It excludes the biases discussed above. This type of error affects the estimate's precision. Two key factors affect random sampling error: population variability and sample size.
Of these two factors, researchers usually have less control over the variability because it is an inherent property of the population. However, they can collect larger samples. Consequently, increasing the sample size is the critical method for reducing random sampling error. Unlike bias, random sampling error can be evaluated with statistical measures and incorporated into various inferential procedures. To see how random sampling error works mathematically, read my post about the Standard Error, which I describe as the gateway from descriptive to inferential statistics. Or read about the Law of Large Numbers, a more conceptual approach to how larger samples lead to more precise estimates.

Statistical Methods that Evaluate Random Sampling Error

Inferential statistics are procedures that use sample data to draw conclusions about populations. To do so, they must incorporate sampling error into their calculations. For example, imagine you're studying the effectiveness of a new medication and find that it improves the health outcome in the treatment group by 10% relative to the control group. Does that effect exist in the population, or is the sample difference due to random sampling error? Inferential procedures can help make that determination. I'll summarize several broad types, but please click the links to learn more about them.
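As one concrete sketch of how such a procedure folds random sampling error into a conclusion, here is a hand-rolled 95% confidence interval for the medication example above, using made-up counts (120 of 200 treated patients improve vs. 100 of 200 controls, a 10-percentage-point difference):

```python
import math

# Hypothetical trial results (illustrative numbers, not real data):
# 120 of 200 treated patients improved vs. 100 of 200 controls.
x_t, n_t = 120, 200
x_c, n_c = 100, 200

p_t, p_c = x_t / n_t, x_c / n_c
diff = p_t - p_c                      # observed effect: 0.10 (10 points)

# Standard error of the difference in proportions: the expected
# magnitude of random sampling error for this estimate
se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)

# 95% confidence interval: the range of population differences
# consistent with this sample, given random sampling error
lo, hi = diff - 1.96 * se, diff + 1.96 * se
print(round(lo, 3), round(hi, 3))  # 0.003 0.197
```

Because the whole interval sits above zero, random sampling error alone is an unlikely explanation for the observed 10-point difference, which is the kind of determination these procedures exist to make.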
Please note that these procedures evaluate only random sampling error. They cannot detect bias and, in fact, assume there is none. Consequently, the presence of bias invalidates their results. Sampling error is unavoidable when you're working with samples. However, you can minimize it and incorporate it into your results.

What happens to the sampling error as the sample size decreases?

Sampling error is affected by a number of factors, including the sample size, the sample design, the sampling fraction, and the variability within the population. In general, larger sample sizes decrease the sampling error; however, the decrease is not directly proportional to the sample size.
What happens to the error when the sample size increases?

Sampling error can be reduced by increasing the sample size. As the sample size increases, the sample gets closer to the actual population, which decreases the potential for deviations from the actual population values.
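The "not directly proportional" point is visible in the standard-error formula itself: the error shrinks with the square root of the sample size. A quick sketch, assuming a hypothetical population standard deviation of 3:

```python
import math

POP_SD = 3.0  # assumed population standard deviation (hypothetical)

# Standard error of the mean for several sample sizes. Quadrupling n
# only halves the error, so shrinking error gets expensive quickly.
errors = {n: POP_SD / math.sqrt(n) for n in (25, 100, 400)}
for n, se in errors.items():
    print(n, round(se, 2))  # 25 0.6 / 100 0.3 / 400 0.15
```

Going from n = 25 to n = 400 is sixteen times the data-collection effort but only a fourfold reduction in random sampling error.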