Does the population have to be normally distributed to test this hypothesis? Why?

Yes, you can use a t-test, for precisely the reason you give: even if the underlying population is not normally distributed, the sample mean (or, more precisely, the difference between the two sample means) is asymptotically normal. (This requires some conditions on the underlying populations, such as finite variance, that are usually satisfied in the real world, and certainly for uniform distributions.)

Let's illustrate with a simulation (R code): we consider two populations, one $U[0,10]$ and the other $U[0.5,10.5]$, and a total sample size of 1000, half from each population. Here is a sample and a t-test:

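nn <- 1000  # total sample size, split evenly between the two populations

# Samplers for the two populations: U[0,10] and U[0.5,10.5]
draw_1 <- function(n) runif(n,0,10)
draw_2 <- function(n) runif(n,0.5,10.5)

set.seed(1)  # fix the seed so the sample is reproducible
sample_1 <- draw_1(nn/2)
sample_2 <- draw_2(nn/2)

# Welch two-sample t-test (R's default for two samples)
t.test(sample_1,sample_2)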

which yields

        Welch Two Sample t-test

data:  sample_1 and sample_2
t = -3.1827, df = 996.74, p-value = 0.001504
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.9387957 -0.2226748
sample estimates:
mean of x mean of y 
 4.956549  5.537284

Now, to see that the difference in means is normal enough, we simulate drawing samples and calculating means many times:

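# Repeat the experiment 10,000 times, keeping the difference in sample means
means <- replicate(1e4,{
    sample_1 <- draw_1(nn/2)
    sample_2 <- draw_2(nn/2)
    mean(sample_2)-mean(sample_1)})

# The histogram should look roughly bell-shaped, centered near 0.5
hist(means)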

Of course, this difference is not really normal (for one, it's bounded between -9.5 and 10.5, whereas the normal distribution is unbounded), but it's normal "enough" for the t test to work.
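One way to probe the "normal enough" claim is to compare the simulated spread of the difference in means with its theoretical value, and to see how often the t test rejects when both samples actually come from the same population. The following is a minimal sketch under assumptions that are not part of the original answer: the seed, the 2000 replications, and the names `sd_theory`, `sd_sim`, and `reject_rate` are illustrative choices.

```r
set.seed(42)          # arbitrary seed, chosen only for reproducibility
n_per_group <- 500    # same per-group size as in the example above

# Theoretical SD of the difference in means for two uniforms of width 10:
# each population has variance 10^2 / 12, and each mean uses 500 draws.
sd_theory <- sqrt(2 * (10^2 / 12) / n_per_group)

# Simulated SD of the difference in means
diffs <- replicate(2000, mean(runif(n_per_group, 0.5, 10.5)) -
                         mean(runif(n_per_group, 0, 10)))
sd_sim <- sd(diffs)

# Share of t tests that reject at the 5% level when both samples are
# drawn from the same U[0,10] population (the null hypothesis is true)
reject_rate <- mean(replicate(2000,
  t.test(runif(n_per_group, 0, 10),
         runif(n_per_group, 0, 10))$p.value < 0.05))

c(sd_theory = sd_theory, sd_sim = sd_sim, reject_rate = reject_rate)
```

If the normal approximation is adequate, `sd_sim` should land close to `sd_theory`, and `reject_rate` should be close to the nominal 0.05.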

Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. We perform tests of a population mean using a normal distribution or a Student's $t$-distribution. (Remember, use a Student's $t$-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.) We perform tests of a population proportion using a normal distribution (usually the sample size $n$ is large).

If you are testing a single population mean, the distribution for the test is for means:

\[\bar{X} \sim N\left(\mu_{x}, \frac{\sigma_{x}}{\sqrt{n}}\right)\]

or

\[t_{df}\]

The population parameter is $\mu$. The estimated value (point estimate) for $\mu$ is $\bar{x}$, the sample mean.
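As an illustration of a test for a single mean with unknown population standard deviation, the sketch below computes the $t$ statistic by hand and compares it with R's built-in `t.test`. The data, the seed, and the hypothesized mean `mu0` are made up for this example and do not come from the text.

```r
set.seed(1)                            # arbitrary seed
x   <- rnorm(25, mean = 10.4, sd = 2)  # hypothetical sample of 25 values
mu0 <- 10                              # hypothesized population mean (assumed)

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # point estimate minus mu0, over its standard error
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)      # two-sided p-value from t with n - 1 df

res <- t.test(x, mu = mu0)             # built-in one-sample t test

# Both comparisons should return TRUE
all.equal(unname(res$statistic), t_stat)
all.equal(res$p.value, p_val)
```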

If you are testing a single population proportion, the distribution for the test is for proportions or percentages:

\[P' \sim N\left(p, \sqrt{\frac{pq}{n}}\right)\]

The population parameter is $p$. The estimated value (point estimate) for $p$ is $p'$, where $p' = \frac{x}{n}$, $x$ is the number of successes, and $n$ is the sample size.
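A small sketch of a test for a single proportion based on the normal distribution may help here. The counts (58 successes in 100 trials) and the hypothesized proportion `p0` are invented for illustration. The comparison with `prop.test` relies on the fact that, without continuity correction, its chi-squared statistic equals the square of the $z$ statistic.

```r
x  <- 58     # hypothetical number of successes
n  <- 100    # hypothetical sample size
p0 <- 0.5    # hypothesized population proportion (assumed)

p_hat <- x / n                           # point estimate p'
se    <- sqrt(p0 * (1 - p0) / n)         # sqrt(pq / n) under the null hypothesis
z     <- (p_hat - p0) / se               # test statistic
p_val <- 2 * pnorm(-abs(z))              # two-sided p-value

# Built-in test without continuity correction, for comparison
prop_res <- prop.test(x, n, p = p0, correct = FALSE)

# Both comparisons should return TRUE
all.equal(unname(prop_res$statistic), z^2)
all.equal(prop_res$p.value, p_val)
```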

Assumptions

When you perform a hypothesis test of a single population mean $\mu$ using a Student's $t$-distribution (often called a $t$-test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed. You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a $t$-test will work even if the population is not approximately normally distributed.)

When you perform a hypothesis test of a single population mean $\mu$ using a normal distribution (often called a $z$-test), you take a simple random sample from the population. The population you are testing is normally distributed or your sample size is sufficiently large. You know the value of the population standard deviation which, in reality, is rarely known.
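For contrast with the $t$-test, a $z$-test can be written out by hand in a few lines. This is only a sketch: the sample, the seed, the "known" population standard deviation `sigma`, and the hypothesized mean `mu0` are all assumptions made for the example.

```r
set.seed(2)                               # arbitrary seed
sigma <- 3                                # population SD, treated as known (assumed)
mu0   <- 50                               # hypothesized population mean (assumed)
x     <- rnorm(40, mean = 51, sd = sigma) # hypothetical sample of 40 values

n     <- length(x)
z     <- (mean(x) - mu0) / (sigma / sqrt(n))  # uses sigma, not the sample SD
p_val <- 2 * pnorm(-abs(z))                   # two-sided p-value from the standard normal

c(z = z, p_value = p_val)
```

The only difference from the $t$-test computation is that the known `sigma` replaces the sample standard deviation and the reference distribution is the standard normal.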

When you perform a hypothesis test of a single population proportion $p$, you take a simple random sample from the population. You must meet the conditions for a binomial distribution, which are: there are a certain number $n$ of independent trials, the outcomes of any trial are success or failure, and each trial has the same probability of a success $p$. The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities $np$ and $nq$ must both be greater than five ($np > 5$ and $nq > 5$). Then the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with $\mu = p$ and $\sigma = \sqrt{\frac{pq}{n}}$. Remember that $q = 1 - p$.
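The rule of thumb above is easy to check in code. The helper `normal_approx_ok` below is a hypothetical name introduced for this sketch, and the values of `n` and `p` are illustrative. The second half compares an exact binomial probability with the normal approximation for the sample proportion, without any continuity correction.

```r
# Hypothetical helper: TRUE when both np and nq exceed 5
normal_approx_ok <- function(n, p) {
  q <- 1 - p
  n * p > 5 && n * q > 5
}

normal_approx_ok(100, 0.30)   # np = 30, nq = 70
normal_approx_ok(20, 0.10)    # np = 2,  nq = 18

# Exact versus approximate P(p' <= 0.35) for n = 100, p = 0.30
n <- 100
p <- 0.30
q <- 1 - p
exact  <- pbinom(35, size = n, prob = p)             # P(X <= 35), exact binomial
approx <- pnorm(0.35, mean = p, sd = sqrt(p * q / n)) # normal with mu = p, sigma = sqrt(pq/n)

c(exact = exact, approx = approx)
```

When the condition holds, the two probabilities should be reasonably close; the remaining gap is largely due to the discreteness of the binomial.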

Summary

In order for a hypothesis test’s results to be generalized to a population, certain requirements must be satisfied.

When testing for a single population mean:

A Student's $t$-test should be used if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with an unknown standard deviation.
The normal test will work if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with a known standard deviation.

When testing a single population proportion, use a normal test if the data come from a simple random sample, meet the requirements for a binomial distribution, and the mean number of successes and the mean number of failures satisfy the conditions $np > 5$ and $nq > 5$, where $n$ is the sample size, $p$ is the probability of a success, and $q$ is the probability of a failure.

Formula Review

If there is no given preconceived $\alpha$, then use $\alpha = 0.05$.

Types of Hypothesis Tests

Single population mean, known population variance (or standard deviation): Normal test.
Single population mean, unknown population variance (or standard deviation): Student's $t$-test.
Single population proportion: Normal test.
For a single population mean, we may use a normal distribution with the following mean and standard deviation. Means: $\mu = \mu_{\bar{x}}$ and $\sigma_{\bar{x}} = \frac{\sigma_{x}}{\sqrt{n}}$.
For a single population proportion, we may use a normal distribution with the following mean and standard deviation. Proportions: $\mu = p$ and $\sigma = \sqrt{\frac{pq}{n}}$.

Glossary

Binomial Distribution: a discrete random variable (RV) that arises from Bernoulli trials. There are a fixed number, $n$, of independent trials. "Independent" means that the result of any trial (for example, trial 1) does not affect the results of the following trials, and all trials are conducted under the same conditions. Under these circumstances the binomial RV $X$ is defined as the number of successes in $n$ trials. The notation is $X \sim B(n, p)$. The mean is $\mu = np$ and the standard deviation is $\sigma = \sqrt{npq}$. The probability of exactly $x$ successes in $n$ trials is $P(X = x) = \binom{n}{x} p^{x}q^{n-x}$.

Normal Distribution: a continuous random variable (RV) with pdf $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{\frac{-(x-\mu)^{2}}{2\sigma^{2}}}$, where $\mu$ is the mean of the distribution and $\sigma$ is the standard deviation; notation: $X \sim N(\mu, \sigma)$. If $\mu = 0$ and $\sigma = 1$, the RV is called the standard normal distribution.

Standard Deviation: a number that is equal to the square root of the variance and measures how far data values are from their mean; notation: $s$ for sample standard deviation and $\sigma$ for population standard deviation.

Student's t-Distribution: investigated and reported by William S. Gosset in 1908 and published under the pseudonym Student. The major characteristics of the random variable (RV) are:

It is continuous and assumes any real values.
The pdf is symmetrical about its mean of zero. However, it is more spread out and flatter at the apex than the normal distribution.
It approaches the standard normal distribution as $n$ gets larger.
There is a "family" of $t$-distributions: every representative of the family is completely defined by the number of degrees of freedom which is one less than the number of data items.
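The way the $t$-distribution approaches the standard normal can be seen by comparing critical values. The sketch below tabulates the two-sided 5% critical value for a few degrees of freedom; the particular `df` values are arbitrary choices for illustration.

```r
df <- c(2, 5, 10, 30, 100, 1000)   # illustrative degrees of freedom

crit <- data.frame(
  df     = df,
  t_crit = qt(0.975, df),          # upper 2.5% point of t with df degrees of freedom
  z_crit = qnorm(0.975)            # upper 2.5% point of the standard normal
)
crit$gap <- crit$t_crit - crit$z_crit   # how much wider the t cutoff is

crit
```

The `gap` column stays positive, reflecting the heavier tails of the $t$-distribution, and shrinks toward zero as the degrees of freedom grow.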

Does the population have to be normally distributed to test a hypothesis?

Only when the sample is small. For a hypothesis test of a single population mean using a Student's t-distribution, the data should come from a simple random sample, and the population should be approximately normally distributed unless the sample size is sufficiently large.

Does the population need to be normally distributed?

No, because the Central Limit Theorem states that, regardless of the shape of the underlying population, the sampling distribution of $\bar{x}$ becomes approximately normal as the sample size $n$ increases.
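A quick simulation can make this concrete. The sketch below draws from an exponential population, which is strongly right-skewed, and measures how skewed the sample means are for several sample sizes. The choice of population, the seed, the sample sizes, and the helper `skewness` are all assumptions made for this illustration.

```r
set.seed(3)   # arbitrary seed

# Simple moment-based skewness (hypothetical helper; roughly 0 for symmetric data)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

sizes <- c(2, 10, 50, 200)   # illustrative sample sizes

# For each n, simulate 5000 sample means from an Exp(1) population
skew_of_means <- sapply(sizes, function(n)
  skewness(replicate(5000, mean(rexp(n)))))

data.frame(n = sizes, skewness = skew_of_means)
```

The skewness of the sample means should fall toward zero as `n` grows, which is what approaching a normal shape looks like in this summary.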

What is normal distribution in hypothesis testing?

A normality test is a hypothesis test that formally checks whether the population the sample represents is normally distributed. The null hypothesis states that the population is normally distributed; the alternative hypothesis states that it is not.
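As one possible example of such a test, R provides the Shapiro-Wilk test as `shapiro.test`. The text does not name a specific test, so this choice, along with the seed and the two simulated samples, is an assumption made for illustration.

```r
set.seed(4)                     # arbitrary seed

normal_sample  <- rnorm(200)    # drawn from a normal population
uniform_sample <- runif(200)    # drawn from a uniform (non-normal) population

# Null hypothesis in both cases: the population is normally distributed
p_normal  <- shapiro.test(normal_sample)$p.value
p_uniform <- shapiro.test(uniform_sample)$p.value

c(p_normal = p_normal, p_uniform = p_uniform)
```

A small p-value is evidence against normality, so the uniform sample would be expected to give a much smaller p-value than the normal one.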

How a hypothesis test using a t

Like a standard normal distribution (or z-distribution), the t-distribution has a mean of zero. A test based on the normal distribution assumes that the population standard deviation is known; a test based on the t-distribution does not, and uses the sample standard deviation instead. The shape of the t-distribution is determined by its degrees of freedom.
