What is the critical value for a one tailed hypothesis test in which a null hypothesis is tested at the 5% level of significance based on a sample size of 25?

Introduction to Inferential Statistics

Oliver C. Ibe, in Fundamentals of Applied Probability and Random Processes (Second Edition), 2014

9.4.3 One-Tailed and Two-Tailed Tests

Hypothesis tests are classified as either one-tailed (also called one-sided) tests or two-tailed (also called two-sided) tests. One-tailed tests are concerned with one side of a statistic, such as “the mean is greater than 10” or “the mean is less than 10.” Thus, one-tailed tests deal with only one tail of the distribution, and the z-score is on only one side of the statistic.

Two-tailed tests deal with both tails of the distribution, and the z-score is on both sides of the statistic. For example, Figure 9.4 illustrates a two-tailed test. A hypothesis like “the mean is not equal to 10” involves a two-tailed test because the claim is that the mean can be less than 10 or it can be greater than 10. Table 9.3 shows the critical values, zα, for both the one-tailed test and the two-tailed test in tests involving the normal distribution.

Table 9.3. Critical Points for Different Levels of Significance

Level of Significance (α)   0.10               0.05               0.01              0.005             0.002
zα for 1-Tailed Tests       −1.28 or 1.28      −1.645 or 1.645    −2.33 or 2.33     −2.58 or 2.58     −2.88 or 2.88
zα for 2-Tailed Tests       −1.645 and 1.645   −1.96 and 1.96     −2.58 and 2.58    −2.81 and 2.81    −3.08 and 3.08
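The entries in Table 9.3 can be reproduced with the inverse standard normal CDF; a minimal sketch using scipy (the α values are those in the table):

```python
# Reproduce Table 9.3: critical z-values for one- and two-tailed tests.
from scipy.stats import norm

for alpha in [0.10, 0.05, 0.01, 0.005, 0.002]:
    z_one = norm.ppf(1 - alpha)      # one-tailed: all of alpha in one tail
    z_two = norm.ppf(1 - alpha / 2)  # two-tailed: alpha/2 in each tail
    print(f"alpha={alpha}: one-tailed ±{z_one:.3f}, two-tailed ±{z_two:.3f}")
```

Note that the two-tailed value for a given α equals the one-tailed value for α/2, which is why the two rows of the table are shifted versions of each other.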

In a one-tailed test, the area under the rejection region is equal to the level of significance, α. Also, the rejection region can be below (i.e., to the left of) the acceptance region or beyond (i.e., to the right of) the acceptance region depending on how H1 is formulated. When the rejection region is below the acceptance region, we say that it is a left-tail test. Similarly, when the rejection region is above the acceptance region, we say that it is a right-tail test.

In the two-tailed test, there are two critical regions, and the area under each region is α/2. As stated earlier, the two-tailed test is illustrated in Figure 9.4. Figure 9.5 illustrates the rejection region that is beyond the acceptance region for the one-tailed test, or more specifically the right-tail test.

Figure 9.5. Critical Region for One-Tailed Tests

Note that in a one-tailed test, when H1 involves values that are greater than μX, we have a right-tail test. Similarly, when H1 involves values that are less than μX, we have a left-tail test. For example, an alternative hypothesis of the type H1 : μX > 100 is a right-tail test while an alternative hypothesis of the type H1 : μX < 100 is a left-tail test. Figure 9.6 is a summary of the different types of tests. In the figure, μ0 is the current value of the parameter.

Figure 9.6. Summary of the Different Tests

Example 9.9

The mean lifetime E[X] of the light bulbs produced by Lighting Systems Corporation is 1570 hours with a standard deviation of 120 hours. The president of the company claims that a new production process has led to an increase in the mean lifetimes of the light bulbs. If Joe tested 100 light bulbs made from the new production process and found that their mean lifetime is 1600 hours, test the hypothesis that E[X] is not equal to 1570 hours using a level of significance of (a) 0.05 and (b) 0.01.

Solution:

The null hypothesis is

H0: μX = 1570 hours

Similarly, the alternative hypothesis is

H1: μX ≠ 1570 hours

Since μX ≠ 1570 includes numbers that are both greater than and less than 1570, this is a two-tailed test. From the available data, the normalized value of the sample mean is

z = (X̄ − μX)/σX̄ = (X̄ − μX)/(σX/√n) = (1600 − 1570)/(120/√100) = 30/12 = 2.50

a.

At a level of significance of 0.05, zα = − 1.96 and zα = 1.96 for a two-tailed test. Thus, our acceptance region is [− 1.96, 1.96] of the standard normal distribution. The rejection and acceptance regions are illustrated in Figure 9.7.

Figure 9.7. Critical Region for Problem 9.9(a)

Since z = 2.50 lies outside the range [− 1.96, 1.96] (that is, it is in a rejection region), we reject H0 at the 0.05 level of significance and accept H1, which means that the difference in mean lifetimes is statistically significant.

b.

At the 0.01 level of significance, zα = − 2.58 and zα = 2.58. The acceptance and rejection regions are shown in Figure 9.8. Since z = 2.50 lies within the range [− 2.58, 2.58], which is the acceptance region, we accept H0 at the 0.01 level of significance, which means that the difference in mean lifetimes is not statistically significant.

Figure 9.8. Critical Region for Problem 9.9(b)

Example 9.10

For Example 9.9, test the hypothesis that the new mean lifetime is greater than 1570 hours using a level of significance of (a) 0.05 and (b) 0.01.

Solution:

Here we define the null hypothesis and alternative hypothesis as follows:

H0: μX = 1570 hours
H1: μX > 1570 hours

This is a one-tailed test. Since the z-score is the same as in Example 9.9, we only need to find the confidence limits for the two cases.

a.

Because H1 is concerned with values that are greater than 1570, we have a right-tail test, which means that we choose the rejection region that is above the acceptance region. Therefore, we choose zα = 1.645 for the 0.05 level of significance in Table 9.3. Since z = 2.50 lies in the rejection region (i.e., 2.50 > 1.645), as illustrated in Figure 9.9, we reject H0 at the 0.05 level of significance and thus accept H1. This implies that the difference in mean lifetimes is statistically significant.

Figure 9.9. Critical Region for Problem 9.10(a)

b.

From Table 9.3, zα = 2.33 at the 0.01 level of significance, which is less than z = 2.50. Thus, we also reject H0 at the 0.01 level of significance and accept H1.

Note that we had earlier accepted H0 under the two-tailed test scheme at the 0.01 level of significance in Example 9.9. This means that decisions made under a one-tailed test do not necessarily agree with those made under a two-tailed test.

Example 9.11

A manufacturer of a migraine headache drug claimed that the drug is 90% effective in relieving migraines for a period of 24 hours. In a sample of 200 people who have migraine headache, the drug provided relief for 160 people for a period of 24 hours. Determine whether the manufacturer’s claim is legitimate at the 0.05 level of significance.

Solution:

Since the success probability of the drug is p = 0.9, the null hypothesis is

H0: p = 0.9

Also, since the drug is either effective or not, testing the drug on any individual is essentially a Bernoulli trial with claimed success probability of 0.9. Thus, the variance of the trial is

σp² = p(1 − p) = 0.09

Because the drug provided relief for only 160 of the 200 people tested, the observed success probability is

p̄ = 160/200 = 0.8

We are interested in determining whether the proportion of people that the drug was effective in relieving their migraines is too low. Since p¯<0.9, we choose the alternative hypothesis as follows:

H1: p < 0.9

Thus, we have a left-tail test. Now, the standard normal score of the observed proportion is given by

z = (p̄ − p)/σp̄ = (p̄ − p)/√(p(1 − p)/n) = (0.8 − 0.9)/√(0.09/200) = −0.1/0.0212 = −4.72

For a left-tail test at the 0.05 level of significance, the critical value is zα = − 1.645. Since z = − 4.72 falls within the rejection region, we reject H0 and accept H1; that is, the company’s claim is false.
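A minimal sketch of this proportion test using scipy, with the one-tailed 5% critical value taken from Table 9.3:

```python
# Example 9.11: left-tail z-test on a proportion under the claimed p0 = 0.9.
import math
from scipy.stats import norm

p0, n, successes = 0.9, 200, 160
p_hat = successes / n                 # 0.8
se = math.sqrt(p0 * (1 - p0) / n)     # sqrt(0.09/200) ≈ 0.0212
z = (p_hat - p0) / se                 # ≈ -4.71
z_crit = norm.ppf(0.05)               # ≈ -1.645 for a left-tail test
print(f"z = {z:.2f}, critical value {z_crit:.3f}, reject H0: {z < z_crit}")
```

The standard error is computed under the claimed p0, as in the text, rather than the observed p̄.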

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128008522000092

Multiple Regression

Rudolf J. Freund, ... Donna L. Mohr, in Statistical Methods (Third Edition), 2010

8.3.5 The Equivalent t Statistic for Individual Coefficients

We noted in Chapter 7 that the F test for the hypothesis that the coefficient is zero can be performed by an equivalent t test. The same relationship holds for the individual partial coefficients in the multiple regression model. The t statistic for testing H0:βj=0 is

t = β̂j / √(cjj MSE),

where cjj is the j th diagonal element of C, and the degrees of freedom are (n−m−1). It is easily verified that these statistics are the square roots of the F values obtained earlier and they will not be reproduced here. As in simple linear regression, the denominator of this expression is the standard error (or square root of the variance) of the estimated coefficient, which can be used to construct confidence intervals for the coefficients.

In Chapter 7 we noted that the use of the t statistic allowed us to test for specific (nonzero) values of the parameters, and allowed the use of one-tailed tests and the calculation of confidence intervals. For these reasons, most computers provide the standard errors and t tests. A typical computer output for Example 8.2 is shown in Table 8.6. We can use this output to compute the confidence intervals for the coefficients in the regression equation as follows:

age: Std. error = √((0.0001293)(306.09)) = 0.199
0.95 Confidence interval: −0.3498 ± (2.0141)(0.199): from −0.7506 to 0.051

bed: Std. error = √((0.064025)(306.09)) = 4.427
0.95 Confidence interval: −11.2382 ± (2.0141)(4.427): from −20.1546 to −2.3218

bath: Std. error = √((0.131435)(306.09)) = 6.343
0.95 Confidence interval: −4.5401 ± (2.0141)(6.343): from −17.3155 to 8.2353

size: Std. error = √((0.132834)(306.09)) = 6.376
0.95 Confidence interval: 65.9465 ± (2.0141)(6.376): from 53.1045 to 78.7884

lot: Std. error = √((8.234189E−6)(306.09)) = 0.0502
0.95 Confidence interval: 0.06205 ± (2.0141)(0.0502): from −0.0391 to 0.1632.

As expected, the confidence intervals of those coefficients deemed statistically significant at the 0.05 level do not include zero.
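These intervals can be recomputed from the diagonal elements c_jj and MSE = 306.09 quoted above, with the t multiplier 2.0141 from the text. The c_jj value for bed is taken as 0.064025, inferred so that the quoted standard error 4.427 is reproduced; this is an assumption, since Table 8.6 itself is not shown here.

```python
# Standard errors and 95% confidence intervals: se_j = sqrt(c_jj * MSE),
# CI = estimate ± t_crit * se_j. Values quoted from the text; the bed c_jj
# is inferred (see lead-in).
import math

mse, t_crit = 306.09, 2.0141
coefs = {                       # name: (estimate, c_jj)
    "age":  (-0.3498, 0.0001293),
    "bed":  (-11.2382, 0.064025),
    "bath": (-4.5401, 0.131435),
    "size": (65.9465, 0.132834),
    "lot":  (0.06205, 8.234189e-6),
}
for name, (b, cjj) in coefs.items():
    se = math.sqrt(cjj * mse)
    lo, hi = b - t_crit * se, b + t_crit * se
    print(f"{name}: se={se:.4f}, 95% CI ({lo:.4f}, {hi:.4f})")
```

An interval that straddles zero (as for age, bath, and lot) corresponds to a coefficient that is not significant at the 0.05 level.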

Finally, note that the tests we have presented are special cases of tests for any linear function of parameters. For example, we may wish to test

H0: β4 − 10β5 = 0,

which for the home price data tests the hypothesis that the size coefficient is ten times larger than the lot coefficient. The methodology for these more general hypothesis tests is presented in Section 11.7.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780123749703000081

Multiple Regression

Donna L. Mohr, ... Rudolf J. Freund, in Statistical Methods (Fourth Edition), 2022

8.3.5 The Equivalent t Statistic for Individual Coefficients

We noted in Chapter 7 that the F test for the hypothesis that the coefficient is zero can be performed by an equivalent t test. The same relationship holds for the individual partial coefficients in the multiple regression model. The t statistic for testing H0:βj=0 is

t = β̂j / √(cjj MSE),

where cjj is the jth diagonal element of C, and the degrees of freedom are (n−m−1). It is easily verified that these statistics are the square roots of the F values obtained earlier and they will not be reproduced here. As in simple linear regression, the denominator of this expression is the standard error (or square root of the variance) of the estimated coefficient, which can be used to construct confidence intervals for the coefficients.

In Chapter 7 we noted that the use of the t statistic allowed us to test for specific (nonzero) values of the parameters, and allowed the use of one-tailed tests and the calculation of confidence intervals. For these reasons, most computers provide the standard errors and t tests. A typical computer output for Example 8.2 is shown in Table 8.6. We can use this output to compute the confidence intervals for the coefficients in the regression equation as follows:

age: Std. error = √((0.0001293)(306.09)) = 0.199
0.95 Confidence interval: −0.3498 ± (2.0141)(0.199): from −0.7506 to 0.051,

bed: Std. error = √((0.064025)(306.09)) = 4.427
0.95 Confidence interval: −11.2382 ± (2.0141)(4.427): from −20.1546 to −2.3218,

bath: Std. error = √((0.131435)(306.09)) = 6.343
0.95 Confidence interval: −4.5401 ± (2.0141)(6.343): from −17.3155 to 8.2353,

size: Std. error = √((0.132834)(306.09)) = 6.376
0.95 Confidence interval: 65.9465 ± (2.0141)(6.376): from 53.1045 to 78.7884, and

lot: Std. error = √((8.234189E−6)(306.09)) = 0.0502
0.95 Confidence interval: 0.06205 ± (2.0141)(0.0502): from −0.0391 to 0.1632.

As expected, the confidence intervals of those coefficients deemed statistically significant at the 0.05 level do not include zero.

Finally, note that the tests we have presented are special cases of tests for any linear function of parameters. For example, we may wish to test

H0: β4 − 10β5 = 0,

which for the home price data tests the hypothesis that the size coefficient is ten times larger than the lot coefficient. The methodology for these more general hypothesis tests is presented in Section 11.7.

Example 8.3

Snow Geese Departure Times Revisited

Example 7.3 provided a regression model to explain how the departure times (TIME) of lesser snow geese were affected by temperature (TEMP). Although the results were reasonably satisfactory, it is logical to expect that other environmental factors affect departure times.

Solution

Since information on other factors was also collected, we can propose a multiple regression model with the following additional environmental variables:

HUM, the relative humidity,

LIGHT, light intensity, and

CLOUD, percent cloud cover.

The data are given in Table 8.4.

Table 8.4. Snow goose departure times data.

DATE      TIME  TEMP  HUM  LIGHT  CLOUD
11/10/87 11 11 78 12.6 100
11/13/87 2 11 88 10.8 80
11/14/87 −2 11 100 9.7 30
11/15/87 −11 20 83 12.2 50
11/17/87 −5 8 100 14.2 0
11/18/87 2 12 90 10.5 90
11/21/87 −6 6 87 12.5 30
11/22/87 22 18 82 12.9 20
11/23/87 22 19 91 12.3 80
11/25/87 21 21 92 9.4 100
11/30/87 8 10 90 11.7 60
12/05/87 25 18 85 11.8 40
12/14/87 9 20 93 11.1 95
12/18/87 7 14 92 8.3 90
12/24/87 8 19 96 12.0 40
12/26/87 18 13 100 11.3 100
12/27/87 −14 3 96 4.8 100
12/28/87 −21 4 86 6.9 100
12/30/87 −26 3 89 7.1 40
12/31/87 −7 15 93 8.1 95
01/02/88 −15 15 43 6.9 100
01/03/88 −6 6 60 7.6 100
01/04/88 −23 5 . 8.8 100
01/05/88 −14 2 92 9.0 60
01/06/88 −6 10 90 . 100
01/07/88 −8 2 96 7.1 100
01/08/88 −19 0 83 3.9 100
01/10/88 −23 −4 88 8.1 20
01/11/88 −11 −2 80 10.3 10
01/12/88 5 5 80 9.0 95
01/14/88 −23 5 61 5.1 95
01/15/88 −7 8 81 7.4 100
01/16/88 9 15 100 7.9 100
01/20/88 −27 5 51 3.8 0
01/21/88 −24 −1 74 6.3 0
01/22/88 −29 −2 69 6.3 0
01/23/88 −19 3 65 7.8 30
01/24/88 −9 6 73 9.5 30

An inspection of the data shows that two observations have missing values (denoted by “.”) for a variable. This means that these observations cannot be used for the regression analysis. Fortunately, most computer programs recognize missing values and will automatically ignore such observations. Therefore all calculations in this example will be based on the remaining 36 observations.

The first step is to compute X′X and X′Y. We then compute the inverse and the estimated coefficients. As before, we will let the computer do this with the results given in Table 8.5 in the same format as that of Table 8.3.

Table 8.5. Regression matrices for snow goose departure times.

Model Crossproducts X′X X′Y Y′Y
X′XINTERCEPTEMPHUM
INTERCEP 36 319 3007
TEMP 319 4645 27519
HUM 3007 27519 257927
LIGHT 326.2 3270.3 27822
CLOUD 2280 23175 193085
TIME −157 1623 −9662
X′X LIGHT CLOUD TIME
INTERCEP 326.2 2280 −157
TEMP 3270.3 23175 1623
HUM 27822 193085 −9662
LIGHT 3211.9 20079.5 −402.8
CLOUD 20079.5 194100 −3730
TIME −402.8 −3730 9097
X′X Inverse, Parameter Estimates, and SSE
INTERCEPT TEMP HUM
INTERCEP 1.1793413621 0.0085749149 −0.010464297
TEMP 0.0085749149 0.0010691752 0.0000605688
HUM −0.010464297 0.0000605688 0.0001977643
LIGHT −0.028115838 −0.00192403 −0.000581237
CLOUD −0.001558842 −0.000089595 −0.000020914
TIME −52.99392938 0.9129810924 0.1425316971
LIGHT CLOUD TIME
INTERCEP −0.028115838 −0.001558842 −52.99392938
TEMP −0.00192403 −0.000089595 0.9129810924
HUM −0.000581237 −0.000020914 0.1425316971
LIGHT 0.0086195605 0.0002464973 2.5160019069
CLOUD 0.0002464973 0.0000294652 0.0922051991
TIME 2.5160019069 0.0922051991 2029.6969929

The five elements in the last column, labeled TIME, of the inverse portion contain the estimated coefficients, providing the equation:

TIMEˆ = −52.994 + 0.9130(TEMP) + 0.1425(HUM) + 2.5160(LIGHT) + 0.0922(CLOUD).

Unlike the case of the regression involving only TEMP, the intercept now has no real meaning since zero values for HUM and LIGHT cannot exist. The remainder of the coefficients are positive, indicating later departure times for increased values of TEMP, HUM, LIGHT, and CLOUD. Because of the different scales of the independent variables, the relative magnitudes of these coefficients have little meaning and also are not indicators of relative statistical significance.

Note that the coefficient for TEMP is 0.9130 in the multiple regression model, while it was 1.681 for the simple linear regression involving only the TEMP variable. In this case, the so-called total coefficient for the simple linear regression model includes the indirect effect of other variables, while in the multiple regression model, the coefficient measures only the effect of TEMP by holding constant the effects of other variables.

For the second step we compute the partitioning of the sums of squares. The residual sum of squares

SSE = ∑y² − B̂′X′Y = 9097 − [(−52.994)(−157) + (0.9130)(1623) + (0.1425)(−9662) + (2.5160)(−402.8) + (0.09221)(−3730)],

which is available in the computer output as the last element of the inverse portion and is 2029.70. The estimated variance is MSE = 2029.70/(36 − 5) = 65.474, and the estimated standard deviation is 8.092. This value is somewhat smaller than the 9.96 obtained for the simple linear regression involving only TEMP.

The model sum of squares is

SSR(regression model) = B̂′X′Y − (∑y)²/n = 7067.30 − 684.69 = 6382.61.

The degrees of freedom for this sum of squares is 4; hence the model mean square is 6382.61∕4=1595.65. The resulting F statistic is 1595.65∕65.474=24.371, which clearly leads to the rejection of the null hypothesis of no regression. These results are summarized in an analysis of variance table shown in Table 8.7 in Section 8.5.

In the final step we use the standard errors and t statistics for inferences on the coefficients. For the TEMP coefficient, the estimated variance of the estimated coefficient is

varˆ(β̂TEMP) = cTEMP,TEMP MSE = (0.001069)(65.474) = 0.0700,

which results in an estimated standard error of 0.2646. The t statistic for the null hypothesis that this coefficient is zero is

t=0.9130∕0.2646=3.451.

Assuming a desired significance level of 0.05, the hypothesis of no temperature effect is clearly rejected. Similarly, the t statistics for HUM, LIGHT, and CLOUD are 1.253, 3.349, and 2.099, respectively. When compared with the tabulated two-tailed 0.05 value for the t distribution with 31 degrees of freedom of 2.040, the coefficient for HUM is not significant, while LIGHT and CLOUD are. The p values are shown later in Table 8.7, which presents computer output for this problem. Basically this means that departure times appear to be affected by increasing levels of temperature, light, and cloud cover, but there is insufficient evidence to state that adding humidity to this list would improve the prediction of departure times.
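The four t statistics can be recomputed directly from MSE = 65.474 and the diagonal elements c_jj of the inverse in Table 8.5; a short sketch:

```python
# Coefficient t statistics for the snow goose model: t = beta-hat / sqrt(c_jj * MSE).
import math

mse = 65.474
t_crit = 2.040                 # two-tailed 0.05 critical value, 31 df (from the text)
coefs = {                      # name: (estimate, c_jj from Table 8.5)
    "TEMP":  (0.9130, 0.0010691752),
    "HUM":   (0.1425, 0.0001977643),
    "LIGHT": (2.5160, 0.0086195605),
    "CLOUD": (0.0922, 0.0000294652),
}
t_stats = {}
for name, (b, cjj) in coefs.items():
    t_stats[name] = b / math.sqrt(cjj * mse)
    print(f"{name}: t = {t_stats[name]:.3f}, "
          f"significant at 0.05: {abs(t_stats[name]) > t_crit}")
```

This reproduces the values 3.451, 1.253, 3.349, and 2.099 quoted in the text, with only HUM falling short of the critical value.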

We have presented the calculations in detail so the reader can see that the answers are not “magic” but are in fact the consequence of the normal equations and their solutions. Fortunately, statistical software performs these calculations for us, as shown in Section 8.5.

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128230435000084

Nonparametric Statistics

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

12.4.1 Median test

Let m1 and m2 be the medians of two populations 1 and 2, respectively, both with continuous distributions. Assume that we have a random sample of size n1 from population 1 and a random sample of size n2 from population 2. The median test can be summarized as follows.

Hypothesis-Testing Procedure Using Median Test

We test

H0: m1 = m2 versus Ha: m1 > m2 (upper-tailed test), m1 < m2 (lower-tailed test), or m1 ≠ m2 (two-tailed test).

1.

Combine the two samples into a single sample of size n = n1 + n2, keeping track of each observation's original population. Arrange the n1 + n2 observations in increasing order and find the median of this combined sample. If the median is one of the sample values, discard those observations and adjust the sample size accordingly.

2.

Define N1b to be the number of observations from the population 1 sample that fall below the combined-sample median.

3.

Decision: If H0 is true, then we would expect N1b to be close to n1/2. For Ha: m1 > m2, the rejection region is N1b ≤ c, where P(N1b ≤ c) = α; for Ha: m1 < m2, the rejection region is N1b ≥ c, where P(N1b ≥ c) = α; and for Ha: m1 ≠ m2, the rejection region is N1b ≥ c1 or N1b ≤ c2, where

P(N1b ≥ c1) = α/2 and P(N1b ≤ c2) = α/2.

Assumptions: (1) Population distribution is continuous. (2) Samples are independent.

Note that since some observations can be equal to the overall median, and those values will be discarded, N1b need not be equal to n1. Let n1 + n2 = 2k. Under H0, N1b has a hypergeometric distribution given by

P(N1b = n1b) = [C(n1, n1b) C(n2, k − n1b)] / C(n1 + n2, k),   n1b = 0, 1, 2, …, n1,

with the convention that C(i, j) = 0 if j > i. Note that the hypergeometric distribution is a discrete distribution that describes the number of “successes” in a sequence of draws from a finite population without replacement. Thus, we can find the values of c, c1, and c2 required earlier. This calculation can be tedious. To overcome this, we can use the following large sample approximation, valid for n1 > 5 and n2 > 5. First classify each observation as above or below the sample median as shown in Table 12.4.
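The exact critical value c can be read off this hypergeometric distribution directly; a minimal sketch using scipy, where the sample sizes n1 = n2 = 8 are illustrative (not from the text):

```python
# Exact critical value c for Ha: m1 > m2 (reject when N1b <= c), using the
# hypergeometric distribution of N1b under H0.
from scipy.stats import hypergeom

n1, n2, alpha = 8, 8, 0.05           # illustrative sample sizes
k = (n1 + n2) // 2
# scipy's parameterization: population M = n1+n2, n1 "success" states, k draws
dist = hypergeom(n1 + n2, n1, k)
c = max(x for x in range(n1 + 1) if dist.cdf(x) <= alpha)
print("reject H0 when N1b <=", c)
```

Because the distribution is discrete, the achieved size P(N1b ≤ c) is generally below the nominal α, which is why the text defines the critical region as the largest one of size not exceeding α.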

Table 12.4. Data Classification With Respect to Median.

BelowAboveTotals
Sample 1 N1b N1a n1
Sample 2 N2b N2a n2
Total Nb Na n1 + n2 = n

It can be verified that the expected value and variance of N1a (similarly for N1b) are given by

E(N1a) = Na·n1/n, and Var(N1a) = (Na·n1·Nb·n2) / (n²(n − 1)).

Thus, for a large sample we can write

z = [N1a − E(N1a)] / √Var(N1a) ∼ N(0, 1).

Hence, we can follow the usual large sample rejection region procedure, which is summarized next.

Summary of large sample median sum test (n1 > 5 and n2 > 5)

We test

H0: m1 = m2 versus Ha: m1 > m2 (upper-tailed test), m1 < m2 (lower-tailed test), or m1 ≠ m2 (two-tailed test).

The test statistic:

z = [N1a − E(N1a)] / √Var(N1a),

where

E(N1a) = Na·n1/n

and

Var(N1a) = (Na·n1·Nb·n2) / (n²(n − 1)).

Rejection region:

z > zα, upper-tail RR;
z < −zα, lower-tail RR;
|z| > zα/2, two-tail RR.

Decision: Reject H0, if the test statistic falls in the RR, and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0, because there is not enough evidence to conclude that Ha is true for a given α and more data are needed.

Assumptions: (1) Population distributions are continuous. (2) n1 > 5 and n2 > 5.

We illustrate this procedure with the following example.

Example 12.4.1

Given below are the mileages (in thousands of miles) of two samples of automobile tires of two different brands, say I and II, before they wear out.

Tire I: 32, 34, 35, 37, 42, 43, 47, 58, 59, 62, 69, 71, 78, 84
Tire II: 39, 48, 54, 65, 70, 76, 87, 90, 111, 118, 126, 127

Use the median test to see whether tire II gives more median mileage than tire I. Use α = 0.05.

Solution

We will test

H0: m1 = m2 versus Ha: m1 < m2.

Because the sample size assumption is satisfied, we will use the large sample normal approximation. The results of steps 1 and 2, using the notation A for above the median and B for below the median, are given in Table 12.5.

Table 12.5. Mileage Data Classification.

Sample valuesPopulationAbove/below the median
32 I B
34 I B
35 I B
37 I B
39 II B
42 I B
43 I B
47 I B
48 II B
54 II B
58 I B
59 I B
62 I B
65 II A
69 I A
70 II A
71 I A
76 II A
78 I A
84 I A
87 II A
90 II A
111 II A
118 II A
126 II A
127 II A

The median is 63.5. Thus, we obtain Table 12.6.

Table 12.6. Summary of Mileage Data for Automobile Tires.

BelowAboveTotals
Sample 1 N1b = 10 N1a = 4 n1 = 14
Sample 2 N2b = 3 N2a = 9 n2 = 12
Total Nb = 13 Na = 13 n1 + n2 = n = 26

Also,

E(N1a) = Na·n1/n = (13)(14)/26 = 7,

and

Var(N1a) = (Na·n1·Nb·n2) / (n²(n − 1)) = (13)(13)(14)(12)/16,900 = 1.68.

Hence, the test statistic is

z = [N1a − E(N1a)] / √Var(N1a) = (4 − 7)/√1.68 = −2.31.

For α = 0.05, z0.05 = 1.645. Hence, the rejection region is {z < −1.645}. Because the observed value of z falls in the rejection region, we reject H0 and conclude that there is enough evidence that tire II gives greater median mileage than tire I.
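The large-sample calculation above can be reproduced in a few lines; a sketch using the counts from Table 12.6:

```python
# Large-sample median test for the tire data (Example 12.4.1).
import math
from scipy.stats import norm

n1a, n1b = 4, 10       # tire I: above / below the combined median
n2a, n2b = 9, 3        # tire II
na, nb = n1a + n2a, n1b + n2b       # 13 above, 13 below
n1, n2 = n1a + n1b, n2a + n2b       # 14 and 12
n = n1 + n2                          # 26

e = na * n1 / n                                  # E(N1a) = 7
var = na * n1 * nb * n2 / (n**2 * (n - 1))       # ≈ 1.68
z = (n1a - e) / math.sqrt(var)                   # ≈ -2.31
z_crit = norm.ppf(0.05)                          # ≈ -1.645, lower-tail test
print(f"z = {z:.2f}, reject H0: {z < z_crit}")
```

Reusing the formulas symbolically like this makes it easy to rerun the test if any of the counts change.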

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780128178157000129

Inferential Statistics II: Parametric Hypothesis Testing

Andrew P. King, Robert J. Eckersley, in Statistics for Biomedical Engineers and Scientists, 2019

5.8 1-tailed vs. 2-tailed Tests

All of the hypothesis test examples that we have seen so far have tested for any difference between the samples (or the sample and an expected value). We chose not to investigate which of the two was greater than the other. Sometimes we may be interested in this, and hypothesis tests can be applied in two different ways to reflect this need.

Hypothesis tests can be either 1-tailed or 2-tailed. Put simply, if we are interested in any difference between our two samples (or our one sample and an expected value), then we use a 2-tailed test. If we are interested in determining if a particular sample is either only greater than or only less than the other, then we use a 1-tailed test. Fig. 5.9 illustrates why these names are used. The curves represent a t-distribution for 10 degrees of freedom, and the shaded area in both cases corresponds to 5% of the total area under the curve. The left-hand plot shows the 2-tailed case, and the right-hand plot shows the 1-tailed case. The critical t-value for 10 degrees of freedom and α=0.05 is shown as 2.228 for a 2-tailed t-test (we can look this value up in Table A.1). Therefore any computed t-value inside the shaded area will result in rejection of the null hypothesis. The right-hand plot shows how the critical t-value changes when we are only interested in one tail. We still need to have 5% of the total area outside of the critical t-value, so the critical value must be smaller in magnitude. For this reason, it is easier to show statistical significance when using a 1-tailed test than a 2-tailed test. The critical t-values given in Table A.1 are for 2-tailed t-tests. To use these same values for a 1-tailed test, we should double the significance level (e.g. if we want a significance level of 0.05, then we look up the critical value from the 0.1 column).

Figure 5.9. An illustration of 1-tailed and 2-tailed t-tests. The curve shown in both figures is a t-distribution for 10 degrees of freedom. (A) A 2-tailed test: the shaded area corresponds to the range of calculated t-values that would result in rejection of the null hypothesis. In this case, we are interested in finding any difference between our two samples. (B) A 1-tailed test, in which we are only interested in finding if a particular one of our samples is greater than the other. Note that the total shaded area is the same in both cases and corresponds to 5% of the total area (i.e. 95% confidence). However, for a 1-tailed t-test, a lower critical t-value results, making it easier to show significance.

To illustrate the application of a 1-tailed test, we return to Professor A's original one-sample Student's t-test from Section 5.5. Recall that the absolute t-value was computed as 2.053, and we could not reject the null hypothesis because this was not greater than the (2-tailed) critical t-value of 2.776. Looking again at Table A.1, we find the 1-tailed critical t-value under the column for 0.1 significance level (this is double the actual significance level of 0.05). This is equal to 2.132, so in this case, it does not change the outcome of the test.
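The critical values discussed above can be obtained directly from the t distribution rather than from Table A.1; a short sketch using scipy for the 10-degrees-of-freedom case of Fig. 5.9:

```python
# 1-tailed vs. 2-tailed critical t-values for df = 10, alpha = 0.05.
from scipy.stats import t

df, alpha = 10, 0.05
t_two = t.ppf(1 - alpha / 2, df)   # 2.228, as quoted for the 2-tailed test
t_one = t.ppf(1 - alpha, df)       # smaller in magnitude, so easier to exceed
print(f"2-tailed: {t_two:.3f}, 1-tailed: {t_one:.3f}")
```

This also shows why the table trick works: the 1-tailed value at α equals the 2-tailed value at 2α, since both leave the same area in a single tail.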

Read full chapter

URL: https://www.sciencedirect.com/science/article/pii/B9780081029398000141

Hypothesis Testing II

B.R. Martin, in Statistics for Physical Science, 2012

11.3.1 Sign Test

In Section 10.5 we posed the question of whether it is possible to test hypotheses about the average of a population when its distribution is unknown. One simple test that can do this is the sign test, and as an example of its use we will test the null hypothesis H0:μ=μ0 against some alternative, such as Ha:μ=μa or Ha:μ>μ0 using a random sample of size n in the case where n is small, so that the sampling distribution may not be normal. In general if we make no assumption about the form of the population distribution, then in the sign test and those that follow, μ refers to the median, but if we know that the population distribution is symmetric, then μ is the arithmetic mean. For simplicity the notation μ will be used for both cases.

We start by assigning a plus sign to all data values that exceed μ0 and a minus sign to all those that are less than μ0. We would expect the numbers of plus and minus signs to be approximately equal, and any substantial deviation would lead to rejection of the null hypothesis at some significance level. Because we are dealing with a continuous distribution, no observation can in principle be exactly equal to μ0, but in practice approximate equality will occur depending on the precision with which the measurements are made. In these cases the points of ‘practical equality’ are removed from the data set and the value of n reduced accordingly. The test statistic X is the number of plus signs in the sample (or equally we could use the number of minus signs). If H0 is true, the probabilities of obtaining a plus or minus sign are equal to ½ and so X has a binomial distribution with p=p0=1/2. Significance levels can thus be obtained from the binomial distribution for one-sided and two-sided tests at any given level α.

For example, if the alternative hypothesis is Ha:μ>μ0, then the largest critical region of size not exceeding α is obtained from the inequality x≥kα, where

(11.22a)   ∑_{x=kα}^{n} B(x; n, p0) ≤ α,

and B is the binomial probability with p0 = p = ½ if H0 is true. Similarly, if Ha:μ<μ0, we form the inequality x≤k′α, where k′α is defined by

(11.22b)   ∑_{x=0}^{k′α} B(x; n, p0) ≤ α.

Finally, if Ha: μ≠μ0, i.e., we have a two-tailed test, then the largest critical region is defined by

(11.22c)   x ≤ k′_{α/2} and x ≥ k_{α/2}.

For sample sizes greater than about 10, the normal approximation to the binomial may be used with mean μ=np and σ2=np(1−p).

EXAMPLE 11.7

A mobile phone battery needs to be regularly recharged even if no calls are made. Over 12 periods when charging was required, it was found that the intervals in hours between chargings were:

50 35 45 65 39 38 47 52 43 37 44 40

Use the sign test to test at a 10% significance level the hypothesis that the battery needs recharging on average every 45 hours.

We are testing the null hypothesis H0:μ0=45 against the alternative Ha:μ0≠45. First we remove the data point with value 45, reducing n to 11, and then assign a plus sign to those measurements greater than 45 and a minus sign to those less than 45. This gives x=4 as the number of plus signs. As this is a two-tailed test, we need to find the values of k0.05 and k′0.05 for n=11. From Table C.2, these are k′0.05=3 and k0.05=9. Since x=4 lies in the acceptance region, we accept the null hypothesis at this significance level.
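An equivalent way to run this test, without a table of critical values, is to compute the two-tailed binomial p-value for the observed number of plus signs; a sketch (the text instead compares x with the tabulated k′ and k):

```python
# Example 11.7 via the binomial p-value form of the two-tailed sign test.
from scipy.stats import binom

data = [50, 35, 45, 65, 39, 38, 47, 52, 43, 37, 44, 40]
mu0, alpha = 45, 0.10
signs = [v for v in data if v != mu0]   # discard values equal to mu0
n = len(signs)                          # 11
x = sum(v > mu0 for v in signs)         # number of plus signs: 4
# two-tailed p-value: twice the smaller tail probability under Bin(n, 1/2)
p = 2 * min(binom.cdf(x, n, 0.5), binom.sf(x - 1, n, 0.5))
print(f"x = {x}, p-value = {p:.3f}, reject H0: {p < alpha}")
```

The p-value is far above 0.10, matching the text's decision to accept the null hypothesis at this level.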

The sign test can be extended in a straightforward way to two-sample cases, for example, to test the hypothesis that μ1 = μ2 using samples of size n drawn from two non-normal distributions. In this case the difference di (i = 1, 2, …, n) of each pair of observations is replaced by a plus or minus sign depending on whether di is greater than or less than zero, respectively. If the null hypothesis is μ1 − μ2 = d rather than μ1 − μ2 = 0, the procedure is the same, but the quantity d is subtracted from each di before the test is made.


URL: https://www.sciencedirect.com/science/article/pii/B9780123877604000111

Principles of Inference

Donna L. Mohr, ... Rudolf J. Freund, in Statistical Methods (Fourth Edition), 2022

Concept Questions

This section consists of some true/false questions regarding concepts of statistical inference. Indicate whether a statement is true or false and, if false, indicate what is required to make the statement true.

1.

______ In a hypothesis test, the p value is 0.043. This means that the null hypothesis would be rejected at α=0.05.

2.

______ If the null hypothesis is rejected by a one-tailed hypothesis test, then it will also be rejected by a two-tailed test.

3.

______ If a null hypothesis is rejected at the 0.01 level of significance, it will also be rejected at the 0.05 level of significance.

4.

______ If the test statistic falls in the rejection region, the null hypothesis has been proven to be true.

5.

______ The risk of a type II error is directly controlled in a hypothesis test by establishing a specific significance level.

6.

______ If the null hypothesis is true, increasing only the sample size will increase the probability of rejecting the null hypothesis.

7.

______ If the null hypothesis is false, increasing the level of significance (α) for a specified sample size will increase the probability of rejecting the null hypothesis.

8.

______ If we decrease the confidence coefficient for a fixed n, we decrease the width of the confidence interval.

9.

______ If a 95% confidence interval on μ was from 50.5 to 60.6, we would reject the null hypothesis that μ=60 at the 0.05 level of significance.

10.

______ If the sample size is increased and the level of confidence is decreased, the width of the confidence interval will increase.

11.

______ A research article reports that a 95% confidence interval for mean reaction time is from 0.25 to 0.29 seconds. About 95% of individuals will have reaction times in this interval.


URL: https://www.sciencedirect.com/science/article/pii/B9780128230435000035

Hypothesis testing

Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

6.5.1.1 Equal variances

Given next is the procedure we follow to compare the true means from two independent normal populations when n1 and n2 are small (n1 < 30 or n2 < 30) and we can assume homogeneity in the population variances, that is, σ1² = σ2². In this case, we pool the sample variances to obtain a point estimate of the common variance.

Comparison of two population means, small sample case (pooled t-test)

We want to test:

H0:μ1−μ2=D0

versus

Ha: μ1 − μ2 > D0 (upper-tailed test),
Ha: μ1 − μ2 < D0 (lower-tailed test),
Ha: μ1 − μ2 ≠ D0 (two-tailed test).

The TS is:

T = (X̄1 − X̄2 − D0) / (Sp √(1/n1 + 1/n2)).

Here the pooled sample variance is:

Sp² = [(n1 − 1)S1² + (n2 − 1)S2²] / (n1 + n2 − 2).

Then the RR is:

RR: t > tα (upper-tailed test); t < −tα (lower-tailed test); |t| > tα/2 (two-tailed test),

where t is the observed TS and tα is based on (n1 + n2 − 2) degrees of freedom, and such that P(T > tα) = α.

Decision: Reject H0, if TS falls in the RR, and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0 because there is not enough evidence to conclude that Ha is true for a given α.

Assumptions: The samples are independent and come from normal populations with means μ1 and μ2, and with (unknown) equal variances, that is, σ1² = σ2².
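As a sketch of the pooled t-test procedure above (Python assumed; the two samples are hypothetical illustrations, not data from the text):

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2, d0=0.0):
    # Pooled t statistic for H0: mu1 - mu2 = d0 (equal-variance case).
    n1, n2 = len(sample1), len(sample2)
    # Pooled sample variance Sp^2, weighting each S_i^2 by its df.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) \
          / (n1 + n2 - 2)
    t = (mean(sample1) - mean(sample2) - d0) / (sqrt(sp2) * sqrt(1/n1 + 1/n2))
    return t, sp2, n1 + n2 - 2    # TS, pooled variance, degrees of freedom

# Hypothetical samples; a two-tailed test at alpha = 0.05 compares |t|
# with t_{0.025, 8} = 2.306 from a t table.
t, sp2, df = pooled_t([23, 25, 28, 30, 26], [20, 22, 19, 24, 21])
print(round(t, 3), round(sp2, 1), df)   # 3.506 5.5 8
```

Since |t| = 3.506 > 2.306, H0: μ1 − μ2 = 0 would be rejected for these illustrative data.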


URL: https://www.sciencedirect.com/science/article/pii/B9780128178157000063

Nonparametric Methods

Donna L. Mohr, ... Rudolf J. Freund, in Statistical Methods (Fourth Edition), 2022

14.3 Two Independent Samples

The Mann–Whitney test (also called the Wilcoxon rank sum or Wilcoxon two-sample test) is a rank-based nonparametric test for comparing the location of two populations using independent samples. Note that this test does not specify an inference to any particular parameter of location. Using independent samples of sizes n1 and n2, respectively, the test is conducted as follows:

1.

Rank all (n1+n2) observations as if they came from one sample, adjusting for ties.

2.

Compute T, the sum of ranks for the smaller sample.

3.

Compute T′ = (n1 + n2)(n1 + n2 + 1)/2 − T, the sum of ranks for the larger sample. This is necessary to assure a two-tailed test.

4.

For small samples (n1 + n2 ≤ 30), compare the smaller of T and T′ with the rejection region consisting of values less than or equal to the critical values given in Appendix Table A.9. If either T or T′ falls in the rejection region, we reject the null hypothesis. Note that even though this is a two-tailed test, we only use the lower quantiles of the tabled distribution.

5.

For large samples, the statistic T or T′ (whichever is smaller) has an approximately normal distribution with

μ = n1(n1 + n2 + 1)/2 and σ² = n1n2(n1 + n2 + 1)/12.

The sample size n1 should be taken to correspond to whichever value, T or T′, has been selected as the test statistic.

These parameter values are used to compute a test statistic having a standard normal distribution. We then reject the null hypothesis if the value of the test statistic is smaller than −zα/2. Modifications are available when there are a large number of ties (for example, Conover, 1999).

The procedure for a one-sided alternative hypothesis depends on the direction of the hypothesis. For example, if the alternative hypothesis is that the location of population 1 has a smaller value than that of population 2 (a one-sided hypothesis), then we would sum the ranks from sample 1 and use that sum as the test statistic. We would reject the null hypothesis of equal distributions if this sum is less than the α/2 quantile of the table. If the one-sided alternative hypothesis is in the other direction, we would use the sum of ranks from sample 2 with the same rejection criteria.

Example 14.4

Tasting Scores

Because the taste of food is impossible to quantify, results of tasting experiments are often given in ordinal form, usually expressed as ranks or scores. In this experiment two types of hamburger substitutes were tested for quality of taste. Five sample hamburgers of type A and five of type B were scored from best (1) to worst (10). Although these responses may appear to be ratio variables (and are often analyzed using this definition), they are more appropriately classified as being in the ordinal scale. The results of the taste test are given in Table 14.4. The hypotheses of interest are

H0: the types of hamburgers have the same quality of taste, and
H1: they have different quality of taste.

Table 14.4. Hamburger taste test.

Type of Burger   Score
A 1
A 2
A 3
B 4
A 5
A 6
B 7
B 8
B 9
B 10

Solution

Because the responses are ordinal, we use the Mann–Whitney test. Using these data we compute

T = 1 + 2 + 3 + 5 + 6 = 17 and T′ = 10(11)/2 − 17 = 38.

Choosing α=0.05 and using Appendix Table A.9, we reject H0 if the smaller of T or T′ is less than or equal to 17. The computed value of the test statistic is 17; hence we reject the null hypothesis at α=0.05, and conclude that the two types differ in quality of taste. If we had to choose one or the other, we would choose burger type A based on the fact that it has the smaller rank sum.

Randomization Approach to Example 14.4

Since this data set does not contain any ties, Appendix Table A.9 is accurate. If we wanted a p value, we could enumerate all the 10!/(5! 5!) = 252 ways the ranks 1 through 10 could be split into two groups of five each. Listing the corresponding values of T would show that 3.17% of the splits have their smaller rank sum at or below 17. Hence, the exact p value is 0.0317, which agrees with the value from SAS System’s PROC NPAR1WAY. Using the normal asymptotic approximation gives z = 2.193, with a p value of 0.028, which is surprisingly close given the small sample size.
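The enumeration just described is small enough to carry out directly. A sketch, assuming Python (in place of the SAS procedure the text mentions):

```python
from itertools import combinations
from math import erf, sqrt

T_obs = 17                    # rank sum for the type A burgers
total_sum = 10 * 11 // 2      # 55, the sum of ranks 1..10

# Exact two-sided p-value: of the 252 ways to assign 5 of the ranks
# 1..10 to one group, count those whose smaller rank sum is <= 17.
splits = list(combinations(range(1, 11), 5))
extreme = sum(1 for c in splits if min(sum(c), total_sum - sum(c)) <= T_obs)
p_exact = extreme / len(splits)
print(extreme, round(p_exact, 4))        # 8 0.0317

# Large-sample normal approximation: mu = 27.5, sigma^2 = 275/12.
mu, var = 5 * 11 / 2, 5 * 5 * 11 / 12
z = (T_obs - mu) / sqrt(var)
p_norm = 2 * 0.5 * (1 + erf(z / sqrt(2)))    # 2 * Phi(z), since z < 0
print(round(abs(z), 3), round(p_norm, 3))    # 2.193 0.028
```

The eight extreme splits are the four with T ≤ 17 and, by symmetry, the four with T′ ≤ 17, giving 8/252 = 0.0317.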


URL: https://www.sciencedirect.com/science/article/pii/B978012823043500014X

What is the critical value for a one

For a one-tailed test using the 0.05 level of significance, the critical value for the z test is 1.645; for a t test the critical value depends on the degrees of freedom, and with a sample size of 25 (24 degrees of freedom) it is 1.711.

What is the critical value for a one

The significance level used in a one-tailed test is typically 1%, 5%, or 10%, although any other probability can be used at the discretion of the analyst or statistician. The probability value is calculated under the assumption that the null hypothesis is true.

What is the critical value in one

For a left-tail test at the 0.01 level of significance, the critical value is zα = −2.33; at the 0.05 level it is zα = −1.645.

What is the critical value of 0.05 in a one

For example, in an upper-tailed Z test, if α = 0.05, then the critical value is Z = 1.645.
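The z critical values quoted in these answers (and in Table 9.3) can be reproduced from the inverse standard normal CDF. A sketch using Python's standard library (an assumption, not part of the source):

```python
from statistics import NormalDist

z = NormalDist().inv_cdf     # inverse standard normal CDF (quantile function)

# One-tailed critical values z_alpha: upper-tail area equal to alpha.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(z(1 - alpha), 3))
# 0.1 1.282
# 0.05 1.645
# 0.01 2.326

# Two-tailed test at alpha = 0.05: area alpha/2 = 0.025 in each tail.
print(round(z(1 - 0.025), 2))   # 1.96
```

Printed tables such as Table 9.3 round these to two decimals (1.28, 2.33), which is why quoted values can differ slightly in the last digit.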