What is the term associated with scores that are at the extreme ends of the distribution?

Describing a distribution of test scores.

Measures of central tendency are used to describe the center of the distribution.  There are three measures commonly used:

            Mean,             arithmetic average of the scores

                                    simple calculation.

                                X̄ = ΣX / N   (the sum of the scores divided by the number of scores)

                                     find the mean for the following data set 

                                    data set of 5 scores:  32,25,28,30,20

            Median           50th percentile
                        Have already learned how to calculate median (see chapter 2 notes)

            Mode              is the score occurring most frequently in the distribution

 find the mode:

data set of scores  14,23,21,11,14,32,25,23,35,22,21,33,23
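As a quick check on the two exercises above, here is a minimal Python sketch using the standard-library `statistics` module. The variable names are illustrative only, and the median here is the plain middle score of the sorted data, which may differ slightly from the percentile method in the chapter 2 notes.

```python
from statistics import mean, median, mode

# The five scores from the mean exercise in the notes.
mean_scores = [32, 25, 28, 30, 20]
# The thirteen scores from the mode exercise in the notes.
mode_scores = [14, 23, 21, 11, 14, 32, 25, 23, 35, 22, 21, 33, 23]

mean_of_five = mean(mean_scores)      # sum of X divided by N = 135 / 5
median_of_five = median(mean_scores)  # middle score once the scores are sorted
most_frequent = mode(mode_scores)     # score that occurs most often

print("mean:", mean_of_five)
print("median:", median_of_five)
print("mode:", most_frequent)
```

Working the exercises by hand first and then comparing against the printed values is probably the most useful way to use this.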

Measures of variability determine the spread of scores from a data set.

Three measures

                        Range                                (see chapter 2 notes)

                        Interpercentile range        Indicates how scores vary around the median. The most
                        widely used is the interquartile range, which uses the 75th percentile (X.75) and the
                        25th percentile (X.25) to look at spread: interquartile range = X.75 - X.25.
                        A smaller difference means less spread.

                        Other interpercentile ranges can be used in the same way, by taking the difference
                        between any two percentiles.

                        Interquartile deviation = interquartile range divided by 2.
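The interquartile range and interquartile deviation can be sketched in Python as follows. This is an illustration under assumptions: it reuses the thirteen scores from the mode exercise, and it relies on the default interpolation rule of `statistics.quantiles`, which may not give exactly the same percentiles as the hand method in the chapter 2 notes.

```python
from statistics import quantiles

# Thirteen scores reused from the mode exercise in the notes.
scores = [14, 23, 21, 11, 14, 32, 25, 23, 35, 22, 21, 33, 23]

# quantiles(..., n=4) returns the three cut points that split the
# sorted scores into quarters: the 25th, 50th and 75th percentiles.
x25, x50, x75 = quantiles(scores, n=4)

interquartile_range = x75 - x25
interquartile_deviation = interquartile_range / 2

print("X.25:", x25, " X.75:", x75)
print("interquartile range:", interquartile_range)
print("interquartile deviation:", interquartile_deviation)
```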

                        Standard deviation           Uses the mean as its reference point. Tells how much scores
                        deviate from the mean.  The Greek lowercase sigma (σ) is used to represent it in a
                        population, while a lowercase s is used to represent it in a sample.

Let's learn how to determine s.

First, take the following numbers and find the mean. (N = 5; ΣX = 93)  93/5 = 18.6

  X              X²

 30             900
 24             576
 16             256
 13             169
 10             100

ΣX = 93         ΣX² = 2001

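The average of the squared deviations from the mean is termed the variance. It is represented by a lowercase sigma squared (σ²) in a population, or by s² in a sample. It can be computed from the sums in the table without finding each deviation:

$$ s^2 = \frac{N\sum X^2 - (\sum X)^2}{N(N-1)} = \frac{5(2001) - 93^2}{5(4)} = \frac{1356}{20} = 67.8 $$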

Once the variance is known, you can determine the standard deviation, which is the square root of the variance.

To compute the standard deviation directly, use the following formula:

$$ s = \sqrt{\frac{N\sum X^2 - (\sum X)^2}{N(N-1)}} = \sqrt{\frac{5(2001) - 93^2}{5(4)}} = \sqrt{\frac{1356}{20}} \approx 8.23 $$

s = 8.23
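The calculation above can be reproduced in a few lines of Python. This is only a sketch of the raw-score formula from these notes, with illustrative variable names; the last lines compare it against the standard library's own sample variance and standard deviation as a cross-check.

```python
from math import sqrt
from statistics import stdev, variance

# The five scores from the worked example.
scores = [30, 24, 16, 13, 10]

n = len(scores)                             # N = 5
sum_x = sum(scores)                         # sum of X = 93
sum_x_squared = sum(x * x for x in scores)  # sum of X squared = 2001

# Raw-score formula: [N * sum(X^2) - (sum X)^2] / [N * (N - 1)]
sample_variance = (n * sum_x_squared - sum_x ** 2) / (n * (n - 1))
sample_sd = sqrt(sample_variance)

print("variance:", round(sample_variance, 2))
print("standard deviation:", round(sample_sd, 2))

# Cross-check against the library versions, which also divide by N - 1.
print("library variance:", round(variance(scores), 2))
print("library standard deviation:", round(stdev(scores), 2))
```

Both routes divide by N - 1, so they describe a sample rather than a population.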

Score Distributions:

We can graph test scores in the form of a curve. The form of the curve depends on the way the scores are distributed.  The most common types of curves in data sets at this level are the normal curve and the skewed curve.

Normal Curve

 If we were to administer a test to a large population and graph the scores, we would most likely see a curve that is similar to a normal curve (bell shaped, with known properties).

Characteristics of a normal curve

1.      symmetrical

2.      mean, median and mode are identical

3.      the area under the curve equals 100%

4.      on the baseline the mean is placed in the center and 3 marks are placed an equal distance apart on each side to represent three s.d.'s above and below the mean; this divides the curve into 6 parts.

5.      The curve never touches the baseline. Thus scores four or more s.d.'s above or below the mean are possible.

What we can learn from a bell-shaped curve.

1.      84.1% of the scores fall at or below one standard deviation above the mean.

2.      97.7% fall at or below 2 s.d.'s above the mean.

3.      99.9% fall at or below 3 s.d.'s above the mean.

Let's say we have a mean of 35 on a test and an s.d. of 8.  We can say that at a score of 43, or one s.d. above the mean, 84% of the students scored at that level or below.  Or if we subtract and get a score of 27, or one s.d. below the mean, we would report that 16% of the students scored at or below this level.
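These percentages can be checked with Python's `statistics.NormalDist`. This is a minimal sketch that assumes the test scores follow an exact normal curve with the mean of 35 and s.d. of 8 from the example; real class data will only approximate it.

```python
from statistics import NormalDist

# Normal curve with the mean and s.d. from the example in the notes.
test_scores = NormalDist(mu=35, sigma=8)

# cdf(x) is the proportion of scores at or below x.
at_or_below_43 = test_scores.cdf(43)  # one s.d. above the mean
at_or_below_27 = test_scores.cdf(27)  # one s.d. below the mean

print(f"at or below 43: {at_or_below_43:.1%}")
print(f"at or below 27: {at_or_below_27:.1%}")

# The same idea on the standard normal curve (mean 0, s.d. 1).
standard = NormalDist()
for k in (1, 2, 3):
    print(f"at or below {k} s.d. above the mean: {standard.cdf(k):.1%}")
```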

A bell-shaped curve is associated with a normal curve.  It represents most scores being in the middle of the distribution.  The two ends are called the tails. In a normal curve most students obtain scores close to the average. The fewest scores are at the ends, representing the high and low scores. In most classroom settings you normally won't find a normal curve.

Usually we will find that most students do either well or poorly.  In this case the curve will form a skewed distribution. In such a curve the mean, median and mode will not be equal.

When the scores are mostly low, a curve is said to be positively skewed.

This is because the majority of the scores fall in the lower part of the distribution, or to the left. The few high scores produce a longer tail on the right, with little or no tail on the left.  The curve is not symmetrical.

If the curve is negatively skewed, the majority of the scores fall in the upper part of the distribution, or to the right.  The few low scores produce a longer tail on the left, with little or no tail on the right.  Again, the curve is not symmetrical. This type of skew is wanted when a mastery test is given, because it shows that the majority of the students mastered the material.

What is the best way to describe these distributions?

Should you use a mean and standard deviation or median and interpercentile range?

It depends: we should use the following information to make that decision.

1.      place in a frequency distribution

2.      graph as a frequency polygon

3.      examine graph and decide if it approaches a normal curve

  1. if it does, use the mean and standard deviation
  2. if it doesn't, use the median and the interpercentile range.

Why? If the scores are skewed, the mean is pulled toward the skewed tail of the curve.

            It is more affected by the extreme scores in the tail.

            In this case the median better represents the center.

We use the interpercentile range as the measure of variability because it is the measure that corresponds to the median.
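The decision steps above can be tried out on a small example. The scores below are hypothetical, invented only to illustrate a positively skewed set, and the text histogram is a rough stand-in for a frequency polygon.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical scores (not from the notes): mostly low, with a few high ones.
scores = [10, 11, 12, 12, 13, 13, 13, 14, 15, 22, 30, 40]

# Step 1: place the scores in a frequency distribution.
frequency = Counter(scores)

# Step 2: a crude text graph, one star per occurrence of each score.
for score in sorted(frequency):
    print(f"{score:>3} | {'*' * frequency[score]}")

# Step 3: compare the two measures of center.
center_mean = mean(scores)
center_median = median(scores)

print("mean:", round(center_mean, 2))
print("median:", center_median)
```

The three high scores pull the mean above the median, which is why the median is the better description of the center for a set like this one.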

Standard Scores

A standard score is one that is standardized by taking the deviation of the score from the mean and dividing it by the standard deviation. When using the median and interpercentile range to interpret scores, use percentile ranks to convert them. When using the mean and s.d., use a standard score transformation.

Two types of score transformations

z-score transformation and T-score transformation.

To use the z-score transformation, which expresses a score in standard deviation units, use the following formula (3-6):

$$ z = \frac{X - \bar{X}}{s} $$

What it does:

It forms a distribution with fixed parameters.

In this distribution, the mean is 0 and the standard deviation is 1.

See figure 3-6.

The scores above the mean are positive; those below the mean are negative.  It doesn't matter what unit of measure the test uses.

This allows you to compare scores that normally could not be compared, e.g. fitness tests where a time can be compared to a distance.

Certain scores are easily determined even without the use of a formula.

 For example, if the mean was 30 and the s.d. was 5, a score of 35 would be a z-score of 1 and a score of 25 would be a z-score of -1.
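The z-score transformation is short enough to sketch directly. The function name `z_score` is illustrative, and the run and jump figures are hypothetical values chosen only to show how a time and a distance end up on the same scale.

```python
def z_score(score, mean, sd):
    """Return how many standard deviations `score` lies from `mean`."""
    return (score - mean) / sd


# The example from the notes: mean 30, s.d. 5.
z_high = z_score(35, mean=30, sd=5)
z_low = z_score(25, mean=30, sd=5)
print("z for 35:", z_high)
print("z for 25:", z_low)

# Hypothetical fitness tests measured in different units.
run_z = z_score(480, mean=500, sd=40)   # run time in seconds
jump_z = z_score(190, mean=175, sd=10)  # jump distance in centimeters
print("run z:", run_z)
print("jump z:", jump_z)
```

One caution when comparing the two: on a timed event a lower score is the better performance, so a negative z there indicates an above-average result.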

What is an extreme score called in statistics?

Extreme values, also known as outliers, are values that lie far from the other observations in the data.

What is the term for a distribution that is significantly distorted?

Skewness is a measure of how far a distribution departs from symmetry. It shows up on a bell curve when data points are not distributed symmetrically to the left and right of the median.

Which measure is used when there are extreme scores in the distribution?

Median is the preferred measure of central tendency when: There are a few extreme scores in the distribution of the data.

Which of the following is sensitive to extreme scores in a distribution?

The mean is more sensitive to the existence of outliers than the median or mode.