Statistics for the Behavioral Sciences
Lesson 4 Measures of Central Tendency
Roger N. Morrissette, PhD
I. Measures of Central Tendency
Measures of Central Tendency are scores that represent the center of a distribution of data. They consist of the Mean, Median, and Mode. We will calculate all three of these statistics of Central Tendency for both Raw Data and Grouped Frequency Data.
II. The Mean
The Mean is the most commonly used measure of central tendency.
A. Raw Score Calculation:
X bar is the symbol for the sample mean. The symbol for "the sum of" is the Greek symbol sigma. To calculate the sample mean for raw data, you take the sum of all the raw scores in the sample and divide it by the total number of raw scores in the sample.
$$\bar{X} = \frac{\sum X}{n}$$
The formula reads: X bar equals the sum of X (or the sum of all the raw scores in the sample) divided by n (the sample size or number of raw data scores in the sample).
Mu is the symbol for the population mean. To calculate the population mean for raw data, you take the sum of all the raw scores in the population and divide it by the total number of raw scores in the population.
$$\mu = \frac{\sum X}{N}$$
The formula reads: Mu equals the sum of X (or the sum of all the raw scores in the population) divided by N (the population size or number of raw data scores in the population).
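The raw score calculation can be sketched in a few lines of Python. The scores below are hypothetical and serve only to illustrate the arithmetic; the helper name `mean` is our own choice.

```python
# Hypothetical raw scores, used only for illustration.
scores = [4, 8, 15, 16, 23, 42]

def mean(raw_scores):
    """Sum of X divided by the number of scores."""
    return sum(raw_scores) / len(raw_scores)

sample_mean = mean(scores)
print(sample_mean)  # 18.0
```

The same arithmetic gives the population mean when the list holds every score in the population, in which case the count is N rather than n.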
B. Grouped Frequency Data Calculation:
When you do not have raw data but instead have only Grouped Frequency Data, as is shown in the table below, the calculation of the mean is a bit different.
| Apparent Limits | Frequency |
| 81-90 | 5 |
| 71-80 | 3 |
| 61-70 | 12 |
| 51-60 | 16 |
| 41-50 | 33 |
| 31-40 | 21 |
| 21-30 | 15 |
| 11-20 | 7 |
| Total | 112 |
The formula for calculating the Mean for Grouped Frequency Data is:
$$\bar{X} = \frac{\sum (f \times \text{midpoint})}{n}$$
The formula reads: X bar equals the sum of each interval's frequency (f) times its midpoint, divided by n.
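To solve the formula we first make a new column and calculate the midpoints. The midpoint of an interval is the average of its lower and upper limits, for example (81 + 90) / 2 = 85.5.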
| Apparent Limits | Frequency | Midpoints |
| 81-90 | 5 | 85.5 |
| 71-80 | 3 | 75.5 |
| 61-70 | 12 | 65.5 |
| 51-60 | 16 | 55.5 |
| 41-50 | 33 | 45.5 |
| 31-40 | 21 | 35.5 |
| 21-30 | 15 | 25.5 |
| 11-20 | 7 | 15.5 |
| Total | 112 | |
Then we generate another column of data that represents the multiplication of the frequency of each interval by its midpoint.
| Apparent Limits | Frequency | Midpoints | Frequency x Midpoints |
| 81-90 | 5 | 85.5 | 427.5 |
| 71-80 | 3 | 75.5 | 226.5 |
| 61-70 | 12 | 65.5 | 786 |
| 51-60 | 16 | 55.5 | 888 |
| 41-50 | 33 | 45.5 | 1501.5 |
| 31-40 | 21 | 35.5 | 745.5 |
| 21-30 | 15 | 25.5 | 382.5 |
| 11-20 | 7 | 15.5 | 108.5 |
| Total | 112 | | |
We then sum the values in the Frequency x Midpoint column to get the numerator of the equation below.
| Apparent Limits | Frequency | Midpoints | Frequency x Midpoints |
| 81-90 | 5 | 85.5 | 427.5 |
| 71-80 | 3 | 75.5 | 226.5 |
| 61-70 | 12 | 65.5 | 786 |
| 51-60 | 16 | 55.5 | 888 |
| 41-50 | 33 | 45.5 | 1501.5 |
| 31-40 | 21 | 35.5 | 745.5 |
| 21-30 | 15 | 25.5 | 382.5 |
| 11-20 | 7 | 15.5 | 108.5 |
| Total | 112 | | 5066 |
Finally we take this sum and divide it by the overall sample size (which is the sum of all the frequencies or in this case n = 112).
5066 / 112 = 45.23, which is the Mean of the Grouped Frequency Distribution.
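The steps above can be sketched in Python as a check on the arithmetic. The tuple layout (lower apparent limit, upper apparent limit, frequency) and the variable names are our own assumptions; the numbers come from the table.

```python
# (lower apparent limit, upper apparent limit, frequency) for each interval,
# copied from the grouped frequency table.
intervals = [
    (81, 90, 5), (71, 80, 3), (61, 70, 12), (51, 60, 16),
    (41, 50, 33), (31, 40, 21), (21, 30, 15), (11, 20, 7),
]

# n is the sum of all the frequencies.
n = sum(f for _, _, f in intervals)

# Numerator: sum of frequency times midpoint, where the midpoint is the
# average of the interval's lower and upper limits.
sum_f_mid = sum(f * (lo + hi) / 2 for lo, hi, f in intervals)

grouped_mean = sum_f_mid / n
print(n, sum_f_mid, round(grouped_mean, 2))  # 112 5066.0 45.23
```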
III. The Median
A. Raw Score Calculation:
The median is the middle score of a ranked distribution of raw scores. It is the value where half the scores fall above and half the scores fall below. For an even number of scores, the median is the average of the two middle scores. For small distributions the calculation is fairly easy, as is shown below for this small set of raw data:
10 23 2 34 17 5 3 12 43 25 44 17 7 8
The first step is to rank order all of the raw scores:
2 3 5 7 8 10 12 17 17 23 25 34 43 44
Then count the total number of scores (in this case n = 14) and divide by two (7). Now count in by the number you just calculated from both ends of your distribution until you find the middle score or scores.
2 3 5 7 8 10 12 17 17 23 25 34 43 44
The median falls between 12 and 17, so add the two together and divide by 2 to find the actual median: (12 + 17) / 2 = 14.5
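As a rough sketch, the same procedure in Python looks like this. The variable names are our own; `statistics.median` from the standard library is used only as a cross-check.

```python
import statistics

raw_scores = [10, 23, 2, 34, 17, 5, 3, 12, 43, 25, 44, 17, 7, 8]

ranked = sorted(raw_scores)  # step 1: rank order the scores
n = len(ranked)              # step 2: count the scores
half = n // 2

if n % 2 == 0:
    # Even n: average the two middle scores (list indexes start at 0).
    median = (ranked[half - 1] + ranked[half]) / 2
else:
    # Odd n: the single middle score.
    median = ranked[half]

print(median)  # 14.5
assert median == statistics.median(raw_scores)
```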
For large distributions you can use the following formulas and rank only half of the raw scores:
For a raw score distribution with a total sample size that is odd use:
$$\text{Median} = \text{the } \left(\frac{n + 1}{2}\right)\text{th ranked score}$$
The formula reads: For a large distribution with an odd number of scores, the median is the score whose rank position equals the sample size plus 1, divided by two.
For example: If you have a distribution of 101 scores, the median is the score at position (101 + 1) / 2 = 51. If you rank order the 101 scores and count in to the 51st score, that will be your median.
For a raw score distribution with a total sample size that is even use:
$$\text{Median} = \frac{X_{(n/2)} + X_{((n+2)/2)}}{2}$$
The formula reads: For a large distribution with an even number of scores, find the score at the position equal to the sample size divided by two, and the score at the position equal to the sample size plus 2, divided by two. Add these two scores together and divide their sum by 2. (Here X with a subscript in parentheses stands for the score at that position in the ranked distribution.)
For example: If you have a distribution of 102 scores, the two middle positions are 102 / 2 = 51 and (102 + 2) / 2 = 52. The average of these positions is (51 + 52) / 2 = 51.5, so the median lies halfway between the 51st and 52nd scores. If you rank order the 102 scores and count in to the 51st and 52nd scores, the median equals the average of those two scores.
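The two position formulas can be sketched as a small Python helper. The function name `median_positions` is hypothetical; it returns rank positions counted from 1, not the scores themselves.

```python
def median_positions(n):
    """Return the 1-based rank position(s) of the middle score(s)."""
    if n % 2 == 1:
        # Odd n: a single middle position, (n + 1) / 2.
        return ((n + 1) // 2,)
    # Even n: two middle positions, n / 2 and (n + 2) / 2.
    return (n // 2, (n + 2) // 2)

print(median_positions(101))  # (51,)
print(median_positions(102))  # (51, 52)
```

For the 14 raw scores used earlier, the helper returns positions 7 and 8, which hold the scores 12 and 17.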
B. Grouped Frequency Data Calculation:
To calculate the Median of a Grouped Frequency Distribution you have to generate a Frequency Distribution table that has real limits, apparent limits, frequency, and cumulative frequency. The formula to use is given below:
$$\text{Median} = L + \left(\frac{\frac{n}{2} - CF_b}{F_i}\right) \times i$$
where:
L = the lower real limit of the interval that contains the median
n = the sample size
CFb = the cumulative frequency in the interval below the one that contains the median
Fi = the frequency in the interval that contains the median
i = the interval size
The formula reads: The Grouped Frequency Median equals the lower real limit of the interval that contains the median, plus the following quantity: the sample size divided by two, minus the cumulative frequency in the interval below the one that contains the median, with this difference divided by the frequency in the interval that contains the median and then multiplied by the interval size.
The first step in solving this formula is to generate a Frequency Distribution table that has real limits, apparent limits, frequency, and cumulative frequency like the one shown below:
| Real Limits | Apparent Limits | Frequency | Cumulative Frequency |
| 299.5-324.5 | 300-324 | 10 | 1000 |
| 274.5-299.5 | 275-299 | 25 | 990 |
| 249.5-274.5 | 250-274 | 69 | 965 |
| 224.5-249.5 | 225-249 | 146 | 896 |
| 199.5-224.5 | 200-224 | 247 | 750 |
| 174.5-199.5 | 175-199 | 206 | 503 |
| 149.5-174.5 | 150-174 | 147 | 297 |
| 124.5-149.5 | 125-149 | 104 | 150 |
| 99.5-124.5 | 100-124 | 32 | 46 |
| 74.5-99.5 | 75-99 | 14 | 14 |
The second VERY IMPORTANT step is to determine the interval that contains the median. CAUTION: This interval is not necessarily the interval in the middle of the distribution. To calculate the interval that contains the median you need to start with the sample size and divide it by 2. In our example above, n = 1000. So 1000 / 2 = 500. Next we must determine the interval that has the 500th score since that is the middle or median score. The best way to do this is to use the cumulative frequency column. Remember that cumulative frequency adds the frequency of raw scores as it goes up. The first interval (74.5-99.5) contains the first 14 scores, scores 1 to 14. The second interval contains scores 15 to 46 and so on. The fifth interval from the bottom (174.5-199.5) contains scores 298 to 503 so the 500th score is in this interval. This makes this interval (174.5-199.5) the one that contains the median.
The next step is to find all the values in the formula and place them into the formula.
L = the lower real limit of the interval that contains the median
As the table shows, the lower real limit in the interval that contains the median = 174.5.
n = the sample size which we already know is 1000
CFb = the cumulative frequency in the interval below the one that contains the median
As the table shows, the cumulative frequency in the interval below the one that contains the median = 297
Fi = the frequency in the interval that contains the median
As the table shows, the frequency in the interval that contains the median = 206
i = the interval size
As the table shows, the interval size = 25
The final step, of course, is to plug all of these values into the formula and solve it:
$$\text{Median} = 174.5 + \left(\frac{\frac{1000}{2} - 297}{206}\right) \times 25$$
1. 1000 / 2 = 500
2. 500 - 297 = 203
3. 203 / 206 = 0.985
4. 0.985 x 25 = 24.625
5. 24.625 + 174.5 = 199.125
The Median for this Grouped Frequency Distribution is 199.125.
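The whole procedure, including the search for the interval that contains the median, can be sketched in Python. The tuple layout and variable names are our own assumptions. Note that the code does not round 203 / 206 to 0.985 before multiplying, so it gives about 199.136 rather than the 199.125 obtained from the rounded hand calculation.

```python
# (lower real limit, upper real limit, frequency), lowest interval first,
# copied from the frequency distribution table.
intervals = [
    (74.5, 99.5, 14), (99.5, 124.5, 32), (124.5, 149.5, 104),
    (149.5, 174.5, 147), (174.5, 199.5, 206), (199.5, 224.5, 247),
    (224.5, 249.5, 146), (249.5, 274.5, 69), (274.5, 299.5, 25),
    (299.5, 324.5, 10),
]

n = sum(f for _, _, f in intervals)
half = n / 2  # position of the median score

cf_below = 0  # CFb: cumulative frequency below the current interval
for lower, upper, f in intervals:
    if cf_below + f >= half:
        # This interval holds the (n / 2)th score, so apply the formula:
        # L + ((n / 2 - CFb) / Fi) * i
        grouped_median = lower + (half - cf_below) / f * (upper - lower)
        break
    cf_below += f

print(lower, cf_below, f, upper - lower)  # 174.5 297 206 25.0
print(round(grouped_median, 3))           # 199.136
```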
IV. The Mode
A. Raw Score Calculation:
The Mode is the most frequent score in a distribution. Although you do not have to rank order these raw scores to determine the mode:
10 23 2 34 17 5 3 12 43 25 44 17 7 8
Rank ordering makes the mode stand out:
2 3 5 7 8 10 12 17 17 23 25 34 43 44
There is only one raw score that appears twice. Therefore, the mode of this raw score distribution is 17. It is the most frequently occurring score.
If a distribution has one mode it is said to be unimodal. If it has two modes it is bimodal, and if it has more than two modes it is said to be multimodal. It is also possible for a distribution to not have any mode.
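A minimal Python sketch of the raw score mode, assuming Python 3.8 or later for `statistics.multimode`, which returns every most frequent score as a list:

```python
import statistics

raw_scores = [10, 23, 2, 34, 17, 5, 3, 12, 43, 25, 44, 17, 7, 8]

# multimode returns a list, so unimodal data gives a one-item list.
modes = statistics.multimode(raw_scores)
print(modes)  # [17]

# A hypothetical bimodal example: two scores tie for most frequent.
print(statistics.multimode([1, 1, 2, 2, 3]))  # [1, 2]
```

One caveat: when every score occurs exactly once, `multimode` returns all of the scores, whereas this lesson would describe such a distribution as having no mode.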
B. Grouped Frequency Data Calculation:
For a Grouped Frequency Distribution, the mode is the midpoint of the interval with the highest frequency. In the Frequency Distribution table below, the interval 41-50 has the highest frequency (33):
| Apparent Limits | Frequency | Midpoints |
| 81-90 | 5 | 85.5 |
| 71-80 | 3 | 75.5 |
| 61-70 | 12 | 65.5 |
| 51-60 | 16 | 55.5 |
| 41-50 | 33 | 45.5 |
| 31-40 | 21 | 35.5 |
| 21-30 | 15 | 25.5 |
| 11-20 | 7 | 15.5 |
| Total | 112 | |
The mode of this Grouped Frequency Distribution is 45.5.
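The grouped mode can be sketched in Python as follows. The tuple layout and names are our own assumptions; the numbers come from the table. If two intervals tied for the highest frequency, `max` would return only the first one listed, so a tie would need separate handling.

```python
# (lower apparent limit, upper apparent limit, frequency) for each interval.
intervals = [
    (81, 90, 5), (71, 80, 3), (61, 70, 12), (51, 60, 16),
    (41, 50, 33), (31, 40, 21), (21, 30, 15), (11, 20, 7),
]

# Pick the interval with the highest frequency, then take its midpoint.
lo, hi, f = max(intervals, key=lambda row: row[2])
grouped_mode = (lo + hi) / 2

print((lo, hi), f, grouped_mode)  # (41, 50) 33 45.5
```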
V. Skew
Skew refers to the general shape of a distribution when it is graphed. There are three basic types of skewing that can occur with any data set:
A Normal Distribution (discussed at length in Chapter 7) or Bell-Shaped Curve is said to have No Skew. The distribution is symmetrical. In a normal distribution, the mean, median, and mode are all the same. In this case it is best to use the mean as the measure of central tendency. The figure below represents a normal distribution curve with no or zero skew:

A Positively Skewed distribution of data is not symmetrical. The tail of the distribution goes toward the positive end of the curve. In a positively skewed distribution, the mean, median, and mode are not the same. In this case it is best to use the median or the mode as the measure of central tendency. The figure below represents a positively skewed distribution curve:
A Negatively Skewed distribution of data is also not symmetrical. The tail of the distribution goes toward the negative end of the curve. In a negatively skewed distribution, the mean, median, and mode are not the same. In this case it is best to use the median or the mode as the measure of central tendency. The figure below represents a negatively skewed distribution curve:
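To see why the three measures separate in a skewed distribution, here is a small Python sketch. The scores are hypothetical, chosen so that a single large score forms a tail toward the positive end.

```python
import statistics

# Hypothetical scores with a long tail toward the high (positive) end.
scores = [1, 2, 2, 2, 3, 4, 5, 9, 26]

mean = sum(scores) / len(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

print(mean, median, mode)  # 6.0 3 2
```

In this example the single extreme score of 26 pulls the mean up to 6.0, while the median (3) and the mode (2) stay close to where most of the scores lie.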