
Mean is meaningless

The most common mistake in interpreting data produced by Likert scales is generating
mean values for responses. I have ranted about this practice elsewhere, but here’s the
gist: To facilitate coding or save space on a questionnaire, we sometimes substitute
Likert-type data with numbers (e.g., Figure 4, top). However, these numbers are just
descriptive codes, devoid of numerical value, except in the sense that they can be ranked
(thanks, David, for pointing out my unhelpful wording!). Put differently, while a response
of Strongly Agree shows more agreement than Agree, a response of Strongly Agree (5)
does not show agreement that is five times stronger than Strongly Disagree (1). We could
just as easily have used colours, or any other symbol to show the same effect (e.g., Figure
4, bottom).

Figure 4 Science Fiction Attitude Survey


What all this means is that adding, multiplying or dividing Likert-type values does not
make mathematical sense. It also means that calculating mean values makes as much
sense as suggesting that ‘the average of a watermelon and two strawberries is an apple’.
Other metrics, such as the median or the mode, are more appropriate. It follows that one
should avoid statistical methods that rely on the mean: variability in a Likert scale should
be estimated using the Range and Inter-Quartile Range, but not the Standard Deviation.
Parametric tests, such as t-tests, must be avoided, and non-parametric tests such as
the Mann‐Whitney U-test, the Wilcoxon signed‐rank test and the Kruskal‐Wallis test
might be used in their place. For similar reasons, bar charts, rather than histograms, must
be used to present the data.
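
To give a feel for why a rank-based test suits Likert-type data, here is a minimal sketch of the U statistic behind the Mann‐Whitney test. It only compares which response outranks which, so the codes are never added or averaged. The two groups and the function name `mann_whitney_u` are hypothetical, made up for illustration, and the sketch stops at the statistic itself: it does not compute a p-value, for which a statistics package would be needed.

```python
from itertools import product

def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two samples of ordinal codes.

    U_a counts the pairs in which the value from group_a outranks the
    value from group_b; a tied pair counts as half.
    """
    u_a = 0.0
    for a, b in product(group_a, group_b):
        if a > b:
            u_a += 1.0
        elif a == b:
            u_a += 0.5
    # Every pair is counted exactly once, so the two statistics
    # always add up to len(group_a) * len(group_b).
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b

# Hypothetical responses, coded 1 (Strongly Disagree) to 5 (Strongly Agree).
group_a = [4, 5, 3, 4, 5]
group_b = [2, 3, 3, 1, 4]

u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)  # 22.0 3.0
```

A U value close to zero for one group (here, 3 out of a possible 25) means that its responses rank below the other group's in nearly every pairing.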

[NB. Some very well-designed Likert scales can, indeed, produce data that are suitable
for calculating means, or running statistical tests that rely on the mean. These scales are
the product of careful weighting and extensive testing across large numbers of
respondents. Unless one is an experienced statistician, one is advised to not follow their
example.]

HOW TO INTERPRET ORDINAL DATA
23 February 2014, by Achilleas

The following (slightly modified) question was posted as a comment here, but I felt that the
answer was too lengthy for the comments section.
Our questionnaire is composed of items with a 5 point scale, ranging from
“1=strongly disagree” to “5=strongly agree”. For example, we are trying to find
out if the respondents agree with [a topic]. The number of respondents who
‘strongly disagree’ are 2, those who ‘disagree’ are 9, those who ‘are undecided’
are 24, those who ‘agree’ are 18 and those who ‘strongly agree’ are 7. How do
I interpret this data?

For such data, I suggest that you calculate the median and Inter-Quartile Range (IQR) of
each item. The median (the number found exactly in the middle of the ordered distribution) is a
measure of central tendency: very roughly speaking, it shows what the ‘average’ respondent
might think, or the ‘likeliest’ response. The IQR is a measure of dispersion: it shows whether
the responses are clustered together or scattered across the range of possible responses.
You can find some instructions on how to calculate these metrics with SPSS on this
page (the procedure is the same for both). If you only have access to Excel, here are links to
a couple of videos demonstrating how to calculate the median and the IQR. For small
datasets, such as yours, it is easy to calculate the median and IQR manually. In the next two
sections, I shall show you how, using your example data.

Calculating the median


First, you arrange the numbers in order from smallest to largest, like this:

1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,[3,3],3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,
4,4,4,5,5,5,5,5,5,5

To compute the median, you then delete one number from each end of the line, and repeat
until you are left with just one number (or two that are the same). This ‘middle’ number is
your median. If you are left with two different numbers in the end, the median is half-way
between them. Using the data you provided, the median is 3, and I have marked the two
middle values (the 30th and 31st of the 60 responses) with square brackets to make them
stand out.
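
If you would rather let a computer do the deleting, the procedure can be sketched in a few lines of Python. The counts are the ones from the question; the helper name `median_by_trimming` is my own, chosen for illustration, and the standard library's `statistics.median` is included only as a cross-check.

```python
import statistics

# Response counts taken from the question (code -> number of respondents).
counts = {1: 2, 2: 9, 3: 24, 4: 18, 5: 7}

# Expand the counts into the ordered line of 60 individual responses.
responses = sorted(code for code, n in counts.items() for _ in range(n))

def median_by_trimming(ordered):
    """Delete one value from each end until one or two values remain."""
    remaining = list(ordered)
    while len(remaining) > 2:
        remaining = remaining[1:-1]
    # One value left: that is the median. Two left: take the half-way point.
    return sum(remaining) / len(remaining)

print(median_by_trimming(responses))  # 3.0
print(statistics.median(responses))   # 3.0
```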

Calculating the IQR


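The IQR is slightly more complicated, but not too hard. Your starting point will be the same
arrangement of responses that we used above. When you divide this line into four equal
parts, the ‘cut-off’ points are called quartiles. Below, each pair of square brackets encloses
one quarter of your dataset (15 responses), and the quartiles lie at the boundaries between
adjacent groups.

[1,1,2,2,2,2,2,2,2,2,2,3,3,3,3]
[3,3,3,3,3,3,3,3,3,3,3,3,3,3,3]
[3,3,3,3,3,4,4,4,4,4,4,4,4,4,4]
[4,4,4,4,4,4,4,4,5,5,5,5,5,5,5]

The first quartile (Q1) falls between the first and second groups, where the values on both
sides are 3; the third quartile (Q3) falls between the third and fourth groups, where the
values on both sides are 4. The IQR is the difference between the third and first quartiles.
In your example, this is: Q3 – Q1 = 4 – 3 = 1.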
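
The same splitting can be sketched in Python. This is a rough illustration rather than a general-purpose routine: the helper `quartiles_by_splitting` is a name I made up, and it assumes that the number of responses divides evenly by four, as your 60 responses do. Statistical packages use several slightly different quartile conventions, so results on other datasets may not match exactly; the standard library's `statistics.quantiles` is shown for comparison.

```python
import statistics

# Response counts taken from the question (code -> number of respondents).
counts = {1: 2, 2: 9, 3: 24, 4: 18, 5: 7}
responses = sorted(code for code, n in counts.items() for _ in range(n))

def quartiles_by_splitting(ordered):
    """Return Q1 and Q3 as the cut-offs between four equal parts.

    Assumes the number of responses is divisible by four.
    """
    n = len(ordered)
    if n % 4 != 0:
        raise ValueError("this sketch needs a sample size divisible by 4")
    quarter = n // 4
    # Each cut-off lies half-way between the last value of one part
    # and the first value of the next.
    q1 = (ordered[quarter - 1] + ordered[quarter]) / 2
    q3 = (ordered[3 * quarter - 1] + ordered[3 * quarter]) / 2
    return q1, q3

q1, q3 = quartiles_by_splitting(responses)
iqr = q3 - q1
print(q1, q3, iqr)                          # 3.0 4.0 1.0
print(statistics.quantiles(responses, n=4)) # [3.0, 3.0, 4.0]
```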

A relatively small IQR, as was the case above, is an indication of consensus. By contrast, larger
IQRs might suggest that opinion is polarised, i.e., that your respondents tend to hold strong
opinions either for or against this topic.

Reporting the data


When your findings suggest consensus, your write-up should focus on describing the median
(i.e., what most respondents seem to believe). One way to describe this is by writing
something like: “most respondents indicated agreement with the idea that… (Mdn=4, IQR=0)”.
By contrast, when opinion is polarised, your write-up should emphasise the dissonance of
opinion: the median is perhaps not so important. To help you understand this, consider a
hypothetical case where half of your respondents hate a new textbook, and half love it. If you
were to simply report that the respondents are, on average, undecided, that would be a
statistical distortion of the data. Here’s a possible way to report the data more accurately:
“Opinion seems to be divided with regard to… . Many respondents (N=28, 47%) expressed
strong disagreement or disagreement, but a roughly equal number (N=26, 43%) indicated that
they agreed or strongly agreed (Mdn=3, IQR=3).”
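
The figures in a report like this come from collapsing the five response options into two camps and expressing each as a share of all respondents. Here is a small sketch of that arithmetic. Please note that the counts are hypothetical: only the combined totals (28 and 26 out of 60) match the sample report, and the split within each camp is invented for illustration, as is the helper name `share`.

```python
# Hypothetical counts for a polarised item (code -> number of respondents).
# Only the combined totals (28 and 26 out of 60) match the sample report;
# the split within each camp is invented for illustration.
counts = {1: 18, 2: 10, 3: 6, 4: 14, 5: 12}

total = sum(counts.values())
disagree = counts[1] + counts[2]  # strongly disagree + disagree
agree = counts[4] + counts[5]     # agree + strongly agree

def share(n, total):
    """Format a count together with its percentage of all respondents."""
    return f"N={n}, {round(100 * n / total)}%"

print(f"Disagreed or strongly disagreed: {share(disagree, total)}")  # N=28, 47%
print(f"Agreed or strongly agreed: {share(agree, total)}")           # N=26, 43%
```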

A final caveat

One last thing: I would caution you against placing too much faith in findings that were
generated from a single Likert-type item. If at all possible, I’d try to cluster similar items
together and compare / merge their results. If the findings are broadly consistent, that gives
us confidence in them. If they are not, it might mean that one of the items did not function
properly (e.g., respondents may have been confused by the wording), and you may have to
discard it from the dataset.

ORDINAL SCALE
A scale on which data is shown simply in order of magnitude, since there is no standard
of measurement of differences: for instance, a squash ladder is an ordinal scale, since one
can say only that one person is better than another, but not by how much.
