In this post, I discuss how you can interpret ordinal data. These are data that have a clear rank order, such as the data produced by Likert scales. In the paragraphs that follow, I will discuss how to find the median and interquartile range, and I will also offer some suggestions for reporting your findings.
When writing this post, I will make some assumptions about you. I will assume, for example, that you are doing research in the humanities and social sciences. This means that I will avoid finer statistical nuance. This level of analysis should be enough if you’re doing a term paper or MA dissertation. However, if you are in a structured learning programme, you should clarify expectations with your tutors. I will also assume that you do not have access to a statistical package, such as SPSS, so I will not give you detailed instructions on how to input and process data; rather, I will just focus on the concepts involved. Lastly, I will be using Likert scale data to illustrate my points, but what I write is transferable to all kinds of ordinal data.
How to analyse Likert-scale data
What are ordinal data
Let’s assume that you have prepared a questionnaire, where respondents had to select among responses ranging from “strongly agree” to “strongly disagree”. For convenience, you have probably followed the established practice of replacing these responses with numbers: “1” for “strongly disagree”, “2” for “disagree” and so on. This questionnaire produces what we call ordinal data. Ordinal data can be ranked: we can say that “strongly agree” indicates more agreement than “agree” or “undecided”. But we cannot claim that four “strongly disagree” responses equal an “agree”. We need to be careful about which calculations are meaningful for such data and which are not.
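If you would rather do the coding step with a short script than by hand, the idea can be sketched in Python roughly as follows. The label wording, the 1 to 5 codes and the function name `encode` are my own assumptions for illustration; you should use whatever labels actually appear in your questionnaire.

```python
# Hypothetical five-point scale: labels and codes are assumptions for illustration.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "undecided": 3,
    "agree": 4,
    "strongly agree": 5,
}


def encode(responses):
    """Replace each text response with its numeric code."""
    return [LIKERT_CODES[r.strip().lower()] for r in responses]


coded = encode(["Agree", "strongly agree", "Undecided"])
print(coded)  # [4, 5, 3]
```

The codes only record rank order: 4 is higher than 3, so “agree” shows more agreement than “undecided”, but the numbers do not tell us how much more.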
Calculating central tendency and spread for ordinal data
So what kinds of analysis should we do? There are two types of statistical analysis, descriptive and inferential statistics. For most small-scale projects, where you just want to find out what respondents believe about a topic, descriptive statistics are enough. This involves, for example, finding the central tendency (what most respondents believe) and the spread / dispersion of the responses (how strongly respondents agree with each other). The table below shows how to estimate these.
| Type of data | Central tendency | Dispersion |
|---|---|---|
| Ordinal | Median | Inter-Quartile Range (IQR) |
Because Likert scales produce ordinal data, I suggest that you calculate the median and Inter-Quartile Range (IQR) of each item.
- The median (i.e., the number found exactly in the middle of the distribution) is a measure of central tendency: very roughly speaking, it shows what the ‘average’ respondent might think, in a way that makes sense for this type of data.
- The IQR is a measure of spread: it shows whether the responses are clustered together or scattered across the range of possible responses. It uses less of the information in the data than the standard deviation, which you may have heard about, but it is appropriate for ordinal data and good enough for our purposes.
You can find some instructions on how to calculate these metrics with SPSS on this page (the procedure is the same for both). If you only have access to Excel, here are links to a couple of videos demonstrating how to calculate the median and the IQR. For small datasets, it is easy to calculate the median and IQR manually. In the next two sections, I shall show how this can be done, using the example data. If you don’t want to read these, you can skip to the bottom, for some advice about how to report the findings.
Calculating the median
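First, you arrange the numbers in order from smallest to largest, like this:
1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5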
To compute the median, you then delete one number from each end of the line, and repeat until you are left with just one number (or two that are the same). This ‘middle’ number is your median. If you are left with two different numbers in the end, the median is half-way between them. This will produce a decimal (e.g., 2.5), which might seem odd, but that’s OK. In the example data, the two numbers left in the middle (the 30th and 31st of the 60 responses) are both 3, so the median is 3.
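For readers who are comfortable with a little Python, the trimming procedure described above could be sketched like this. I have rebuilt the example data used in this post from its frequencies (two 1s, nine 2s, twenty-four 3s, eighteen 4s and seven 5s); the function name `median_by_trimming` is just an illustrative choice, and the sketch assumes the list is not empty.

```python
import statistics

# Example responses, rebuilt from their frequencies (60 responses in total).
responses = [1] * 2 + [2] * 9 + [3] * 24 + [4] * 18 + [5] * 7


def median_by_trimming(values):
    """Remove one value from each end until one or two values remain."""
    ordered = sorted(values)
    while len(ordered) > 2:
        ordered = ordered[1:-1]
    # One value left: that is the median. Two left: take the half-way point.
    return sum(ordered) / len(ordered)


print(median_by_trimming(responses))  # 3.0
print(statistics.median(responses))   # 3.0, the standard library agrees
```

In practice you would simply call `statistics.median`; the hand-written version is only there to mirror the manual steps.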
Calculating the IQR
The IQR is slightly more complicated, but not too hard. Your starting point will be the same arrangement of responses that we used above. When you divide this line into four equal parts, the ‘cut-off’ points are called quartiles. In the dataset below, each pair of brackets holds one quarter of the responses, and the last value in each of the first three groups marks a quartile (Q1, Q2 and Q3).
[1,1,2,2,2,2,2,2,2,2,2,3,3,3,3] [3,3,3,3,3,3,3,3,3,3,3,3,3,3,3] [3,3,3,3,3,4,4,4,4,4,4,4,4,4,4] [4,4,4,4,4,4,4,4,5,5,5,5,5,5,5]
The IQR is the difference between the third and the first quartile. In the example, this is: Q3 – Q1 = 4 – 3 = 1.
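The same calculation can be sketched in Python with the standard library. Here I assume the example data used in this post (rebuilt from its frequencies) and rely on `statistics.quantiles`, which returns the three cut-off points when asked for four groups. Please note that textbooks and software packages use slightly different conventions for locating quartiles, so on other datasets the result may differ a little from a hand calculation.

```python
import statistics

# Example responses, rebuilt from their frequencies (60 responses in total).
responses = [1] * 2 + [2] * 9 + [3] * 24 + [4] * 18 + [5] * 7

# With n=4, quantiles() returns the three cut-off points: Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(responses, n=4)
iqr = q3 - q1

print(q1, q3, iqr)  # 3.0 4.0 1.0
```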
A relatively small IQR, as was the case above, is an indication of consensus. By contrast, larger IQRs might suggest that opinion is polarised, i.e., that respondents tend to hold strong opinions either for or against this topic.
Reporting your findings
When your findings suggest consensus, your write-up should focus on describing the median (i.e., what most respondents seem to believe). One way to describe this is by writing something like the following:
Most respondents indicated agreement with the idea that… (Mdn=4, IQR=0).
By contrast, when opinion is polarised, your write-up should emphasise the dissonance of opinion: the median is perhaps not so important. To help you understand this, consider a hypothetical case where half of your respondents hate a new textbook, and half love it. If you were to simply report that the respondents are, on average, undecided, that would be a statistical distortion of the data. Here’s a possible way to report the data more accurately:
Opinion seems to be divided about…. Many respondents (n=28, 47%) expressed strong disagreement or disagreement, but a roughly equal number (n=26, 43%) indicated that they agreed or strongly agreed (Mdn=3, IQR=3).
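If you want to produce the figures for such a write-up with a script, something along the following lines might work. The response frequencies below are invented for illustration (they are not real data); I chose them so that they reproduce the numbers in the sample sentence above. The grouping of codes 1 and 2 as ‘disagree’ and 4 and 5 as ‘agree’ assumes the usual five-point coding.

```python
import statistics

# Hypothetical polarised item (60 responses); frequencies invented for illustration.
responses = [1] * 16 + [2] * 12 + [3] * 6 + [4] * 12 + [5] * 14

total = len(responses)
disagree = sum(1 for r in responses if r <= 2)  # codes 1 and 2
agree = sum(1 for r in responses if r >= 4)     # codes 4 and 5

q1, _, q3 = statistics.quantiles(responses, n=4)
summary = (
    f"disagree: n={disagree}, {disagree / total:.0%}; "
    f"agree: n={agree}, {agree / total:.0%}; "
    f"Mdn={statistics.median(responses):g}, IQR={q3 - q1:g}"
)
print(summary)
# disagree: n=28, 47%; agree: n=26, 43%; Mdn=3, IQR=3
```

Notice how the median of 3 on its own would hide the fact that very few respondents actually chose the middle option, which is why the counts are worth reporting alongside it.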
A final caveat
One last thing: I would caution you against placing too much faith in findings that were generated from a single Likert-type item. Individual items are very sensitive to factors such as wording, sequencing and more, so you cannot be sure what they really show.
If at all possible, I’d try to cluster similar items together and compare / merge their results. Such groupings of items are called Likert scales, and they tend to be more robust. If the responses in a scale are broadly consistent, that should give you confidence that you are measuring something reliably. If they are not, it might mean that one of the items is not functioning properly (e.g., respondents may have been confused by the wording), and you may have to discard it from the dataset.
Before you go
If you arrived at this page while preparing for one of your student projects, I wish you all the best with your work.
Achilleas Kostoulas is an applied linguist and language teacher educator. He teaches at the Department of Primary Education at the University of Thessaly, Greece. Previous academic affiliations include the University of Graz, Austria, and the University of Manchester, UK (which is also where he was awarded his PhD). He has extensive experience teaching research methods in the context of language teacher education.
About this post: This post was originally written on 23 February 2013, in response to a question a student asked me. It was last revised on 26 April 2023.