How to use Likert scales effectively (Achilleas Kostoulas)

Questionnaire surveys using Likert scales are one of the most popular research designs in language education research. They are simple and straightforward, and when done properly, they can produce lots of very useful information about language teaching and learning. The problem is that, once you have administered the questionnaire, there’s no way of going back to the respondents and asking follow-up questions, or clarifying the wording of your questions.

To help avoid potential problems, in this post I discuss three topics that the students I advise sometimes find challenging. Specifically, we will look at how we can elicit information using Likert items, Likert scales and forced-choice selection.

Notepad, laptop and charts displaying quantitative data. Questionnaire surveys are simple, straightforward, and efficient, provided you avoid common mistakes. (Photo by Lukas on Pexels.com)

What are Likert items?

Likert items (and scales; see below) are very good at measuring constructs like beliefs and attitudes. A Likert item consists of a statement followed by a list of possible responses. The list is bivalent and symmetrical, and the responses are often anchored to numerical descriptors. Here’s an example:

Example 1
The next Doctor Who should be cast as a female role.
1=Strongly Agree, 2=Agree, 3=Not sure, 4=Disagree, 5=Strongly Disagree

In this example, the item consists of a statement and five response options. The list of options is bivalent: This means that it extends in the directions of both ‘agreement’ and ‘disagreement’. The responses are symmetrically arranged around a neutral value (‘not sure’). In some Likert items, the neutral value might not be explicitly stated (scroll down to see why), but the list is still symmetrical. Note, however, that scales with percentages, or scales ranging from ‘never’ to ‘always’ (while valid for other purposes) are not Likert scales.

Likert scales must be bivalent and symmetrical

Do Likert scales produce ordinal or interval data?

Some statisticians argue that response options such as the ones shown above are evenly spaced (equidistant), or that we can at least pretend that they are. This makes intuitive sense, and it is necessary for running a number of useful statistical tests. For instance, we can then calculate the mean, or weighted average, of the responses.

However, to do this, you would need to make three assumptions. Specifically, you must assume that:

psychological constructs can be measured with precision;
such precision can be linguistically mapped; and
all respondents will interpret the descriptors in a similar way.

All this seems a lot like wishful thinking to me, and I would rather adjust research methods to reality than vice versa.

It is much safer to treat the information that these scales produce as ordinal data for the purposes of analysis. This means that we can calculate the frequency of each response (e.g., how many people strongly agree), we can add responses (e.g., how many people express some form of agreement, whether it is strong or moderate), and we can calculate percentages. We can also calculate the central tendency using the median, and the spread of responses using the total and interquartile ranges (here’s how to do this). For most research in language education, this is quite enough.
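To make this concrete, here is a minimal Python sketch of such an ordinal summary. The response data are invented for illustration, and the function name `summarise` is my own; the codes follow Example 1 (1=Strongly Agree to 5=Strongly Disagree).

```python
from collections import Counter
from statistics import median, quantiles

# Hypothetical responses to Example 1 (not real data).
responses = [1, 2, 2, 2, 3, 4, 4, 5, 1, 2]

LABELS = {
    1: "Strongly Agree",
    2: "Agree",
    3: "Not sure",
    4: "Disagree",
    5: "Strongly Disagree",
}


def summarise(responses):
    """Summarise Likert-item responses using ordinal statistics only."""
    counts = Counter(responses)
    n = len(responses)
    # Frequency and percentage of each response option.
    frequencies = {code: counts.get(code, 0) for code in LABELS}
    percentages = {code: 100 * frequencies[code] / n for code in LABELS}
    # Adding responses: everyone who expressed some form of agreement.
    agreement = frequencies[1] + frequencies[2]
    # quantiles() interpolates between codes, so treat the IQR
    # as a rough indication of spread rather than an exact value.
    q1, _, q3 = quantiles(responses, n=4, method="inclusive")
    return {
        "frequencies": frequencies,
        "percentages": percentages,
        "agreement": agreement,
        "median": median(responses),
        "range": (min(responses), max(responses)),
        "iqr": q3 - q1,
    }


summary = summarise(responses)
for code, label in LABELS.items():
    print(f"{label}: {summary['frequencies'][code]} ({summary['percentages'][code]:.0f}%)")
print("Median:", summary["median"], "Range:", summary["range"], "IQR:", summary["iqr"])
```

Note that nothing here requires a mean: every figure can be reported without assuming that the response options are evenly spaced.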

Pocket calculator, pen and notes. Likert items produce data that are easy to both use and abuse (Photo by Pixabay on Pexels.com)

Likert items work best in groups

Like most quantitative methods, Likert items can efficiently generate lots of data. On the other hand, these data can be misleading, because the responses are very sensitive to the wording of the items.

For example, there is strong empirical evidence showing that support for free speech in the US is much higher when the questions contain the word ‘forbid’ rather than ‘not allow’ (a phenomenon known as the ‘forbid/allow asymmetry’). Even though ‘forbid’ and ‘allow’ are logical opposites, which means that ‘forbidding’ and ‘not allowing’ something should amount to the same thing, the two wordings elicit different responses: Participants generally object to ‘forbidding’ free speech, but they are less strongly opposed to ‘not allowing’ some forms of expression.

Likert items are very sensitive to wording

To moderate the effect of item wording, it is best to use several variants of the same item in a questionnaire, and derive a composite score from the responses. Here’s one way to do this:

Example 2

4. I enjoy science fiction shows.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree

12. I watch science fiction shows whenever I can.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree

16. I am a science fiction fan.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree

17. I dislike science fiction.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree

A cluster of such related items, which probe the same underlying construct, produces a Likert scale. The items that make up a Likert scale, by the way, don’t need to be presented together in your questionnaire. In fact, you might find it better to spread them out, so that their sequencing does not influence responses. In Example 2, I have used random numbers before each item, to simulate spread in a larger questionnaire.

Preparing a scale for analysis

To ensure that all the items in the Likert scale measure the same construct, it is necessary to pilot the scale with a relatively large number of participants. We can then calculate the internal consistency of the scale (using a metric called Cronbach’s alpha), and eliminate any items that do not work well. Looking at Example 2 (above), item 12 is suspect, because it seems to measure behaviour rather than attitudes. If piloting suggests that the scale works better without it, then we would have to remove the item from the final version of the questionnaire (or, depending on when we found out, we might have to remove the data it produced from our calculations).
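As a rough illustration, Cronbach’s alpha can be computed from pilot data with the standard formula: k / (k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below assumes that negatively worded items have already been reversed. The pilot data are invented and far too small for a real pilot, and the helper `alpha_if_item_deleted` is my own naming for the ‘does the scale work better without this item?’ check.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose so that each tuple holds all responses to one item.
    items = list(zip(*rows))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


def alpha_if_item_deleted(rows, index):
    """Alpha of the scale after dropping the item at position `index`."""
    reduced = [[s for i, s in enumerate(row) if i != index] for row in rows]
    return cronbach_alpha(reduced)


# Hypothetical pilot data: columns are items 4, 12, 16 and 17 (17 already reversed).
pilot = [
    [1, 3, 1, 1],
    [2, 1, 2, 2],
    [3, 4, 3, 3],
    [4, 2, 4, 4],
    [2, 4, 2, 1],
    [3, 1, 3, 4],
]

print("Alpha, all items:", round(cronbach_alpha(pilot), 2))
print("Alpha without item 12:", round(alpha_if_item_deleted(pilot, 1), 2))
```

In this made-up data set, alpha rises when item 12 is dropped, which is the kind of pattern that would justify removing it. Bear in mind that alpha relies on variances, so it treats the response codes as numbers rather than as ordered categories.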

Deriving a composite score

Deriving the score of a Likert scale involves three steps. First, we reverse any negatively worded items. Number 17, above, is negatively worded, so we will code ‘Strongly Disagree’ as 1, ‘Disagree’ as 2, and so on, so that a low score always indicates a positive attitude towards science fiction. Next, we remove from the scale any item that systematically generates different responses from the others (see previous paragraph). Finally, we add up the scores of the remaining items. If all four items are retained, this produces a number ranging from 4 to 16. Alternative techniques, like assigning each response a value from 0 to 3, or from -2 to +2, are also fine, but it is important to be transparent about what you did, so make sure you document every step of the process and report it in your ‘methods’ section.
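The steps can be sketched in a few lines of Python. This sketch assumes the coding printed in Example 2 (1=Strongly Agree to 4=Strongly Disagree), so reversing an item maps a raw code x to 5 - x. The names `REVERSED_ITEMS` and `composite_score`, and the sample respondent, are my own inventions.

```python
SCALE_MAX = 4          # four response options, coded 1 to 4
REVERSED_ITEMS = {17}  # negatively worded items in Example 2


def reverse(score, scale_max=SCALE_MAX):
    """Flip a code so that 1 becomes scale_max, 2 becomes scale_max - 1, etc."""
    return scale_max + 1 - score


def composite_score(answers, keep=(4, 12, 16, 17)):
    """answers: dict mapping item number to raw response code.

    `keep` lists the items retained after piloting; drop an item
    from it to exclude that item from the composite score.
    """
    total = 0
    for item in keep:
        raw = answers[item]
        total += reverse(raw) if item in REVERSED_ITEMS else raw
    return total


# Hypothetical respondent: a keen science fiction fan.
fan = {4: 1, 12: 2, 16: 1, 17: 4}

print(composite_score(fan))                    # all four items
print(composite_score(fan, keep=(4, 16, 17)))  # item 12 removed
```

Keeping the list of retained and reversed items in one place like this also makes it easier to report exactly what was done.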

Analyst holding a tablet, where quantitative data are displayed. When designing a questionnaire, how many options should you give respondents? (Photo by rawpixel.com on Pexels.com)

Using forced choice

Most commonly, Likert items contain five (or seven) options, which are arranged around a neutral response such as ‘neither agree nor disagree’. This beautifully symmetric format can give rise to the ‘central tendency bias’, which is what happens when participants systematically select the uncontroversial middle option. This might happen because of respondent fatigue, or sometimes it is a deliberate strategy by respondents who want to avoid committing to an opinion. Either way, such responses give very few usable insights, so we may want to discourage them.

One of the simplest ways to counteract the central tendency bias is to use scales with an even number of responses. In Example 2 (above) I used forced-choice (or ipsative) items. These are items from which the ‘neutral’ option has been removed, leaving an even number of options. This forces participants to either agree or disagree with the statement. Forced-choice items with a small number of responses are very effective in eliciting attitudes that participants might otherwise feel inclined to suppress.
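If you have piloted a version of the questionnaire that does include a neutral option, one rough way to see whether the central tendency bias is a problem is to check how often each respondent chose the midpoint. The following Python sketch assumes five-point items with 3 as the neutral code. The 0.8 threshold is an arbitrary assumption of mine, not an established cut-off, and the data are invented.

```python
def midpoint_share(responses, midpoint=3):
    """Proportion of a respondent's answers that fall on the neutral option."""
    return sum(1 for r in responses if r == midpoint) / len(responses)


def flag_midpoint_responders(data, midpoint=3, threshold=0.8):
    """Return the IDs of respondents who chose the midpoint
    on at least `threshold` of the items (threshold is an assumption)."""
    return [
        respondent
        for respondent, responses in data.items()
        if midpoint_share(responses, midpoint) >= threshold
    ]


# Hypothetical pilot responses to five five-point items.
pilot_responses = {
    "r1": [3, 3, 3, 3, 3],
    "r2": [1, 2, 3, 4, 5],
    "r3": [3, 3, 3, 2, 1],
}

print(flag_midpoint_responders(pilot_responses))
```

A high share of flagged respondents would be one reason to consider switching to forced-choice items in the final version.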

About me

Achilleas Kostoulas teaches applied linguistics and language teacher education courses at the University of Thessaly in Greece. He holds a PhD and an MA in TESOL from the University of Manchester, UK, and a BA in English Studies from the University of Athens.

About this post

This post was originally written in February 2014. It was last revised in August 2025. The cover image is from Adobe Stock (with license); other images as stated. The content of the post does not represent the views of my past or current employers.