Designing Better Questionnaires: Using scales

This is the fourth in a series of five introductory blog posts on questionnaire design. In previous posts, I talked about question wording, possible bias, responses and sequencing of questions, and I discussed the demographics section of questionnaires. In this post, I will look into how Likert items, Likert scales, and forced choice can be used to elicit information from respondents.

“A volunteer fills out a questionnaire” (c) Plings [CC-attribution license]
Source: http://www.flickr.com/photos/plings/4453697565/

What are Likert items?

Likert items consist of a statement followed by a number of responses that are bivalent and symmetrical. These are often anchored to numerical descriptors, in the form of consecutive integers. Here’s an example:

Example 1
Matt Smith was the best Doctor to date.
1=Strongly Agree, 2=Agree, 3=Not sure, 4=Disagree, 5=Strongly Disagree

In the example above, the scale extends towards ‘agreement’ and ‘disagreement’, i.e., it is bivalent, and responses are symmetrically arranged around a neutral value (‘not sure’). Note that I have used ‘item’, rather than ‘scale’, for reasons that will become clearer later on.

Some statisticians argue that responses should be ‘evenly’ spaced, i.e., that the distance between any two adjacent options should be the same. This makes intuitive sense, and it is a necessary assumption for running a number of useful statistical tests. However, to make this assumption you would need to accept that attitudes can be measured with precision, that such precision can be linguistically mapped, and that all respondents will interpret the descriptors in a similar way. In my opinion, all this is just wishful thinking, so rather than worry about spacing responses evenly, I suggest that we treat the information these scales produce as ordinal data, i.e., adjust our research methods to reality, rather than vice versa.
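To make this more concrete, here is a minimal Python sketch of what treating responses as ordinal data might look like: it reports how often each option was chosen, along with the median category, instead of a mean. The labels come from Example 1; the response data and the function name summarise_ordinal are made up for illustration.

```python
from collections import Counter
from statistics import median_low

# Response codes from Example 1
LABELS = {
    1: "Strongly Agree",
    2: "Agree",
    3: "Not sure",
    4: "Disagree",
    5: "Strongly Disagree",
}

def summarise_ordinal(responses):
    """Return (frequency per label, label of the median response).

    median_low always returns a value that was actually observed, so the
    result is a real category rather than a point between two categories.
    """
    counts = Counter(responses)
    distribution = {LABELS[code]: counts.get(code, 0) for code in sorted(LABELS)}
    return distribution, LABELS[median_low(responses)]

# Hypothetical responses from ten participants to the item in Example 1
responses = [1, 2, 2, 3, 5, 2, 4, 1, 2, 5]
distribution, median_label = summarise_ordinal(responses)
print(distribution)
print(median_label)  # Agree
```

Notice that nothing here requires the options to be equidistant: counting and ranking only assume that the options are ordered.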

Likert items work best in groups

Like most quantitative methods, Likert items can efficiently generate lots of data; on the other hand, they are very sensitive to the wording of the statements in the questionnaire. To illustrate by means of a classic example: research going back to the 1940s shows that support for free speech in the US appears much higher when the question contains the word ‘forbid’ rather than ‘not allow’, even though the two phrasings are logically equivalent. That is, respondents seem to be against ‘forbidding’ certain forms of expression, but are not as strongly opposed to ‘not allowing’ them (a phenomenon known as the ‘forbid/allow asymmetry’). To mitigate the effect of item wording, it is best to use several variants of the same item in a questionnaire, and derive a composite score from the responses. Here’s one way to do this:

Example 2
4. I enjoy science fiction shows.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree
12. I watch science fiction shows whenever I can.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree
16. I am a science fiction fan.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree
17. I dislike science fiction.
1=Strongly Agree, 2=Agree, 3=Disagree, 4=Strongly Disagree

A cluster of such related items, which probe the same underlying construct, forms a Likert scale. The items that make up a Likert scale, by the way, don’t need to be clustered together. In fact, it may be advantageous to spread them out across the questionnaire, so that their sequencing does not influence responses.

To derive the score of a Likert scale, you would need to (a) reverse-score any negatively worded items, (b) remove from the scale any item that systematically generates different responses from the others, and (c) calculate the central tendency of the responses each participant provided. If you assumed that the options were equidistant, you might calculate the mean, but I strongly suggest using the median instead. Using the example above, let us assume that a participant responded with 1, 2, 1, 4. Item 17 is negatively worded, so its response is reversed (on a four-point item, a 4 becomes a 1). The adjusted responses are 1, 2, 1, 1, and their median is 1.
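Steps (a) and (c) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are hypothetical, responses are coded 1 to 4 as in Example 2, and step (b) is left out because deciding which items to drop needs data from many participants.

```python
from statistics import median

def reverse_score(response, n_options):
    """Mirror a response on an item coded 1..n_options (e.g. 4 -> 1 on a 4-point item)."""
    return n_options + 1 - response

def likert_scale_score(responses, negative_items, n_options=4):
    """Median of one participant's responses after reversing negatively worded items.

    responses: dict mapping item number -> coded response
    negative_items: set of item numbers that are negatively worded
    """
    adjusted = [
        reverse_score(value, n_options) if item in negative_items else value
        for item, value in responses.items()
    ]
    return median(adjusted)

# The participant from the worked example: items 4, 12, 16 and 17 of Example 2
participant = {4: 1, 12: 2, 16: 1, 17: 4}
score = likert_scale_score(participant, negative_items={17})
print(score)  # 1.0
```

With an even number of items, the median can fall halfway between two options (e.g. 1.5). If you would rather report a value that corresponds to an actual response option, statistics.median_low is one alternative.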

Using forced choice

Most commonly, Likert items contain five (or seven) options which are arranged around a neutral response such as ‘neither agree nor disagree’. This beautifully symmetric format can give rise to the ‘central tendency bias’, which is what happens when participants systematically select the uncontroversial middle option. Whether this is due to respondent fatigue, or constitutes a deliberate strategy to avoid expressing an opinion, such responses give very few usable insights, so we need to discourage them.
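If you already have data from odd-numbered items, one rough way to see whether this might be happening is to check how often each participant picked the midpoint. The sketch below is just that, a sketch: the function name, the participant IDs and the responses are all invented, and a high share is a prompt to look more closely, not proof of bias.

```python
def midpoint_share(responses, n_options=5):
    """Fraction of a participant's responses that fall on the neutral midpoint."""
    if n_options % 2 == 0:
        raise ValueError("an even-numbered item has no neutral midpoint")
    midpoint = (n_options + 1) // 2  # 3 on a five-point item, 4 on a seven-point item
    return sum(1 for r in responses if r == midpoint) / len(responses)

# Hypothetical responses from two participants to eight five-point items
answers = {
    "P01": [3, 3, 3, 3, 3, 3, 2, 3],
    "P02": [1, 4, 2, 5, 3, 2, 1, 4],
}
shares = {pid: midpoint_share(r) for pid, r in answers.items()}
print(shares)  # {'P01': 0.875, 'P02': 0.125}
```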

One of the simplest ways to counteract the central tendency bias is to use scales with an even number of responses. In Example 2, above, I used forced-choice (or ipsative) items: these are items from which the ‘neutral’ option has been removed, thus forcing participants to either agree or disagree with the statement. Forced-choice items with a small number of responses are very effective in eliciting attitudes that participants might otherwise feel inclined to suppress.

In the next and final post in this series, I will discuss ways to make your questionnaire layout more effective. Till then!
