How to summarise Likert scale data using SPSS

This post will give you some advice about using SPSS to summarise data that were generated with a Likert scale. If you want to read up on Likert scales before you go on, you can find some information in this post.

Before we start

Why should you summarise Likert scale data

Elsewhere in this blog, I have written that a Likert scale might consist of several items that measure a similar underlying construct (a latent variable). For instance, if I want to measure people’s attitudes towards sweets, I might ask them to record what they think about the following statements:

1. I like chocolate	Strongly Agree	Agree	Disagree	Strongly Disagree
2. I like cookies	Strongly Agree	Agree	Disagree	Strongly Disagree
3. I like whipped cream	Strongly Agree	Agree	Disagree	Strongly Disagree

What is the latent variable here?

In order to interpret these data, we need to summarise the data in the scale. We can do this in two ways: adding the data or estimating the median. In this post, I will show you how to estimate the median, because this is slightly harder. The same steps can be modified to add up the data.

Using the same example as above, I need to create a new ‘super-variable’, which shows the median of items (1), (2) and (3) for each respondent.

My assumptions about you

I assume that you will already know how to define variables and values, how to toggle between the numerical expression and verbal descriptor of the values (i.e., you can make SPSS show responses as “strongly agree/agree/disagree/strongly disagree” or as “1/2/3/4”), and how to key in data. I will also assume that you have already established that your scale is internally consistent, so I will focus only on the technical aspects of merging the variables.

Here’s how to merge the Likert items

Starting out

Your starting point for summarising Likert scale data with SPSS will be a dataset similar to the one shown in Figure 1, below.

Fig. 1 SPSS screenshot showing responses to Likert-type items

When you have created the dataset by typing your data into SPSS, and after you have tested for the internal consistency of the scale (use Cronbach’s α), it’s time to create a new variable.
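If you would like to see what Cronbach’s α actually computes, here is a minimal sketch in Python rather than SPSS. The function name `cronbach_alpha` and the responses are made up for illustration, and the sketch assumes there are no missing values. It uses the usual formula: α = k/(k−1) × (1 − sum of item variances / variance of the total scores), where k is the number of items.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes every respondent answered every item (no missing values).
    """
    k = len(rows[0])                      # number of items in the scale
    item_columns = list(zip(*rows))       # one tuple of scores per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical responses to three items (1 = strongly agree ... 4 = strongly disagree)
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 3, 4],
    [2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

In SPSS itself you would normally get this figure from the reliability analysis menu; the sketch is only meant to show what goes into the number.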

Merging the variables

From the top menu bar in SPSS, select Transform -> Compute variable. You should now see the following dialogue box.

Fig. 2 SPSS screenshot showing four steps for combining Likert-type responses

Assign a name to the new variable (e.g., Sweets);
Scroll down the Function Group, and select Statistical;
From the functions that appear, select Median. [NB: it is possible to select the mean, but I don’t recommend it.] At this point, the following formula should appear in the numerical expression box: Median ( , )
Place the cursor in the brackets, select the variables you want to merge, and click on the arrow. Repeat with all the variables, separating them with commas.
Click on OK.
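The steps above can also be sketched outside SPSS. The following is a small Python illustration, not SPSS syntax: the variable names (`chocolate`, `cookies`, `cream`, `sweets`) and the responses are hypothetical, and I am assuming the coding 1 = strongly agree to 4 = strongly disagree. For each respondent it computes the median of the three items, which is what the Median function does row by row, and also the sum, for the ‘adding’ alternative.

```python
from statistics import median

# Hypothetical dataset: one dictionary per respondent
respondents = [
    {"chocolate": 1, "cookies": 2, "cream": 2},
    {"chocolate": 3, "cookies": 4, "cream": 4},
    {"chocolate": 2, "cookies": 1, "cream": 3},
]
items = ["chocolate", "cookies", "cream"]

for person in respondents:
    scores = [person[item] for item in items]
    person["sweets"] = median(scores)      # the new 'super-variable'
    person["sweets_sum"] = sum(scores)     # alternative: add the items up

sweets = [person["sweets"] for person in respondents]
sweets_sum = [person["sweets_sum"] for person in respondents]
print(sweets)
print(sweets_sum)
```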

Your new Likert scale

SPSS will automatically generate a new variable, which will appear at the end of your dataset. This will be in numerical form (1, 2, 3, …), but you can change it to a verbal descriptor for consistency (Figure 3). You can use this variable for descriptive statistics (e.g., estimate the central tendency and dispersion), cross-tabulations, correlations and so on…

Fig. 3 SPSS screenshot showing the combined scale (the new variable)

Now wasn’t that very easy?

Frequently Asked Questions

Over time, a lot of people have asked questions about Likert scales in the comments section of this post. I have collected the most common questions in this section.

There are decimal points in the median I calculated. Is that a problem?

If your median falls between two values, it will have a ‘half’ (e.g., 2.5, 4.5 etc.). This is normal. You can report the median as you see it.
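Here is a tiny Python illustration of where the ‘half’ comes from (the responses are made up). With an even number of items, the median is the midpoint of the two middle values; with an odd number, it is always one of the original values.

```python
from statistics import median

four_items = [2, 2, 3, 4]    # even number of items: middle values are 2 and 3
three_items = [2, 3, 4]      # odd number of items: middle value is 3

half_median = median(four_items)
whole_median = median(three_items)
print(half_median, whole_median)
```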

The data produced by Likert type items are, strictly speaking, ordinal data. That means that they can tell us how to rank responses (strongly agree is ‘more’ agreement than agree), but they do not give us information about the distance between them (strongly agree is not twice as much agreement as agree). Think of the medals in the Olympics: they can tell you if an athlete came first, second or third, but you cannot use them to calculate average speed. The median is a cruder statistic than the mean, because it does not take into account the ‘distance’ or ‘weighting’ of responses. In this case though, it is the best statistic we can legitimately use because this ‘distance’ is unknown.
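To make this more concrete, here is a rough Python sketch with made-up responses. It recodes the same answers with two sets of numbers that keep the categories in the same order but assume different ‘distances’ between them. The function name `summarise` and both codings are hypothetical. The median points to the same response category under both codings, whereas the mean changes with the numbers we happen to pick.

```python
from statistics import mean, median

# Hypothetical answers from five respondents, as category positions 1..4
responses = [1, 1, 2, 4, 4]

# Two codings with the same order of categories but different 'distances'
equal_steps = {1: 1, 2: 2, 3: 3, 4: 4}
uneven_steps = {1: 1, 2: 2, 3: 5, 4: 10}


def summarise(coding):
    """Return (category at the median, mean of the recoded numbers)."""
    recoded = [coding[r] for r in responses]
    number_to_category = {number: category for category, number in coding.items()}
    # With an odd number of responses the median is one of the recoded values,
    # so it can be mapped back to a category.
    return number_to_category[median(recoded)], mean(recoded)


median_equal, mean_equal = summarise(equal_steps)
median_uneven, mean_uneven = summarise(uneven_steps)
print(median_equal, mean_equal)
print(median_uneven, mean_uneven)
```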

OK, I did what you said, but what should I do next with my study?

It’s hard to answer such a question without knowing more about what you’re trying to find out (your research question) and your data. This is the kind of question that your advisor or mentor will be better qualified to answer.

Where can I find out more information about all this?

There are many statistics manuals you could read, if you want to follow up on the information in this post. My personal favourite is Andy Field’s Discovering Statistics Using SPSS. I have also written some more posts about quantitative research below, which you might find useful:

Likert scales: Four things you may not know

If you use quantitative methods in your research project, you may want to read this first.

How to interpret Ordinal Data: Median and Interquartile Range for Likert Scales

Let’s assume that you have prepared a questionnaire, where respondents had to select among responses ranging from “strongly agree” to “strongly disagree”. For convenience, you have probably followed the established practice of replacing these responses with numbers: “1” for “strongly agree”, “2” for “agree” and so on. How do you go about analysing these data?

How to use Likert scales effectively

Many questionnaires use Likert items & scales to elicit information about language teaching and learning. In this post, I discuss how to use these instruments effectively, by looking into the difference between items and scales, and explaining how to analyse the data that they produce.

Before you go

If you landed on this page while preparing for one of your student projects, I wish you all the best with your work. There’s a range of social sharing buttons below in case you feel like sharing this information among fellow students who might also find it useful. Also feel free to ask any other questions you may have, using the contact form.

Achilleas Kostoulas

Achilleas Kostoulas is an applied linguist and language teacher educator. He teaches at the Department of Primary Education at the University of Thessaly, Greece. Previous academic affiliations include the University of Graz, Austria, and the University of Manchester, UK (which is also where he was awarded his PhD). He has extensive experience teaching research methods in the context of language teacher education.

Comments

36 responses to “How to summarise Likert scale data using SPSS”

Sarah

14 May 2015

Thanks aloooooooot for ur help

Sergio Otero

9 October 2015

Thanks for the info… I have been looking for discussions about merging Likert scales, and also YouTube videos, but they all use means and do not explain why. This is the first time I have heard of using the median, which makes a lot of sense.

Shakku YaKamwiya

12 December 2017

Very useful ideas especially for post graduate students. Many thanks Achilleas.

Achilleas Kostoulas

14 June 2018

You could, if you can convincingly argue that the distance between the anchor points is equal. This link might help: http://www.statisticshowto.com/likert-scale-definition-and-examples/

Jenny

1 August 2018

Hi there Achilleas, thanks so much for your posts and helpful responses. I have a question in relation to your recent posts above regarding decimal results in the medians. If we accept the decimal results, does this not negate the reasons why we are using medians (rather than mean) for the ordinal data? The decimal median results are assuming the distance between each number is 1 (rather than being unknown and potentially variable).

Also, the only decimal I appear to have after computing the median is a .5 one; that seems a big jump to me to round it up to the next whole number (so as to make the data ordinal)!

Finally, I note that “mode” is not listed within the statistical functions in SPSS, I guess because it is not a calculation. I think I would feel most comfortable with working with a new variable that would create the mode value of a set of Likert-scale responses for each respondent. Do you know of a way that this is possible?

I would be grateful for your thoughts on these three matters.

Jenny :)

1. Achilleas Kostoulas
  
  1 August 2018
  
  Hi Jenny,
  
  Thanks for your comment. You’re quite right, the only decimal you can get when calculating the median is .5, which is going to happen occasionally if you have an even number of responses. In this case, you just report the decimal, and you should not round it up. As you correctly point out, this is not really a decimal in the same sense as it would be with the mean; rather it’s just a conventional way of showing where the central tendency lies.
  
  I’m not sure which version of SPSS you’re using, but have you checked under ‘frequencies’? I think that you should be able to find the mode there.
  
  Hope that helps, and good luck with your project!
  
  1. Jenny
    
    1 August 2018
    
    Thanks so much for this help and advice Achilleas :) Any idea of a source I could reference to back-up this advice? It’s like looking for the proverbial needle in a haystack when searching for such specific details in a big book (and even online). As much as I’d love to be able to cite your blog, I’m not sure how well my supervisor will mark me on that :///// sorry!
    
    P.S. I’m using SPSS 25.
    
    Thank you so much! :)
    Jenny
    
  2. Achilleas Kostoulas
    
    1 August 2018
    
    No worries about that, and I’d love to point you to some literature, but I’m out of office and don’t have access to my books, so I can’t be as specific as I would otherwise have been. I’m quite sure there will be something helpful in Andy Field’s Discovering Statistics, and in Daniel Muijs’ Doing Quantitative Research in Education, for what that’s worth.
    
  3. Jenny
    
    1 August 2018
    
    Dear Achilleas,
    
    Thanks very much indeed. A library near to me has those books, so I shall go fetch them. If I struggle to find a reference, I’ll come back to you. (I’m not filled with hope at the ease of finding back-up for such specific advice from looking briefly at the books’ contents from Amazon’s “look inside” feature (at least without reading the whole book) but hopefully when the whole book is in front of me, I will be proved wrong :)
    
    In the meantime…. please can you confirm that you think it is safe for me to proceed with non-parametric tests of my ordinal Likert data using a new MEDIAN Likert variable? I also plan to use the MEDIAN Likert variables from two different ordinal Likert scales to test any correlation between these two scales (i.e., using Spearman’s Rho).
    
    Many thanks once again Achilleas :)
    
    Jenny :)
    
  4. Achilleas Kostoulas
    
    2 August 2018
    
    It all seems quite reasonable! Good luck with the project :)
    
  5. Jenny
    
    2 August 2018
    
    Thank you Achilleas! I think I was over-worrying :)
    
Jenny

15 August 2018

Dear Achilleas,
It is Jenny here again – we already exchanged some messages recently. I now have Andy Field’s Discovering Statistics book and some other quantitative analysis books; however, I cannot find an explicit mention (for reference/citation purposes in my dissertation) that it is OK or advisable to compute a new median score variable for Likert data and to then use this in non-parametric tests. They are big books though, so without reading the whole book cover-to-cover, I cannot guarantee that I’m not missing anything.
Are you with your books yet and able to confirm your source for this recommendation?
Many thanks indeed and sorry to bother you again :)
Jenny :)

1. Achilleas Kostoulas
  
  15 August 2018
  
  Hi Jenny,
  
  I think you’ll find what you’re looking for in Muijs, D. (2004). Doing Quantitative Research in Education with SPSS. Thousand Oaks, CA: SAGE, pp. 99-100.
  
  Hope that helps :)
  
Jenny

15 August 2018

No worries anymore Achilleas! Typical – just found a source :) thank you!

1. Achilleas Kostoulas
  
  15 August 2018
  
  Super! Good luck with your submission!
  
dinhorocks

10 November 2018
and i have 167 respondent in total. my advisor is out of reach so i need your advice…i have computed data as you have shown above and crosstab between consumer behaviour and brand image

and output of SPSS as follows;

Consumerbehaviour * Brandimage Crosstabulation

                    Brandimage
Consumerbehaviour   SA    A    N    D   SD   Total
SA                   5    8    0    0    0      13
A                    5   18   30    0    2      55
N                    5   32   42    2    0      81
D                    0    0    1    6    0       7
SD                   0    5    0    2    4      11
Total               15   63   73   10    6     167

SA = Strongly Agree, A = Agree, N = Neutral, D = Disagree, SD = Strongly Disagree

Chi-Square Tests

                               Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square             153.983a   16   .000
Likelihood Ratio               95.235     16   .000
Linear-by-Linear Association   26.810     1    .000
N of Valid Cases               167

a. 19 cells (76.0%) have expected count less than 5. The minimum expected count is .25.
is this Okay? please help me sir…

1. Achilleas Kostoulas
  
  10 November 2018
  
  No, it doesn’t look good. You have too many categories and not enough data.
  
  If you notice, under the crosstab there’s a line that says that 76% of your cells have an expected count <5. This is too high and it skews your statistics.
  
  There are three options, now: (a) report the data as they are, and suggest that readers exercise caution in the interpretation; (b) get more data; or (c) consolidate your categories, by merging values like ‘agree’ and ‘strongly agree’.
  
  I am sorry your advisor is out of reach, but you should really talk to him or her, if they are statistically competent – this looks like it needs lots of help.
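
For readers who want to see where the 76% comes from, here is a rough sketch in Python (not SPSS syntax; the function names are my own). It takes the counts from the crosstab above, computes the expected count of each cell as row total × column total / N, and counts how many fall below 5. It then repeats the check after merging SA with A and D with SD, as one possible version of option (c). Whether that merge makes sense for a given study is a separate question.

```python
# Observed counts from the crosstab (rows: Consumerbehaviour, columns: Brandimage)
observed = [
    [5, 8, 0, 0, 0],     # SA
    [5, 18, 30, 0, 2],   # A
    [5, 32, 42, 2, 0],   # N
    [0, 0, 1, 6, 0],     # D
    [0, 5, 0, 2, 4],     # SD
]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def low_expected(table, threshold=5):
    """Return (cells below threshold, total cells, smallest expected count)."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(1 for e in cells if e < threshold), len(cells), min(cells)


def merge_categories(table, groups=((0, 1), (2,), (3, 4))):
    """Collapse rows and columns: SA+A, N, D+SD (an illustrative choice)."""
    merged_rows = [[sum(table[i][j] for i in g) for j in range(len(table[0]))]
                   for g in groups]
    return [[sum(row[j] for j in g) for g in groups] for row in merged_rows]


low_full = low_expected(observed)
low_merged = low_expected(merge_categories(observed))
print(low_full)
print(low_merged)
```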
  
Eva

7 March 2019

Hi,

I am doing a study with two independent groups – they are seeing different types of an advertisement

I want to know if there is difference between their cognitive responses.
For that I have 3 lists (for three different concepts of cognitive responses) each with 10 questions answered on a 7pt Likert scale. I have computed them together into three separate new variables. I want to know, if I don’t select the median like you do here, do I then get a sum of all the scores of each participant? Is it ok to do that and then use an independent samples T-test to compare the means of the two groups? Or should I select the median like you show and then use Wilcoxon Mann-Whitney to compare the responses to the advertisements between groups?

I also measured behavioral intentions with 12 questions also answered on a 7pt Likert scale. I want to know if I can predict intentions from the advertisement people saw and if the relationship can be explained by the cognitive responses. What statistical test should I use?

1. Achilleas Kostoulas
  
  7 March 2019
  
  >>> I want to know, if I don’t select the median like you do here, do I then get a sum of all the scores of each participant?
  
  Sounds fine.
  
  >>Is it ok to do that and then use an independent samples T-test to compare the means of the two groups? Or should I select the median like you show and then use Wilcoxon mann-whitney to compare the responses to the advertisements between groups?
  
  The Mann Whitney test is a safer choice here.
  
  >>I also measured behavioral intentions with 12 questions also answered on a 7pt Likert scale. I want to know if I can predict intentions from the advertisement people saw and if the relationship can be explained by the cognitive responses. What statistical test should I use
  
  This is the kind of question that would be best answered by your supervisor.
  
  1. Eva Ýr Heiðberg
    
    7 March 2019
    
    Great, thank you for the help and a quick response! :)
    
Diriba Mangasha Dabala

31 December 2019

Dear, I want a clarification from you. I have 5 independent variables, which are sub-classified into two, three and four. And I have around 34 dependent variables. These independent variables are classified on a 5-point Likert scale, which includes Strongly Agree, Agree, Neutral, Disagree and Strongly Disagree. So, I want to analyse the differences among groups by using inferential statistics. So, which test is best to use?
Note: My questionnaires were distributed randomly.

1. Achilleas Kostoulas
  
  31 December 2019
  
  That would depend on the type of variables, i.e. whether they are nominal, ordinal, or scale, the questions you want answered, the number of values per variable, and -possibly- the number of responses you’ve got. It’s really hard to tell without knowing more about your research project and your data. If you’re in a supervised study programme, your advisor would be better placed to answer this question.
  
  1. Dereje shefera shewaget
    
    22 June 2020
    
    Interesting advice thank you
    I have one question… during mean calculation for variables, let’s say it is 3.00. So can we put an interpretation based on it… 3 is neutral or undecided?
    
  2. Achilleas Kostoulas
    
    22 June 2020
    
    Ideally, you should have a verbal descriptor to go with each number. If you don’t, then either interpretation is kind of shaky. If this is an aggregate of many responses, I’d venture towards ‘neutral’.
    
Amal George

16 June 2020

Which statistical tool should I use in SPSS to find whether there is relation between variables, if the 1 and only dependent variable is Likert-Scale and independent variables are categorical(6 variables) ?

Shannon

16 June 2020

Hi! I just want to ask, after summarising the data, what type of variable would it be? Scale or is it still ordinal?
Can I use it to do a Spearman correlation analysis and do I need to summarise the other categorical data?

1. Achilleas Kostoulas
  
  22 June 2020
  
  Hi! Strictly speaking, it’s still an ordinal variable but some people argue that the data behave as if it’s an ‘interval’ scale. This is a special kind of scale, where the possible values have fixed distances. If you make that assumption for both variables, it should be possible to run a correlation analysis. Hope that helps.
  
Mark

25 June 2020

Thanks Achilleas, that helped a lot. I now know how to proceed. Thanks. Appreciate the work you do!

Bornwell

7 August 2020

Good day, I have a questionnaire with both agreement statements, i.e. (1 = Strongly Agree, 2 = Agree, 3 = Undecided, 4 = Disagree, 5 = Strongly Disagree), and with frequency statements, i.e. (1: Never to 5: Always). How do I summarise the data?

1. Achilleas Kostoulas
  
  7 August 2020
  
  You don’t. These statements do not seem to measure the same construct.
  
Quincy Clark

5 December 2020

Hello Achilleas,

The pre- and post-Likert scale data that I am working with was administered in 2017, 2018, and 2019. What is the most effective way to run internal consistency? Would I run internal consistency on pre and post one year at a time? Or one report that includes all three years for pre and all three years for post?

Thank you
Quincy Clark
q3nans@gmail.com

1. Achilleas Kostoulas
  
  5 December 2020
  
  Hi, and thanks for reaching out. It really depends on what you’re trying to do with your data, i.e., what your research questions are. If you’re just trying to show that the scales are internally consistent, I think that aggregating all the data and running a Cronbach alpha test should be OK.
  
casscass

8 December 2020

I like this post, it’s simple and easy to understand!

1. Achilleas Kostoulas
  
  8 December 2020
  
  I imagine it would, as long as the items you summarise are measuring the same thing, more or less, i.e., the scale is internally consistent
  
wada

27 January 2026

how to do in pre and post ?

1. Achilleas Kostoulas
  
  27 January 2026
  
  You can start by saying things like “thank you” and “please”.
  
  As far as the statistics are concerned, your safest choice would be a Wilcoxon signed-rank test, from which you can report the statistic (W or Z), the p-value (for statistical significance) and the effect size. You can report this using phrasing like:
  
  Methodology: “Pre- and post-test scores were compared using a Wilcoxon signed-rank test due to the ordinal nature of the data.”
  Results: “Post-test scores were significantly higher than pre-test scores, Z = −2.41, p = .016, with a medium effect size (r = .38).”
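
If you are curious how these numbers are put together, here is a minimal Python sketch of the test, written from scratch for illustration (the function name and the scores are hypothetical, and this is not what SPSS runs internally). It uses the normal approximation without tie or continuity corrections, so Z and p may differ slightly from SPSS output, especially with small samples. The effect size is computed as r = |Z| / √n, where I take n to be the number of non-zero pairs; conventions differ on which n to use, so do check what your handbook recommends.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(pre, post):
    """Wilcoxon signed-rank test using the plain normal approximation."""
    # Differences of zero carry no information about direction and are dropped
    diffs = [after - before for before, after in zip(pre, post) if after != before]
    n = len(diffs)

    # Rank the absolute differences; tied values share the average of their ranks
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    expected_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - expected_w) / sd_w
    p = 2 * NormalDist().cdf(-abs(z))      # two-sided p-value
    r = abs(z) / sqrt(n)                   # effect size (assumption: n = non-zero pairs)
    return {"W": w, "Z": z, "p": p, "r": r, "n": n}


# Hypothetical pre- and post-test scores for eight respondents
pre_scores = [2, 3, 2, 4, 3, 2, 3, 2]
post_scores = [3, 4, 2, 5, 5, 3, 2, 4]

result = wilcoxon_signed_rank(pre_scores, post_scores)
print(result)
```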
  