This post will give you some advice about using SPSS to summarise data that were generated with a Likert scale. If you want to read up on Likert scales before you go on, you can find some information in this post.
Before we start
Why should you summarise Likert scale data?
Elsewhere in this blog, I have written that a Likert scale might consist of several items that measure a similar underlying construct (a latent variable). For instance, if I want to measure people’s attitudes towards sweets, I might ask them to record what they think about the following statements:
|1. I like chocolate||Strongly Agree||Agree||Disagree||Strongly Disagree|
|2. I like cookies||Strongly Agree||Agree||Disagree||Strongly Disagree|
|3. I like whipped cream||Strongly Agree||Agree||Disagree||Strongly Disagree|
In order to interpret these data, we need to summarise the data in the scale. We can do this in two ways: adding the data or estimating the median. In this post, I will show you how to estimate the median, because this is slightly harder. The same steps can be modified to add up the data.
Using the same example as above, I need to create a new ‘super-variable’, which shows the median of items (1), (2) and (3) for each respondent.
My assumptions about you
I assume that you will already know how to define variables and values, how to toggle between the numerical expression and verbal descriptor of the values (i.e., you can make SPSS show responses as “strongly agree/agree/disagree/strongly disagree” or as “1/2/3/4”), and how to key in data. I will also assume that you have already established that your scale is internally consistent, so I will focus only on the technical aspects of merging the variables.
Here’s how to merge the Likert items
Your starting point for summarising Likert scale data with SPSS will be a dataset similar to the one shown in Figure 1, below.
When you have created the dataset by typing your data into SPSS, and after you have tested for the internal consistency of the scale (use Cronbach’s α), it’s time to create a new variable.
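For readers who want to see what Cronbach’s α actually measures, here is a rough sketch of the calculation in Python rather than SPSS. The response values and the function name cronbach_alpha are made up for illustration; in practice you would run the reliability analysis in SPSS itself.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])                          # number of items in the scale
    items = list(zip(*rows))                  # transpose: one tuple per item
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers to the three 'sweets' items, coded 1-4
responses = [
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
    [2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

The closer the value is to 1, the more the items move together across respondents, which is what justifies merging them into one variable.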
Merging the variables
From the top menu bar in SPSS, select Transform -> Compute variable. You should now see the following dialogue box.
- Assign a name to the new variable (e.g., Sweets);
- Scroll down the Function Group, and select Statistical;
- From the functions that appear, select Median. [NB: it is possible to select the mean, but I don’t recommend it.] At this point, the following formula should appear in the numerical expression box: Median ( , )
- Place the cursor in the brackets, select the variables you want to merge, and click on the arrow. Repeat with all the variables, separating them with commas.
- Click on OK.
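The steps above amount to a row-by-row calculation: for each respondent, take the median (or the sum) of their answers to the items in the scale. The sketch below shows the same logic in Python, only to make the arithmetic visible. The respondent IDs, item names and answers are invented, and the coding 1 = Strongly Agree to 4 = Strongly Disagree is an assumption.

```python
from statistics import median

# Hypothetical data: one dict of item answers per respondent,
# coded 1 = Strongly Agree ... 4 = Strongly Disagree (assumed coding)
respondents = {
    "R1": {"chocolate": 1, "cookies": 2, "whipped_cream": 1},
    "R2": {"chocolate": 3, "cookies": 4, "whipped_cream": 4},
    "R3": {"chocolate": 2, "cookies": 2, "whipped_cream": 3},
}
items = ["chocolate", "cookies", "whipped_cream"]

# The new 'Sweets' variable: one median per respondent
sweets_median = {
    rid: median(answers[item] for item in items)
    for rid, answers in respondents.items()
}

# The alternative summary mentioned earlier: adding up the items
sweets_sum = {
    rid: sum(answers[item] for item in items)
    for rid, answers in respondents.items()
}

print(sweets_median)
print(sweets_sum)
```

Each respondent ends up with a single value, which is what SPSS writes into the new column at the end of the dataset.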
Your new Likert scale
SPSS will automatically generate a new variable, which will appear at the end of your dataset. This will be in numerical form (1, 2, 3, …), but you can change it to a verbal descriptor for consistency (Figure 3). You can use this variable for descriptive statistics (e.g., estimate the central tendency and dispersion), cross-tabulations, correlations and so on…
Now wasn’t that very easy?
Frequently Asked Questions
Over time, a lot of people have asked questions about Likert scales in the comments section of this post. I have collected the most usual things people ask in this section.
There are decimal points in the median I calculated. Is that a problem?
If your median falls between two values, it will have a ‘half’ (e.g., 2.5, 4.5 etc.). This is normal. You can report the median as you see it.
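A quick way to see where the ‘half’ comes from is to compute a median over an even number of items. This is a small illustration with made-up answers, not SPSS output.

```python
from statistics import median

# Four items: the two middle answers are 2 and 3, so the median sits between them
four_items = [1, 2, 3, 4]
even_case = median(four_items)

# Three items: there is a single middle answer, so no half can appear
three_items = [1, 2, 4]
odd_case = median(three_items)

print(even_case, odd_case)
```

With an odd number of items the median is always one of the original response values; with an even number it can land halfway between two of them.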
Why do you recommend summarising Likert scales with medians rather than means?
The data produced by Likert type items are, strictly speaking, ordinal data. That means that they can tell us how to rank responses (strongly agree is ‘more’ agreement than agree), but they do not give us information about the distance between them (strongly agree is not twice as much agreement as agree). Think of the medals in the Olympics: they can tell you if an athlete came first, second or third, but you cannot use them to calculate average speed. The median is a cruder statistic than the mean, because it does not take into account the ‘distance’ or ‘weighting’ of responses. In this case though, it is the best statistic we can legitimately use because this ‘distance’ is unknown.
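One way to make this concrete: because the distances are unknown, any numeric coding that keeps the order of the responses is as defensible as any other. The sketch below uses two invented groups and two invented codings (equal steps, and one where the last category is ‘stretched’). The comparison of means flips between codings, while the comparison of medians does not.

```python
from statistics import mean, median

# Hypothetical answers from two groups on a 4-point item
group_a = [2, 2, 2, 2, 2]
group_b = [1, 1, 1, 1, 4]

# Two codings that both preserve the order of the categories
equal_steps = {1: 1, 2: 2, 3: 3, 4: 4}
stretched = {1: 1, 2: 2, 3: 3, 4: 10}   # assumes the last step is much bigger

def recode(values, mapping):
    return [mapping[v] for v in values]

results = {}
for name, mapping in (("equal_steps", equal_steps), ("stretched", stretched)):
    a, b = recode(group_a, mapping), recode(group_b, mapping)
    results[name] = {
        "mean_a_higher": mean(a) > mean(b),
        "median_a_higher": median(a) > median(b),
    }

print(results)
```

Under equal steps group A has the higher mean; under the stretched coding group B does. The median comparison gives the same answer both times, which is why it is the safer summary when the spacing cannot be justified.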
OK, I did what you said, but what should I do next with my study?
It’s hard to answer such a question without knowing more about what you’re trying to find out (your research question) and your data. This is the kind of question that your advisor or mentor will be better qualified to answer.
Where can I find out more information about all this?
There are many statistics manuals you could read, if you want to follow up on the information in this post. My personal favourite is Andy Field’s Discovering Statistics Using SPSS. I have also written some more posts about quantitative research, listed below, which you might find useful:
Likert scales: Four things you may not know
If you use quantitative methods in your research project, you may want to read this first.
How to interpret ordinal data
Every now and then I tend to get questions about statistics from readers of this blog — this is due to a somewhat ill-deserved reputation Google seems to have bestowed on me as an ‘expert’ in Likert scale measurement. Many of the answers you need can be found in this post, and this set of…
How to use Likert scales effectively
Many questionnaires use Likert items & scales to elicit information about language teaching and learning. In this post, I discuss how to use these instruments effectively, by looking into the difference between items and scales, and explaining how to analyse the data that they produce.
Before you go
If you landed on this page while preparing for one of your student projects, I wish you all the best with your work. There’s a range of social sharing buttons below in case you feel like sharing this information among fellow students who might also find it useful. Also feel free to ask any other questions you may have, using the contact form.
Thanks aloooooooot for ur help
Thanks for the info… I have been looking for discussions about merging likert scales, and also youtube videos but they all use means but do not explain why, this is the first time I heard using median, which makes a lot of sense.
Very useful ideas especially for post graduate students. Many thanks Achilleas.
You could, if you can convincingly argue that the distance between the anchor points is equal. This link might help: http://www.statisticshowto.com/likert-scale-definition-and-examples/
Hi there Achilleas, thanks so much for your posts and helpful responses. I have a question in relation to your recent posts above regarding decimal results in the medians. If we accept the decimal results, does this not negate the reasons why we are using medians (rather than mean) for the ordinal data? The decimal median results are assuming the distance between each number is 1 (rather than being unknown and potentially variable).
Also, the only decimal I appear to have after computing the median is a .5 one; that seems a big jump to me to round it up to the next whole number (so as to make the data ordinal)!
Finally, I note that “mode” is not listed within the statistical functions in SPSS, I guess because it is not a calculation. I think I would feel most comfortable with working with a new variable that would create the mode value of a set of Likert-scale responses for each respondent. Do you know of a way that this is possible?
I would be grateful for your thoughts on these three matters.
Thanks for your comment. You’re quite right, the only decimal you can get when calculating the median is .5, which is going to happen occasionally if you have an even number of responses. In this case, you just report the decimal, and you should not round it up. As you correctly point out, this is not really a decimal in the same sense as it would be with the mean; rather it’s just a conventional way of showing where the central tendency lies.
I’m not sure which version of SPSS you’re using, but have you checked under ‘frequencies’? I think that you should be able to find the mode there.
Hope that helps, and good luck with your project!
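For what a per-respondent mode would look like, here is a minimal sketch in Python (not SPSS), using invented answers. It relies on multimode from Python’s standard statistics module, which returns every value tied for most frequent. That matters here, because with only a handful of items per respondent, ties are common and a single ‘mode’ is often not defined.

```python
from statistics import multimode

# Hypothetical answers of three respondents to a four-item scale
respondents = {
    "R1": [1, 1, 2, 4],
    "R2": [2, 3, 3, 3],
    "R3": [1, 1, 4, 4],   # two values tie for most frequent
}

modes = {rid: multimode(answers) for rid, answers in respondents.items()}
print(modes)
```

Respondent R3 has two modes, so any rule for picking one of them would be an extra assumption on top of the data.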
Thanks so much for this help and advice Achilleas :) Any idea of a source I could reference to back-up this advice? It’s like looking for the proverbial needle in a haystack when searching for such specific details in a big book (and even online). As much as I’d love to be able to cite your blog, I’m not sure how well my supervisor will mark me on that :///// sorry!
P.S. I’m using SPSS 25.
Thank you so much! :)
No worries about that, and I’d love to point you to some literature, but I’m out of office and don’t have access to my books, so I can’t be as specific as I would otherwise have been. I’m quite sure there will be something helpful in Andy Field’s Discovering Statistics, and in Daniel Muijs’ Doing Quantitative Research in Education, for what it’s worth.
Thanks very much indeed. A library near to me has those books, so I shall go fetch them. If I struggle to find a reference, I’ll come back to you. (I’m not filled with hope at the ease of finding back-up for such specific advice from looking briefly at the books’ contents from Amazon’s “look inside” feature (at least without reading the whole book) but hopefully when the whole book is in front of me, I will be proved wrong :)
In the meantime…. please can you confirm that you think it is safe for me to proceed with non-parametric tests of my ordinal Likert data using a new MEDIAN Likert variable? I also plan to use the MEDIAN Likert variables from two different ordinal Likert scales to test any correlation between these two scales (i.e., using Spearman’s Rho).
Many thanks once again Achilleas :)
It all seems quite reasonable! Good luck with the project :)
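As an illustration of the correlation described above, here is a rough Python sketch of Spearman’s rho on two median variables: rank each variable (tied values share the average of their ranks), then correlate the ranks. The data and function names are invented, and SPSS’s own bivariate correlation procedure would normally be used instead.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1        # mean of 1-based positions start..end
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical median scores of six respondents on two scales
scale_a = [1, 2, 2.5, 3, 4, 4]
scale_b = [1, 1.5, 3, 3, 3.5, 4]

rho = spearman_rho(scale_a, scale_b)
print(round(rho, 3))
```

Only the order of the scores enters the calculation, so the half-step medians cause no difficulty. The sketch returns the coefficient only and does not compute a significance level.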
Thank you Achilleas! I think I was over-worrying :)
It is Jenny here again – we already exchanged some messages recently. I now have Andy Field’s Discovering Statistics book and some other quantitative analysis books; however, cannot find an explicit mentioning (for reference/citation purposes in my dissertation) that it is OK or advisable to compute a new median score variable for Likert Data and to then use this in non-parametric tests. They are big books though, so without reading the whole book cover-to-cover, I cannot guarantee that I’m not missing anything.
Are you with your books yet and able to confirm your source for this recommendation?
Many thanks indeed and sorry to bother you again :)
I think you’ll find what you’re looking for in Muijs, D. (2004). Doing Quantitative Research in Education with SPSS. Thousand Oaks, CA: SAGE, pp. 99-100.
Hope that helps :)
No worries anymore Achilleas! Typical – just found a source :) thank you!
Super! Good luck with your submission!
and i have 167 respondent in total. my advisor is out of reach so i need your advice…i have computed data as you have shown above and crosstab between consumer behaviour and brand image
and output of SPSS as follows;
Consumerbehaviour * Brandimage Crosstabulation
Consumerbehaviour SA A N D SD Total
SA 5 8 0 0 0 13
A 5 18 30 0 2 55
N 5 32 42 2 0 81
D 0 0 1 6 0 7
SD 0 5 0 2 4 11
Total 15 63 73 10 6 167
SA = Strongly Agree, A = Agree, N = Neutral, D = Disagree, SD = Strongly Disagree
Value df Asymp. Sig. (2-sided)
Pearson Chi-Square 153.983a 16 .000
Likelihood Ratio 95.235 16 .000
Association 26.810 1 .000
N of Valid Cases 167
is this Okay? please help me sir…
No, it doesn’t look good. You have too many categories and not enough data.
If you notice, under the crosstab there’s a line that says that 76% of your cells have an expected count <5. This is too high and it skews your statistics.
There are three options, now: (a) report the data as they are, and suggest that readers exercise caution in the interpretation; (b) get more data; or (c) consolidate your categories, by merging values like ‘agree’ and ‘strongly agree’.
I am sorry your advisor is out of reach, but you should really talk to him or her, if they are statistically competent – this looks like it needs lots of help.
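The expected counts can be checked by hand: for each cell, multiply its row total by its column total and divide by the number of cases. The Python sketch below does this for the crosstab as posted in the comment above (assuming the counts were copied correctly), and then repeats the check after option (c), merging SA with A and D with SD on both variables. The function names are invented for this example.

```python
# Observed counts from the posted crosstab (rows: Consumerbehaviour,
# columns: Brandimage, both in the order SA, A, N, D, SD)
observed = [
    [5, 8, 0, 0, 0],
    [5, 18, 30, 0, 2],
    [5, 32, 42, 2, 0],
    [0, 0, 1, 6, 0],
    [0, 5, 0, 2, 4],
]

def small_expected_cells(table, threshold=5):
    """Return (cells with expected count below threshold, total cells)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    small = sum(e < threshold for row in expected for e in row)
    return small, len(row_totals) * len(col_totals)

def merge_outer_categories(table):
    """Collapse SA+A and D+SD on both axes, giving a 3x3 table."""
    def collapse(v):
        return [v[0] + v[1], v[2], v[3] + v[4]]
    rows = [collapse(row) for row in table]
    return [
        [a + b for a, b in zip(rows[0], rows[1])],
        rows[2],
        [a + b for a, b in zip(rows[3], rows[4])],
    ]

before = small_expected_cells(observed)
after = small_expected_cells(merge_outer_categories(observed))
print(before, after)
```

On these counts, 19 of the 25 cells (76%) have an expected count below 5, which matches the warning under the crosstab. After merging, only 1 of the 9 cells does.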
I am doing a study with two independent groups – they are seeing different types of an advertisement
I want to know if there is difference between their cognitive responses.
For that I have 3 lists (for three different concepts of cognitive responses) each with 10 questions answered on a 7pt likert scale. I have computed them together into three separate new variables. I want to know, if I don’t select the median like you do here, do I then get a sum of all the scores of each participant? Is it ok to do that and then use an independent samples T-test to compare the means of the two groups? Or should I select the median like you show and then use Wilcoxon mann-whitney to compare the responses to the advertisements between groups?
I also measured behavioral intentions with 12 questions also answered on a 7pt Likert scale. I want to know if I can predict intentions from the advertisement people saw and if the relationship can be explained by the cognitive responses. What statistical test should I use?
>>> I want to know, if I don’t select the median like you do here, do I then get a sum of all the scores of each participant?
>>Is it ok to do that and then use an independent samples T-test to compare the means of the two groups? Or should I select the median like you show and then use Wilcoxon mann-whitney to compare the responses to the advertisements between groups?
The Mann Whitney test is a safer choice here.
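To show why this test suits ordinal summaries, here is a minimal Python sketch of the Mann-Whitney U statistic. It only asks, for every pair of participants from the two groups, which one scored higher, so the distances between scale points never enter the calculation. The scores and the function name are invented, and the sketch does not compute a p-value.

```python
def mann_whitney_u(group_1, group_2):
    """Return (U1, U2): pairwise 'wins' for each group, ties counting half."""
    u1 = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_1
        for b in group_2
    )
    u2 = len(group_1) * len(group_2) - u1
    return u1, u2

# Hypothetical median scores for participants who saw each advertisement
ad_a = [3, 4, 4.5, 5, 6]
ad_b = [2, 3, 3.5, 4, 4]

u_a, u_b = mann_whitney_u(ad_a, ad_b)
print(u_a, u_b)
```

The two values always add up to the number of pairs (here 5 × 5 = 25); the more lopsided the split, the stronger the evidence that one group tends to score higher.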
>>I also measured behavioral intentions with 12 questions also answered on a 7pt Likert scale. I want to know if I can predict intentions from the advertisement people saw and if the relationship can be explained by the cognitive responses. What statistical test should I use
This is the kind of question that would be best answered by your supervisor.
Great, thank you for the help and a quick response! :)
Dear, I want a clarification from you. I have 5 independent variables, which are sub-classified into two, three and four categories, and I have around 34 dependent variables. These independent variables are classified on a 5-point Likert scale, which includes Strongly Agree, Agree, Neutral, Disagree and Strongly Disagree. So, I want to analyse the differences among groups by using inferential statistics. Which test is best to use?
Note: My questionnaires were distributed randomly.
That would depend on the type of variables, i.e. whether they are nominal, ordinal, or scale, the questions you want answered, the number of values per variable, and -possibly- the number of responses you’ve got. It’s really hard to tell without knowing more about your research project and your data. If you’re in a supervised study programme, your advisor would be better placed to answer this question.
Interesting advice thank you
I have one question …during mean calculation for variables, let say it is 3.00. so can we put an interpretation based on it….3 is neutral or undecided
Ideally, you should have a verbal descriptor to go with each number. If you don’t, then either interpretation is kind of shaky. If this is an aggregate of many responses, I’d venture towards ‘neutral’.
Which statistical tool should I use in SPSS to find whether there is relation between variables, if the 1 and only dependent variable is Likert-Scale and independent variables are categorical(6 variables) ?
Hi! I just want to ask, after summarising the data, what type of variable would it be? Scale or is it still ordinal?
Can I use it to do a Spearman correlation analysis and do I need to summarise the other categorical data?
Hi! Strictly speaking, it’s still an ordinal variable but some people argue that the data behave as if it’s an ‘interval’ scale. This is a special kind of scale, where the possible values have fixed distances. If you make that assumption for both variables, it should be possible to run a correlation analysis. Hope that helps.
Thanks Achilleas, that helped a lot. I now know how to proceed. Thanks. Appreciate the work you do!
Good day, I have a questionnaire with both agreement statements, i.e. (1 = Strongly Agree, 2 = Agree, 3 = Undecided, 4 = Disagree, 5 = Strongly Disagree), and frequency statements, i.e. (1: Never to 5: Always). How do I summarise the data?
You don’t. These statements do not seem to measure the same construct.
The pre- and post-Likert scale data that I am working with was administered in 2017, 2018, and 2019. What is the most effective way to run internal consistency? Would I run internal consistency on pre and post one year at a time? Or one report that includes all three years for pre and all three years for post?
Hi, and thanks for reaching out. It really depends on what you’re trying to do with your data, i.e., what your research questions are. If you’re just trying to show that the scales are internally consistent, I think that aggregating all the data and running a Cronbach alpha test should be OK.
I like this post, it’s simple and easy to understand!
I imagine it would, as long as the items you summarise are measuring the same thing, more or less, i.e., the scale is internally consistent