Although I am not a statistician, through some quirk of Google’s search algorithm I appear to have been promoted to a go-to internet expert on Likert scales. This is sometimes awkward, especially when a less-than-perfect blog post is cited in a peer-reviewed publication, but I can live with that. I tend to be rather more frustrated, however, when my writings are misunderstood and misquoted – and the purpose of this post is to set the record straight after one of these instances.
It was recently brought to my attention that my views on Likert scaling have been cited by Dr Carolyn J. Hamblin in her PhD thesis (or dissertation, to go by US usage). In the methodology chapter, Hamblin states that “[s]ome scholars, such Kostoulas (2015), asserted that any numerical calculation applied to the data [produced by Likert scales] are [sic] invalid in all cases.” (p. 57). After a “comparison of medians and interquartile ranges (Kostoulas, 2015) with means and standard deviations” (p. 58), Hamblin concludes that it’s quite safe to ignore my recommendations, since her calculations (mean and standard deviation) produced similar results to mine (median and interquartile range) most of the time.
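To see why the two sets of statistics agreeing “most of the time” proves rather little, consider a small illustration. The responses below are invented, not Hamblin’s data: the point is simply that when Likert-type responses are skewed, the mean and standard deviation can paint a rather different picture than the median and interquartile range.

```python
import statistics

# Invented responses to a single 5-point Likert-type item,
# deliberately skewed towards the low end (illustrative only).
responses = [1, 1, 1, 1, 2, 2, 2, 3, 5, 5]

# Treating the codes as interval data:
mean = statistics.mean(responses)
sd = statistics.stdev(responses)

# Respecting only the ordinal ranking:
median = statistics.median(responses)
q1, _, q3 = statistics.quantiles(responses, n=4)

print(f"mean = {mean:.2f}, sd = {sd:.2f}")        # mean = 2.30, sd = 1.57
print(f"median = {median}, IQR = {q1}–{q3}")      # median = 2.0, IQR = 1.0–3.5
```

Here the mean (2.3) sits above the median (2.0) because the two respondents who chose 5 pull it upwards – the kind of distortion that averaging ordinal codes can introduce, and that the median is designed to resist.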
Before engaging with Hamblin’s argument in a more substantive manner, I want to correct a minor point. The in-text citations to Kostoulas (2015) are, as far as I can tell, in reference to two distinct blog posts written in 2013 and 2014. Of these, only one is listed in the bibliography, with an incorrect date and URL.
Moving on to a less trivial issue: I never stated that Likert scale data cannot be subjected to any kind of numerical calculation. I have emphatically claimed that “ordinal data cannot yield mean values”, which is, I should think, an uncontroversial thing to say. I have stated that, in my opinion, Likert-type items produce ordinal data, but I have also written that Likert scales (which are composites of several items) allow for more flexibility. Elsewhere, I have explained that:
Some very well-designed Likert scales can, indeed, produce data that are suitable for calculating means, or running statistical tests that rely on the mean. These scales are the product of careful weighting and extensive testing across large numbers of respondents.
In all, I think that the selective presentation of my writings in Hamblin’s thesis does little justice to either my views or her research.
This is not the only instance where Hamblin is being disingenuous. Further on in the same paragraph, she writes: “Grace-Markin (2008) argued that under certain circumstances numerical [I think she means “parametric”] calculations are acceptable. The scale should be at least 5 points, which is what this survey used.” Readers may want to weigh this statement against what Grace-Markin actually recommends:
At the very least, insist that the item have at least 5 points (7 is better), that the underlying concept be continuous, and that there be some indication that the intervals between points are approximately equal. Make sure the other assumptions (normality & equal variance of residuals, etc.) be met.
That is to say, Grace-Markin suggests that the data produced by Likert scales can be used in parametric calculations, as long as at least five criteria are met: the items should have at least five points (seven being better), the intervals between points should be approximately equal, the underlying concept should be continuous, and the residuals should be normally distributed and of equal variance. Of these, Hamblin ignores the final four and re-interprets the one that remains to fit her research.
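These criteria are, moreover, checkable. A researcher who wanted to take Grace-Markin’s advice seriously could at least screen the data before reaching for parametric tests. The sketch below uses invented composite scores and crude rule-of-thumb checks; a formal analysis would use proper tests such as Levene’s test for equal variance and the Shapiro–Wilk test for normality (e.g. via scipy.stats), but the point is that the checks can be done rather than assumed away.

```python
import statistics

# Invented composite scale scores for two hypothetical respondent
# groups (illustrative only, not data from any real study).
group_a = [3.2, 3.8, 4.0, 3.5, 3.9, 4.1, 3.6]
group_b = [2.9, 3.1, 3.4, 3.0, 3.3, 3.2, 3.5]

# Rough check for equal variance: ratio of sample variances.
# (A common rule of thumb flags ratios above about 4; a formal
# analysis would use Levene's test instead.)
var_ratio = statistics.variance(group_a) / statistics.variance(group_b)
equal_variance_plausible = var_ratio < 4

# Rough proxy for normality: the mean and median of a roughly
# symmetric distribution should nearly coincide. (A formal analysis
# would use the Shapiro-Wilk test instead.)
def roughly_symmetric(data, tol=0.25):
    return abs(statistics.mean(data) - statistics.median(data)) < tol

print(equal_variance_plausible)
print(roughly_symmetric(group_a), roughly_symmetric(group_b))
```

None of this is onerous, which is rather the point: the checklist exists to be worked through, not cherry-picked.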
So, what is one to do upon finding out that their work has been distorted through careless reading and ‘refuted’ through selective and creative recourse to the literature? At a minimum, one can always repeat Alan Greenspan’s quip: “I know you think you understand what you thought I said, but I’m not sure you realise that what you heard is not what I meant”. Beyond that, one feels compelled to register profound frustration at the variability of what is considered to be doctoral work across the world.
Featured Image by Michael Kwan [CC BY-NC-ND]