When TEYL studies don't agree (Achilleas Kostoulas)

Teaching English to Very Young Learners (TEYL) [1] is a recurring topic in this blog, not least because of the intense pressure to increase the provision of ELT in primary education, and to start English classes at ever younger ages. Sadly, this pressure is not always well supported by empirical evidence. In fact, a lot of research on TEYL is, at best, contradictory. This post looks into one such case of misalignment between studies, discusses possible reasons for discrepancies in the findings, and suggests ways forward.

In this post

Introduction
Reasons why research about Teaching English to Young Learners is often inconclusive
How can we attain more certainty in TEYL research?
Notes

Introduction

A key feature of good science, at least if you happen to adhere to a positivist world-view, is its replicability. Simply put, it means that if our methods are sound, other researchers working on the same question should reach the same (or at least very similar) results. In the social sciences, however, perfect agreement between studies is rare, sometimes suspicious, and at the very least mundane. It seems, in fact, that the most interesting aspects of overlapping studies are the points where they differ.

It was such a clash of findings that prompted this note. A colleague and I recently presented a conference paper, in which we talked about the pilot phase of a Teaching English to Very Young Learners after-school programme. In our presentation, we noted, among other things, that pupils and parents in the school we studied were sceptical about introducing foreign language teaching in the early education curriculum. Figure 1 (below) shows that most participants in our study believed it was more appropriate to introduce English lessons in the 3rd Form or later, i.e., when children were at least 9 years old.

Fig. 1: Responses to “When should English language lessons be introduced?” (bar chart showing the distribution of responses)

In the discussion that followed, Prof. Thomais Alexiou (University of Thessaloniki) was kind enough to share unpublished findings from a much larger project in which she was involved, which seem to be at variance with our conclusions. In their study, which evaluated the effectiveness of PEAP, a TEYL programme involving 800 schools across Greece, they found that parents were initially ambivalent towards such programmes, but developed positive attitudes later on.

As I said above, such discrepancies are not uncommon, and they need not be seen as a threat to the validity of either study. Rather, by taking these differences into consideration, and by attempting to account for them, we should be able to end up with more robust interpretations, and this is what I will try to do in the present post.

Reasons why research about Teaching English to Young Learners is often inconclusive

So, why did our findings differ? Here are four hypothetical explanations about why this discrepancy came to be.

Some projects are just more effective than others

The most obvious explanation that might account for the differences in our findings is that the PEAP project simply managed to influence parental attitudes more than our pilot programme did.

If this is true, it is hardly surprising, given the amount of resources and institutional support available to the PEAP team. The fact that, at the time, the PEAP project drew on the accumulated experience of three years of implementation also seems relevant.

If this is indeed the case, it may prove fruitful to further investigate which features of the PEAP project contributed most to their success, and build upon them.

Some studies are more sensitive than others

A second interpretation, which complements the one above, is that any TEYL programme that is implemented well can counter parental scepticism, but maybe our study somehow masked this effect.

Indeed, there is some evidence in our data that this may be the case. By cross-tabulating parental attitudes against the age of their children, we found that parents of first- and second-form pupils (among whom our pilot project was implemented) tended to hold more positive views towards TEYL, compared to parents of older pupils. Using a standard test of statistical significance (chi-square), we determined that the probability of a distribution at least this uneven arising by chance alone, if attitudes were in fact unrelated to the children's age, was a little higher than one in twenty (p = .051). Unfortunately, this falls just short of the conventional threshold of statistical significance (p < .05), and therefore we felt reluctant to draw conclusions from it. It is possible that the small size of our sample may have skewed these statistics, but short of repeating the study with more participants, there is little one can do to confirm or disprove this hypothesis.
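
For readers who have not come across the chi-square test, here is a minimal sketch in Python of how a comparison of this kind might be computed. It is not the analysis we ran: the counts are invented purely for illustration (they only share our sample size of 70), the function name `chi_square_2x2` is hypothetical, and the sketch assumes the simplest possible layout, a 2x2 table with no continuity correction.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if attitude and age group were unrelated
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # A 2x2 table has one degree of freedom, for which the chi-square
    # tail probability has a closed form: erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# HYPOTHETICAL counts, invented for illustration; these are NOT the study data.
# Rows: parents of 1st/2nd-form pupils, parents of older pupils.
# Columns: positive towards early English lessons, sceptical.
hypothetical_counts = [
    [14, 11],
    [14, 31],
]

statistic, p_value = chi_square_2x2(hypothetical_counts)
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

With samples this small, moving a handful of respondents from one cell to another can push the p-value to either side of the .05 line, which is one reason why a borderline result like ours is best treated with caution.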

I would argue, however, that if there is merit to this interpretation, i.e., that any well-structured school-based intervention can have an immediate measurable effect on societal attitudes, this is reason for caution, not triumphalism.

In such a case, perhaps there is a need to give more thought to the political and ideological implications of Teaching English to Young Learners. In other words, if we can empirically establish that we can change the way stakeholders think, we need to give some serious thought as to what we are making them believe and why.

Not all projects target the same demographic groups

A third possible reason why our studies on Teaching English to Young Learners produced different findings points towards the different demographics of our target groups.

The PEAP findings have yet to be published in full, so much of the discussion in this section is speculative [2]. However, it stands to reason that their data, taken from 800 schools across Greece, constitute a reasonably representative cross-section of Greek society. This was not the case at the school where we did our research. Although our school was not officially selective, it seemed to attract students from the upper strata of local society. For instance, nearly all our respondents were university graduates, and almost a third of our sample had an advanced studies degree (M-level or PhD). That being the case, it would be very surprising if the knowledge and skills commensurate to such education did not influence the views proffered by our respondents.

Education	Frequency (%)
Secondary education	5 (7.1%)
Four to six years of undergraduate education	41 (58.6%)
Advanced university education (M-level, Doctorate)	23 (32.9%)
Unknown	1 (1.4%)
Total	70 (100%)

Table 1: Sample breakdown according to family education level (i.e., the highest educational attainment in the household)

Clearly, the dynamics between highly educated parents and the school are not the same as the dynamics in schools that serve less privileged communities. It is at least somewhat plausible that the former were therefore more forthright with their opinions. Social scientists often have to contend with what has been termed the social desirability bias, i.e., the tendency among survey participants to respond in ways that enhance their status, conform to mainstream ideology, and protect their self-image, even at the expense of factual accuracy. (Consider, for instance, how you might respond to the question “How many books did you read this year?”)

I would intuitively think that the respondents in our sample may have been relatively less concerned about constructing socially acceptable identities, and that they would not have felt much pressure to provide us with positive feedback only. Conversely, I would argue that in the PEAP sample, there must have been more than a few respondents who were accustomed to dealing with authority from a disenfranchised position. It would hardly be surprising if they told the PEAP researchers what they assumed the researchers wanted to hear. This seems even likelier if the respondents were concerned, however wrongly, that critical feedback might result in fallout from the school system against their children.

Some studies have uneven levels of quality control

Linked to this, one final hypothesis that might partially explain the difference between our findings and those of the PEAP study pertains to methodological design. Once again, most of the details in this section are anecdotal, but they will have to do until more details about PEAP become available to academic scrutiny.

It is my understanding that the PEAP evaluation team used an anonymous questionnaire survey to gather their data. Although questionnaire responses do offer a degree of anonymity, this can be easily compromised. Notably, the questionnaires were administered and collected by the school teachers who participated in the project, and it appears that standard precautions (e.g., sealed envelopes) were not used, presumably in an attempt to cut costs. Moreover, each teacher was responsible for collecting questionnaires from a small number of parents, which could have undermined the parents' feelings of trust. The personal connections and power differentials between survey administrators and respondents may have exaggerated the social desirability bias, especially among the participants with low socioeconomic status (see above). Unfortunately, it is unclear, for the time being at least, how the PEAP team controlled for any of the above.

Another reason why this arrangement is problematic is that the teachers who administered the survey had a professional stake in the future of the PEAP project, in that their continued employment depended on its success. As far as I know, there were no quality control safeguards in place, and the research team relied on the good faith of the PEAP teachers to prevent research malpractice. Moreover, to the best of my knowledge, none of the teachers who collected the data underwent training in data collection or research ethics. So, the quality of the data collected must have been conditional on the integrity of the 800+ participating teachers. Under the circumstances, it is not inconceivable that at least some teachers may have prioritized their professional future over considerations of academic integrity, and that the PEAP data are, to some unknowable extent, aspirational.

I should stress that there is no way of ascertaining to what extent the social desirability bias and (hopefully infrequent) research malpractice impacted the validity of the PEAP data. However, independent confirmation of the findings would greatly enhance their credibility.

How can we attain more certainty in TEYL research?

In brief, the variance between our findings and those of the PEAP team is intriguing, although it seems impossible to tell whether it is a product of the different methods and goals of our studies, or whether it corresponds to actual differentiations in society at large.

In the paragraphs above, I suggested a number of hypothetical interpretations, which may explain why two similar studies produced different results. It is exactly this difference that fuels scholarship, because it produces interesting questions, and points towards new research directions.

Table 2, below, suggests some possibilities for future research, which productively engage with the tension between findings.

Hypothesis	Suggestion
The PEAP programme is more successful	Identify which aspects contribute to its success
Any well-designed TEYL intervention can change societal attitudes	a) Replicate our study with a larger sample; b) Problematise why attitudes should be changed and what attitudes are desirable.
Differences in demographics (leading to differentiation of attitudes? Different levels of social desirability bias?)	Consider replicating PEAP evaluation by non-stakeholders.
Effect of quality safeguards on the PEAP study	Consider replicating PEAP evaluation by non-stakeholders.

Table 2: Summary

Reflection on these differences has been very helpful in highlighting the limitations of our own study and in moving towards a more valid synthesis. I have no doubt that the PEAP team would feel the same.

Notes

1. Exactly who counts as a ‘very young’ learner, and how such learners differ from ‘young learners’, is of course relative. When this post was written, in 2013, English courses were introduced in the curriculum at the 3rd Form. Children younger than this would be ‘very young learners’. Ten years later, ELT courses are standard across the primary curriculum, and the term ‘very young’ now seems to designate kindergarten children. Rather than attempt to disentangle all this, I follow the convention of using the acronym TEYL to mean both ‘Teaching English to Young Learners’ and ‘Teaching English to Very Young Learners’.

2. Update: You can read a summary of their published findings here.

About this post: This post is based on my notes from the 1st Model/Experimental Schools conference in Thessaloniki, and it was written shortly after the event in May 2013. The post was updated in June 2022, at which time it was re-formatted, and the notes were added, but no substantive changes have been made to its content.
