Presenting multilingual data: some options

In the previous post of the Researching Multilingually series, I discussed some considerations that impacted the representation of multilingual data. In this post, I follow up on those considerations, by presenting four options that can be used to present multilingual data in a research report. These options, which can be thought of as a ‘cline of representational positions’, are presented in Figure 1.

Figure 1. A cline of representational positions

Presenting verbatim data

Presenting verbatim data along with their translation, as I have done in Figure 2, is the most transparent of the four options (although one should always remember that the transcribed data are already a reduced form of what was actually communicated). In addition to the fairly obvious fact that such a representational option promotes the visibility of languages other than English in scholarly communication, one of its main advantages is that it allows bilingual readers to independently engage with the data in their original form. This creates a more visible ‘audit trail’, which helps to generate confidence in the findings; plus, it allows readers to re-interpret the data and generate new insights.

Figure 2. Verbatim bilingual presentation

Secondly, a bilingual presentation can highlight theoretically significant aspects of form that a translation would mask. In the extract presented in Figure 2, for instance, the presentation of the Greek extract draws attention to the fact that the interviewee was making extensive use of English technical vocabulary (highlighted in red). This might be significant, because it offers insights into how she constructed her professional identity and because it shows how English was encroaching into the discourse domain of professional communication in my research setting.

In spite of these advantages, a verbatim bilingual presentation may not always be desirable. First and foremost, the dissemination outlet may not be able to typographically support it, or (more commonly) they may not be willing to offer this option, because of the higher printing costs. Secondly, one needs to consider whether the word-space used for bilingual representation is at the expense of the argument one is trying to make. In a large-scale project such as a thesis, this might not seem like an important consideration (and in my case, I was able to negotiate a 10% increase to the word limit in order to cater for multilingual data). However, in academic journals, where space is at a premium, balancing rich qualitative data and interpretation is often a challenge, even if you don’t have to consider bilingual data.

Presenting standardised data

The second option in the cline is presenting standardised data. This is fairly similar to what was described above, except that the spelling irregularities and non-standard forms are subtly changed to conform to the standard variety, as shown in Figure 3. One reason for doing this might be to avoid stigmatising the research participants: in an article evocatively titled ‘Ritin folklower daun ‘rong’, Dennis Preston convincingly argues that a transcription that is too faithful constitutes a misrepresentation of what was said*. In an earlier post in this series, I also argued that in my own thesis, presenting raw data, warts and all, risked harming my research participants, and had to be avoided on ethical grounds.

Figure 3. Standardised bilingual data

In cases where a standardisation is necessary, there are two important caveats. First, standardisation needs to be done after data analysis, in order to avoid compromising the integrity of the dataset. Secondly, researchers should explicitly account for (a) why standardisation was desirable; (b) which ‘standard’ was used, and why; and (c) how the dataset changed as a result of this intervention.

Presenting unabridged data

The third option takes us into monolingual representation: if a dataset contains similar information in several languages, researchers may pick out a typical piece of data as representative of the whole (Figure 4). Selecting data in English to symbolically represent the entire dataset is a fairly pragmatic solution, which completely eschews the challenges of translation and the dilemmas associated with bilingual representation.

Figure 4. Unabridged monolingual data

While the simplicity of this option makes it fairly attractive, its overuse can lead to ‘silencing’ the non-English data. This is politically problematic, as it makes an ideological statement (however unintended) about the primacy of the English language (Roberts 1997: 170). It may also be epistemologically problematic, if there are subtle differences between the English and non-English data. In my research, for example, Greek tended to be used by students who were less successful in learning English and/or had more negative attitudes towards the language. In other words, language choice was associated with subtle differences in content as well. While I think I was sufficiently alert to such differentiations, it is conceivable that a systematic bias for English data might mask such differentiation.

To mitigate such risks, it is helpful for researchers to reflect on how the typical extracts are selected. One strategy that I found helpful involved using multiple more-or-less similar data extracts in my text, and comparing their rhetorical effect. I recorded these thoughts in reflexive memos, and though I will admit that, more often than not, my choice of examples was arbitrary, this process helped me to refine my understanding of whatever I was trying to describe.

Presenting summarised data

The final option in the cline involves summarising data in English (Figure 5). Condensing data can be appropriate when emphasis is on content rather than form, as it helps researchers to present information economically, and enhances the readability of the data. In addition, it helps to preserve some measured opacity when a detailed presentation is undesirable.

Figure 5. Summarised data

There are, however, several disadvantages to such a representation strategy. Most importantly, it interjects the researcher between the data and the reader. While such a risk is -arguably- present in all representation strategies, in this case the researcher’s interpretation of the data becomes very prominent, and it is not moderated by access to the original data. In doing so, this strategy risks violating what has been termed the “validity through transparency and access principle” (Nikander 2008: 227). Secondly, the re-voicing of the data risks de-voicing the research participants, which can be epistemologically problematic, and ethically dubious.

The risks mentioned above can be mitigated somewhat by using appropriate research methods. For instance, this might involve having the research participants validate the condensed texts. In addition, a heightened degree of reflexivity might be helpful, as it would allow the researcher to have greater awareness of their own presence in the re-voiced text. Similarly, reflexive statements in the research report might counterbalance the opacity of the data.

A flexible representation strategy

It should be obvious, from the discussion above, that all the representation options offer particular affordances, but are also associated with different risks. What I think this suggests is that each option in the cline is better suited for different instances of data, or (conversely) that it may be possible to flexibly and eclectically combine more than one option in the same writing project. This should not be taken as a warrant for opportunistic ad hocery; rather, what I wish to suggest is that researchers reflect on the range of possible options for each instantiation of data, and make an informed decision on what representation option is best suited to it.


This post, and the one that preceded it, are based on a presentation I gave at the Researching Multilingually Seminar in Manchester on 22-23 May 2012.

The name of this series of posts is derived from a seminal research project undertaken by Jane Andrews (University of West England), Mariam Attia (Durham University), Richard Fay (University of Manchester) and Prue Holmes (Durham University). The project website contains lots of information about doing multilingual research, as well as a very useful collection of references on multilingual research methodology.

3) The full reference for Preston’s article is: Preston, D. (1982). ‘Ritin folklower daun ‘rong. Journal of American Folklore, 95, 304-326.

4) The Featured Image is from The Leaf Project @ Flickr, and it is made available under a Creative Commons Attribution & Share Alike (CC BY-SA 2.0) licence.

Thinking about how to present multilingual data

So far in this series of posts on doing multilingual research, I have probed the intricacies of multilingual research settings, presented some dilemmas about obtaining informed consent, and talked about language choice and data generation. This post moves the discussion forward to the most visible outcome of a multilingual research project: How might one present multilingual data in a research report or article?

The short answer is: it depends. Rather than suggest a representational strategy that must always be used, I would argue that the representational choices one makes should be informed by reflection about the specifics of one’s research project. In the paragraphs that follow, I will illustrate such a process of reflection, by drawing on my own PhD study.

The challenge

My research was a case study of an ELT language school in Greece, so in the process of doing fieldwork, I collected data in both Greek (the mother language of most participants) and English (the working language of the school). As I was proficient in both languages, I was able to analyse the data without translating them. However, in the interim reports that I sent to the University, and in my thesis, these data had to be translated to English, for obvious pragmatic reasons. From this stipulation, a number of questions emerged, such as:

  • How literal should my translation be? Literal translations sounded stilted, but taking too many liberties in translation would add a layer of opacity to the data.
  • Should the original be presented alongside the translation? What advantages would such a representational choice offer? Might there be any hidden complications involved?
  • If I decided to present both translation and the originals, how would the texts relate to each other? Would the original versions of the extract be consigned to an appendix? Or should the appendix house the translations? Should the texts be presented side by side, and if so, which language should be presented first?

Developing a representational strategy

I first thought that it would be possible to solve these problems by consulting the University regulations, or the APA style manual, or maybe by referring to the literature in order to find out what standard practice was. These strategies proved less than helpful. Although my study was by no means unique, and perhaps not uncommon, there seemed to be little guidance available. More importantly, the proposed strategies that seemed to work well for some parts of my dataset seemed less suitable for others.

Soon enough, I decided to develop a tailor-made representational strategy, by going back to first principles. This involved finding a way to reconcile four sets of considerations: political, theoretical, practical and ethical.

Political considerations: countering linguistic hegemony

There is a vibrant discourse in the literature regarding the ways in which the global spread of English is threatening the viability of other languages, and this discourse seems to have extended to the role of English as an academic lingua franca. The need to promote the visibility of languages other than English in academic discourse, as a means of counteracting this trend, is -I think- uncontroversial. For me, this meant that it was important to preserve the Greek data in some form in my thesis.

But this led to another question, namely, which variety of Greek should I use? Sometimes, my research participants would use pronunciations and lexis that were particular to North-Western Greece, as a way of indexing our shared heritage, and in order to show that I was accepted as a member of their in-group (perhaps also as a subtle reminder that I was expected to behave as ‘one of them’?). While I would have liked to faithfully transcribe such linguistic cues, I was also conscious that Modern Greek is highly standardised, and that regional varieties tend to be associated with lack of education. I did not want to risk stigmatising participants by making them ‘sound like villagers’, but I also had to reflect on whose standards were being enforced by my representational choices.

Theoretical considerations: preserving the participants’ voices

A second set of considerations related to the way I understood language. From the theoretical perspective that informed my study, language was not a ‘thing’ that exists independent of context. Rather, it emerges from local situations, it is influenced by dynamics of interaction that are particular to those settings, and it is intended to cause specific effects within them.

When a text is picked out of this context, and the ‘voice’ of the original is replaced by that of the translator, all the above changes. A successful translation, my reasoning was, should perform a similar function to the original in a different ecology, so (paradoxically?) it should be different from the original. In a sense, this is similar to the way that a prosthetic limb is functionally similar to what it replaces, but it need not be an exact replica down to the level of blood vessels, sinews or hair follicles.

It followed that it would be epistemologically problematic to deny my readers access to the participants’ voices. But it also meant that I could not expect to create ‘perfectly accurate’ translations. Instead, I began to realise that the goal should be functional equivalence.

Practical considerations: the challenges of translation

Added to the lofty considerations of politics and epistemology, I also had to consider the nitty-gritty of translation methods. Far from being accurate renditions of the originals, translations are selective and contingent texts, and I would like to illustrate this by reference to an example.

Figure 1. Literal and functional translations

Figure 1 shows a description, by a student, of a typical lesson (top). In the left-hand column, I have presented a literal translation, and the right-hand column presents my rendition, as it appears in one of my interim reports. Stylistics aside, in the space of these few lines I had to make several interpretative decisions. For instance, the phrase ‘we say the lesson’ («λέμε το μάθημα») is semantically ambiguous: taken in isolation it could refer to an oral examination (i.e., ‘we are asked to recount what we were taught in the [previous] lesson’) or to a presentation of new material (as in: “we go over the next few pages in the coursebook”).

In this instance, I was confident in choosing the latter interpretation because of the co-text. This interpretation was consistent with my knowledge of the research setting, and my experience as a person who was educated and has taught in Greece. Still, this was a personal interpretation, and it was important to give readers the opportunity to make alternative interpretations, or at least to help them understand where my interpretation was derived from. To that end, I felt it would be helpful to present original forms alongside the translations.

Ethical considerations, or ‘do no harm’

The last set of considerations that weighed on my mind pertained to the principle of non-maleficence (‘first, do no harm’). In the previous post, I already hinted at the implications of how the English language teachers were portrayed, when they used non-standard forms. The same line of reasoning could be extended to students, as seen below:

Figure 2. Unedited vs. standardised data

Figure 2 shows a response given, in Greek, by a learner who seemed unenthusiastic about language learning. The student used a phonetic spelling of the word “βαριέμαι” (: I’m bored), which violates the orthographic expectations of Modern Greek in a somewhat amusing way (top). It would be possible to render this answer accurately, and attempt to recreate the spelling mistake in the translation (middle). However, the analytical advantages of such a choice seem to be outweighed by the risk of stigmatising the student. In such a case perhaps a standardised rendition of his answer (bottom) would be a better option.

So, what is one to do?

Taken together, all the considerations above posed a number of challenges for representation. Some considerations seemed to privilege maximum transparency, whereas others seemed to suggest a need for measured opacity. In the next blog post in this series, I will focus on some possible solutions, and describe a ‘cline of representational positions’ which can be used as a framework for developing a representational strategy.

Featured Image Credit: Pixabay (Public Domain). Note: this post draws on a paper I presented in the Researching Multilingually seminar in Manchester in April 2012.

Generating data in a multilingual research environment

In the two previous instalments to the Researching Multilingually series, I asked a number of hypothetical questions that researchers in multilingual settings might face, and discussed some challenges involved in obtaining informed consent. This post moves the discussion forward, by looking into questions of data generation. Once again, I will approach this topic by drawing on the experience of my own PhD research, a case study set in a language school in Greece, where English was used as a working language.

When thinking of how to elicit data from the teachers and students in the language school, I originally planned to use English as my default language. This was in line with the school’s monolingual policy, which encouraged the use of English on the premises at all times, for pedagogical reasons (and, I suppose, for reasons of prestige as well). I tacitly acknowledged that some students might struggle with expressing complex concepts in English, and I expected that they would perhaps resort to using Greek. However, my overall attitude towards Greek was one of tolerance rather than encouragement. I certainly had not imagined any problems with using English when interviewing the teachers. I was wrong.

Dealing with non-standard forms

The first problem with my monolingual data generation strategy was the frequency of non-standard data in the interviews. While transcribing the interviews, I was surprised to notice that the participants, who were non-native speakers of English, tended to use non-standard forms, false starts, repetitions and other language infelicities rather frequently. Here’s an extract from one of the interviews, where a young, rather inexperienced and very nervous teacher described how she taught grammar:

…and then ask my students to give me examples. For example (2 sec) I when I teached the passive voice, I gave them the rule, and I told them ‘try to give me an example’. But this didn’t work with everybody because they feel the pressure that ‘I have to use the foreign language’ and they couldn’t do this.

While occasional slips such as *teached are to be expected in oral discourse, especially when a speaker is stressed, I had not realised how frequent they might be. Moreover, they created two serious ethical problems:

  • Firstly, I felt that when I eventually shared the transcripts with the participants, such deviations from ‘correct’ English would likely undermine their confidence in the ability to use the language. Many people report that the first time they hear their recorded voice, they feel uncomfortable with how they sound. I was very apprehensive that the same thing would happen when I confronted teachers with a record of their oral discourse, especially since their professional identity depended on them using English ‘correctly’.
  • Furthermore, I was concerned that if the transcripts were to be made more broadly available, that could result in unfair judgements being made about the teachers’ competence, or the quality of second language education provided in the language school. It wasn’t unimaginable that a professional competitor might use my data to disparage the language school, and it was also quite plausible that if an unfavourable reputation was created, however unjustly, the teachers would find it difficult to find employment elsewhere in the future.

I eventually solved that problem by creative improvisation: I began using several versions of interim transcripts, with different degrees of detail and editing, for purposes such as validation, analysis and dissemination. But I decided that in the next rounds of interviews, I would have to be more flexible with language use.

Dealing with nervousness

Another problem associated with the monolingual data generation strategy was that it tended to make interactions too awkward. By repeatedly listening to the recordings, I began to notice that a number of participants sounded distinctly nervous. In part, I think that this was due to a perception of power differentials between me (an ‘expert’) and the teacher participants, who may have felt that their professional practice was under scrutiny. However, it seems very plausible that these feelings were exacerbated by the fact that we were using English, as I was thought to be a more fluent speaker than they were.

To confirm this tentative hypothesis, I asked some teachers for feedback on the experience of being interviewed in English. Perhaps unsurprisingly, none admitted having experienced any discomfort. However, a number of participants suggested that the opportunity to hold a sophisticated conversation in English was excellent practice for them, which (I think) suggests that the interviews did take them outside their comfort zone.

A revised data generation strategy

To mitigate these problems in subsequent interviews, I began to actively encourage the use of Modern Greek. In practical terms this meant that I would begin a pre-interview phase in Greek, and before starting the main interview I would ask the participating teacher if she would like us to speak in Greek or in English. Here’s an extract, showing how language choice was negotiated:



Teacher: Τώρα [γράφει.
    Now [it’s recording
Achilleas: [°τώρα γράφει.° Πήγα να βάλω το μικρόφωνο στη θήκη που μπαίνουν τα: τέτοια τα ακουστικά.
    [(softly) now it’s recording. (normal voice) I almost put the mike in the plug where the: whatyoucallthem, the earphones go.
Teacher: A:: και ‘συ να νομίζεις ότι [γράφει? και αυτό δε γράψει τίποτα
    Ah:: and you’d think that it [records? but it records nothing
Achilleas: [και να νομίζω ότι γράφει και αυτό να μην
    [and I’d think it records and it doesn’t


Achilleas: Πάντα backup. E::: Ελληνικά ή Αγγλικά; Όπως [θες.
    Always backup. Erm::: Greek or English? Whatever [you like
Teacher: [Α δε με νοιάζει.
    [oh, I don’t mind
Achilleas: °Όπως [θες°.
    (softly) As you [wish
Teacher: [>Περίμενε, άμα ξεκινήσεις εσύ να μιλάς Αγγλικά θα γυρίσω κι εγώ να μιλάω Αγγλικά γιατί αλλιώς ντρέπομαι <
    [(fast)hold on, if you start speaking in English, I’ll switch to English too, because it’s embarrassing otherwise
Achilleas: Θα γυρίσω εγώ στα Αγγλικά.
    I’ll switch to English
Teacher: Ναι, >άμα το γυρίσ [(inaudible)<
    Yes, (fast)if you [swi(inaudible)
Achilleas:  [OK. Fair enough. Is there anything else you’d like to know about, the interview before we begin?
Teacher: >No, we can begin.<
Achilleas: Excellent? Do you feel nervous?
Teacher: YES @@

Unexpected findings

These interviews provided me with a number of unexpected insights. Firstly, I noticed that the teachers tended to code-switch very frequently in the Greek interviews, especially when using professional terminology (Figure 1).

Figure 1. English metalanguage in Greek discourse

Although some sources on Greek ELT will argue otherwise, such metalanguage is available in Modern Greek, so the difficulty the participants experienced in accessing it hints at the Anglo-centric orientation of their teacher education, and the disconnect between the ELT teacher-training programmes and mainstream Greek pedagogy. It also offers clues about the ways in which the teachers at the language school constructed a professional identity, by displaying their mastery of the profession’s metalanguage.

Another finding that surprised me even more was that I sounded considerably less confident in the Modern Greek interviews. The number of repetitions and false starts in my discourse was considerably higher than it had been in the English ones, even though one would have expected my interviewing skills to improve over time. My provisional explanation is that I seem to have been subconsciously privileging the use of English, in which I was more proficient than other participants, in order to compensate for deficiencies in my interpersonal skills. I was using English, in other words, as an instrument of power, and I wasn’t even aware of it.


In summary, when faced with the task of interviewing teachers, I had the option of using either English or Greek. My language of choice was English, and I justified this choice by drawing on the school’s monolingual policy. Conducting the interviews in English, however, reinforced power imbalances between myself and the other participants, and also generated a number of thorny issues which I was unsure how to confront. A more flexible data generation policy, which gave the participants a choice of language, proved to be a better option.

In the next two posts in this series, I will address various considerations relating to how bilingual data might be presented, and I shall conclude this series by presenting a cline of representational options. Till next week!

Image Credit: The LEAF Project @ Flickr | CC BY-SA 2.0

Obtaining informed consent in a multilingual research setting

In the previous post of the Researching Multilingually series, I probed the complexities involved in working in a multilingual research environment. In this post I begin to engage with the practical problems of such research projects, by drawing on the experience from my PhD study, to illustrate the challenges associated with obtaining consent.

Three linguistic dilemmas

My PhD study was set in a language school in Greece, where English was taught to young learners, most of whom (though not all) had Greek cultural backgrounds. Although it would have been possible to explain the purpose of the study to the students in Greek (a language in which they were all fluent), the school had a rigid policy of using only English on the premises. On the other hand, it was far from certain that an explanation in English would be completely understood by everyone.

As the students were underage, I also had to obtain consent from their parents. Per university policy, this had to be done in writing, and Greek seemed to be the default option. However, this decision was complicated by the fact that a substantial number of the learners were born to immigrant parents, mostly from Albania. Some of the parents belonged to the Greek-speaking minority in Albania, and most had acquired Greek, but it was unclear how well any of them could cope with the written form of the language or with the formal register I was using in my consent forms. Ideally, I would have liked to create consent forms in Albanian, despite the translation cost involved. However, given the somewhat insensitive attitudes present in parts of Greek society, I was apprehensive that many students would be reluctant to ask for a non-Greek document, and some Greek-speaking parents might even be offended by being addressed in Albanian.

The third group of participants I had to obtain consent from were the teachers. All the teachers in the language school were Greek, so Greek seemed like a sensible choice, but the language with which they tended to associate their professional identity was English. I therefore felt that if I addressed them in Greek, I might be perceived as patronising, or as questioning their professional competence.

These considerations are summarised in Table 1, below.

Participant group | Greek | English | Albanian
Students | + universally understood; – against the school’s monolingual policy | + pedagogically beneficial?; – informed consent not guaranteed |
Parents | – marginalises linguistic minorities; – informed consent not universally guaranteed | | + respects linguistic minorities; – translation cost; – singles out users of different languages (stigmatisation?)
Teachers | + universally understood | + respects professional identity |

Table 1. Summary of linguistic considerations about eliciting consent

Developing a linguistic strategy

I can’t say that I eventually came up with a perfect solution to these dilemmas, but the strategy I used was highly pragmatic.

In the case of learners, I visited their classes and talked to them, using English appropriate to their linguistic level. Then, building on my teaching background, I used elicitation techniques to confirm comprehension and provide clarifications where necessary. I did not elicit consent orally at that stage, as I thought that students might be reluctant to refuse, but I included a consent section (in Greek and English) on the first page of the bilingual questionnaires that were distributed to the students (Figures 1 and 2).


Figure 1. Greek version of the consent section

Figure 2. English version of the consent section

The parents of underage learners were provided with a letter and an opt-out form written in Modern Greek, on the (somewhat arbitrary) assumption that most of them would be able to read in that language, and their children would be able to convey its gist to those who couldn’t.

My teacher participants, who were competent users of English, were provided with letters describing the research project and consent forms in English.

The rule of thumb was that English (the working language of the school where my research was embedded) would be treated as the default, from which I would deviate only to the extent necessary. I did this for two reasons: First, I wanted to comply with the school’s policy, which required maximising exposure to English on pedagogical grounds. I also felt that the use of English helped to increase transparency and accountability to the University, since neither my supervisors nor our review board would be able to engage with documentation in Modern Greek.

Unexpected complications

It is said that no research plan survives contact with the field. When I started eliciting consent, two unexpected complications came up.

Firstly, a number of teachers reported that the English in the information documents was ‘too good’, and that it made them realise ‘how bad their English was’. At the time, I dismissed the comment, assuming it was a compliment. In retrospect, I have come to realise that they were trying to communicate that the sophisticated language in the documents was intimidating them. To be honest, I am still not sure whether Greek would have been a better option, or whether a more informal register in English would have been a workable compromise.

The other complication related to the equivalence of translations between the documents I was using, and the ones I was submitting to my supervisors for approval. The Greek versions of the documents were written in a bureaucratic register that is often used in communication between schools and parents (e.g., ‘I consent to the participation of my child in this survey’). At the time, it was suggested to me that such language sounded too formal in English, and that a simpler register might be less off-putting. In response to this feedback, I piloted different versions of the document, and discovered that parents tended to perceive the ‘official’ version as being more ‘authoritative’ and ‘serious’. The experience helped me to realise that switching between languages often involved searching for functional equivalents, i.e., wording that has the same intended effect on the readers, rather than direct translations. This was an issue that would recur frequently in the study.


This post looked into some of the considerations and complications of obtaining consent to research in a multilingual setting. The next posts in this series will discuss data generation, and problems and solutions for data representation. Stay tuned!

Image Credit: Roland Tanglao @ Flickr | CC BY 2.0

Researching multilingually: More than meets the eye?

When I started planning for the study that was eventually reported in my PhD thesis, I spent considerable time thinking about suitable participants, appropriate research methods, and ways to analyse and synthesise data. Perhaps surprisingly, given that I was working in the qualitative tradition, and my data was (almost) all language, the linguistic intricacies of doing research in an English language school in Greece, and then reporting this research back in English, didn’t occur to me at that point.

This series of posts on Researching Multilingually is a retrospective account of the challenges I faced as I conducted the study. It will consist of four posts (in addition to this introduction), which will focus on:

  1. The dilemmas associated with obtaining consent;
  2. Challenges involving data generation;
  3. Considerations that impacted data representation; and
  4. A cline of representational positions for reporting research.

