Presenting multilingual data: Some options (Achilleas Kostoulas)

In the previous post of the Researching Multilingually series, I discussed some considerations that impacted the representation of multilingual data. In this post, I follow up on those considerations, by presenting four options that can be used to present multilingual data in a research report. These options, which can be thought of as a ‘cline of representational positions’, are presented in Figure 1.

Figure 1. A cline of representational positions

Presenting verbatim data

Presenting verbatim data along with their translation, as I have done in Figure 2, is the most transparent of the four options (although one should always remember that the transcribed data are already a reduced form of what was actually communicated). In addition to the fairly obvious fact that such a representational option promotes the visibility of languages other than English in scholarly communication, one of its main advantages is that it allows bilingual readers to independently engage with the data in their original form. This creates a more visible ‘audit trail’, which helps to generate confidence in the findings; plus, it allows readers to re-interpret the data and generate new insights.

Figure 2. Verbatim bilingual presentation

Secondly, a bilingual presentation can highlight theoretically significant aspects of form that a translation would mask. In the extract presented in Figure 2, for instance, the presentation of the Greek extract draws attention to the fact that the interviewee was making extensive use of English technical vocabulary (highlighted in red). This might be significant, because it offers insights into how she constructed her professional identity and because it shows how English was encroaching into the discourse domain of professional communication in my research setting.

In spite of these advantages, a verbatim bilingual presentation may not always be desirable. First and foremost, the dissemination outlet may not be able to typographically support it, or (more commonly) it may not be willing to offer this option, because of the higher printing costs. Secondly, one needs to consider whether the word-space used for bilingual representation is at the expense of the argument one is trying to make. In a large-scale project such as a thesis, this might not seem like an important consideration (and in my case, I was able to negotiate a 10% increase to the word limit in order to cater for multilingual data). However, in academic journals, where space is at a premium, balancing rich qualitative data and interpretation is often a challenge, even if you don’t have to consider bilingual data.

Presenting standardised data

The second option in the cline is presenting standardised data. This is fairly similar to what was described above, except that the spelling irregularities and non-standard forms are subtly changed to conform to the standard variety, as shown in Figure 3. One reason for doing this is to avoid stigmatising the research participants: in an article evocatively titled ‘Ritin folklower daun ‘rong’, Dennis Preston convincingly argues that a transcription that is too faithful constitutes a misrepresentation of what was said*. In an earlier post in this series, I also argued that in my own thesis, presenting raw data, warts and all, risked harming my research participants, and had to be avoided on ethical grounds.

Figure 3. Standardised bilingual data

In cases where standardisation is necessary, there are two important caveats. First, standardisation needs to be done after data analysis, in order to avoid compromising the integrity of the dataset. Secondly, researchers should explicitly account for (a) why standardisation was desirable; (b) which ‘standard’ was used, and why; and (c) how the dataset changed as a result of this intervention.

Presenting unabridged data

The third option takes us into monolingual representation: if a dataset contains similar information in several languages, researchers may pick out a typical piece of data as representative of the whole (Figure 4). Selecting data in English to symbolically represent the entire dataset is a fairly pragmatic solution, which completely eschews the challenges of translation and the dilemmas associated with bilingual representation.

Figure 4. Unabridged monolingual data

While the simplicity of this option makes it fairly attractive, its overuse can lead to ‘silencing’ the non-English data. This is politically problematic, as it makes an ideological statement (however unintended) about the primacy of the English language (Roberts 1997: 170). It may also be epistemologically problematic, if there are subtle differences between the English and non-English data. In my research, for example, Greek tended to be used by students who were less successful in learning English and/or had more negative attitudes towards the language. In other words, language choice was associated with subtle differences in content as well. While I think I was sufficiently alert to such differentiations, it is conceivable that a systematic bias for English data might mask such differentiation.

To mitigate such risks, it is helpful for researchers to reflect on how the typical extracts are selected. One strategy that I found helpful involved using multiple more-or-less similar data extracts in my text, and comparing their rhetorical effect. I recorded these thoughts in reflexive memos, and though I will admit that, more often than not, my choice of examples was arbitrary, this process helped me to refine my understanding of whatever I was trying to describe.

Presenting summarised data

The final option in the cline involves summarising data in English (Figure 5). Condensing data can be appropriate when emphasis is on content rather than form, as it helps researchers to present information economically, and enhances the readability of the data. In addition, it helps to preserve some measured opacity when a detailed presentation is undesirable.

Figure 5. Summarised data

There are, however, several disadvantages to such a representation strategy. Most importantly, it interjects the researcher between the data and the reader. While such a risk is, arguably, present in all representation strategies, in this case the researcher’s interpretation of the data becomes very prominent, and it is not moderated by access to the original data. In doing so, this strategy risks violating what has been termed the “validity through transparency and access principle” (Nikander 2008: 227). Secondly, the re-voicing of the data risks de-voicing the research participants, which can be epistemologically problematic, and ethically dubious.

The risks mentioned above can be mitigated somewhat by using appropriate research methods. For instance, this might involve having the research participants validate the condensed texts. In addition, a heightened degree of reflexivity might be helpful, as it would allow the researcher to have greater awareness of their own presence in the re-voiced text. Similarly, reflexive statements in the research report might counterbalance the opacity of the data.

A flexible representation strategy

It should be obvious, from the discussion above, that all the representation options offer particular affordances, but are also associated with different risks. What I think this suggests is that each option in the cline is better suited to different instances of data, or (conversely) that it may be possible to combine more than one option, flexibly and eclectically, in the same writing project. This should not be taken as a warrant for opportunistic ad hocery; rather, what I wish to suggest is that researchers reflect on the range of possible options for each instantiation of data, and make an informed decision on what representation option is best suited to it.

Notes

1) This post, and the one that preceded it, are based on a presentation I gave at the Researching Multilingually Seminar in Manchester on 22-23 May 2012 (of which more in the following note). The presentation slides can be viewed below:

2) This post concludes the Researching Multilingually series (for the time being, at least). The name of this series of posts is derived from a seminal research project undertaken by Jane Andrews (University of the West of England), Mariam Attia (Durham University), Richard Fay (University of Manchester) and Prue Holmes (Durham University). The project website contains lots of information about doing multilingual research, as well as a very useful collection of references on multilingual research methodology.

3) The full reference for Preston’s article is: Preston, D. (1982). ‘Ritin folklower daun ‘rong. Journal of American Folklore 95, 304-326.

4) The Featured Image is from The Leaf Project @ Flickr, and it is made available under a Creative Commons Attribution & Share Alike (CC BY-SA 2.0) licence.