Continuing the series of posts on how to write more effective questionnaires (previous posts: 1, 2), I would now like us to tackle the often-problematic ‘demographic data’ section. The demographics section is where respondents are asked to provide details about their age, the make-up of their family, their parents’ jobs and income, and so on.
What struck me, when I was vetting such questionnaires, was that although most student researchers could speak intelligently about all the items in the surveys they had designed, they seemed unable to provide a rationale for the demographics section. All too often, I was told that this section was there ‘because all questionnaires must have it’. So in this post, I want to go into the purposes that the demographics section serves and how to best use it.
What is the demographics section for?
There are two primary reasons why a researcher might want to include demographic questions in a questionnaire survey: (a) because demographic data help to answer the research questions, and (b) because they help to describe the sample.
In the first case, there is a clear way in which the information about the participants informs the findings: for example, a researcher might be interested in finding out how family income or parental educational attainment impacts learning; or in finding the relation between teacher qualifications and teaching style. When such relations between the data collected and the expected findings are explicit, that’s great. What is less great is a tendency to collect any data that might conceivably be relevant, in the hope that an effect might show up in analysis: such an approach is very often a waste of respondents’ and the researchers’ time.
In the second case, the use of demographic data is ancillary: researchers collect and report data about their sample, so that readers might be able to account for similarities and differences across studies. In such cases, where researchers are interested in a summary description rather than individual responses, it makes sense to check whether such data are already available. For instance, it might not be necessary or efficient to ask every participant about their family income, if you already know that the school’s catchment area is a middle-class neighbourhood.
Such data are surprisingly easy to find: in our school, for example, summary information about the demographics of the students (age groups, family size, parental education, and family income) was available to researchers upon request. Useful information can also be found in census reports and commercial databases such as ACORN, or perhaps obtained from the local education authorities. Published research which explicitly describes the demographics of geographical areas may also be found in the literature. In addition to saving time, using information from such sources ensures that the data are easier to compare across studies. So, in short, before embarking on data collection, just ask!
What’s wrong with collecting demographic data from scratch?
Including a demographics section in a questionnaire survey is associated with three potential problems: It risks alienating the respondents, it generates respondent fatigue, and it creates possible liabilities.
When personal questions are included in a survey, especially at the beginning of a questionnaire, they risk unduly alienating or alarming respondents. Even when the usual reassurances about confidentiality and anonymity are provided, some respondents may be reluctant to share information that they consider sensitive, or information through which they feel that they might be identified. In my experience, questions about family income are considered sensitive by many Greeks, and students often avoid answering them. Other respondents may be uncomfortable sharing information about their family status, religious affiliation, languages spoken at home, etc. Such questions have a way of creating distrust, and should be handled tactfully.
Secondly, it seems like a bad idea to waste the respondents’ time, energy and good will, by making them fill out long forms with information that may not be strictly necessary. Long demographics sections cause respondent fatigue, which means that respondents might either quit the questionnaire before it is completed, or engage with the last sections in a very superficial way (e.g., by selecting the same answer in all items).
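For readers who process their responses with a script, the ‘same answer in all items’ pattern mentioned above can be flagged automatically. What follows is only a minimal sketch in Python: the function name, the respondent IDs, the sample answers and the four-item threshold are all illustrative assumptions, and a flagged response is a prompt to look more closely, not proof of fatigue.

```python
def is_straight_lined(answers, min_items=4):
    """Return True if every answered item in a section has the same value.

    `answers` is a list of scale responses, with None for skipped items.
    `min_items` is an assumed threshold: with fewer answered items than this,
    identical answers are too likely to be genuine to be worth flagging.
    """
    answered = [a for a in answers if a is not None]
    if len(answered) < min_items:
        return False
    return len(set(answered)) == 1


# Hypothetical responses to the last section of a questionnaire (1-5 scale).
last_section = {
    "r01": [4, 4, 4, 4, 4],
    "r02": [2, 5, 3, 4, 1],
    "r03": [3, None, 3, 3, 3],
    "r04": [5, 5],
}

flagged = [rid for rid, answers in last_section.items()
           if is_straight_lined(answers)]
print(flagged)
```

If many responses are flagged in the final sections only, that may suggest the questionnaire as a whole (demographics included) is too long.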
Finally, demographics sections risk making respondents more self-aware, or even identifying them. This is especially true in small-scale surveys, such as the ones typical of student research. To offer a personal example, I was the only male MFL specialist in the last school where I was employed, so when handed questionnaires to complete, awareness that I was identifiable influenced which questions I answered and how I answered them. In addition, it has been my experience that many student researchers seem blithely unaware of the responsibilities involved in collecting personal and sensitive data, and they usually lack the experience and resources to comply with legal requirements for processing them.
How can a demographics section be improved upon?
From the paragraphs above, it should be clear that you had best avoid collecting information that you don’t strictly need. When it comes to collecting information that is indeed necessary, here are some tips:
- Avoid embarrassing the respondents: For example, some respondents may feel uncomfortable placing themselves in the highest age-group, or the lowest educational attainment group. You can easily avoid such problems by adding more possible responses to your questions: For instance, I often recommended that questions asking about the respondents’ age include the options ‘51-60’ and ‘over 60’, rather than just ‘over 50’, so as to avoid putting respondents on the spot.
- Allow respondents to opt out: Respondents should always be given the option of not answering any or all of the questions in the demographics section. At minimum, a ‘prefer not to say’ option should be included in every item. You may also want to include a statement reaffirming consent at the top of the demographics section. Here’s a possible format: “This section asks questions about you. This information is necessary [insert brief justification]. The data you share with us will not be used to personally identify you, and will not be passed on to anyone else. If you prefer not to answer these questions, tick the following box …”
- Place the demographics section at the end of the questionnaire. This will help to minimize the effects of respondent fatigue (see above).
- If you need the demographic information for descriptive purposes only, i.e., if you do not plan to analyse it in conjunction with other questions in the survey, consider placing the demographics section on a separate page. This can then be detached from the rest of the questionnaire and analysed independently from the other data as an additional anonymity safeguard.
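For questionnaires administered electronically, the ‘separate page’ idea in the last tip can be approximated at the data-handling stage. The sketch below is one possible approach in Python, not a prescribed procedure: the field names (`age_group`, `family_size`, `q1`, `q2`), the sample records and the helper `detach_demographics` are all hypothetical. It splits each response into a main part and a demographic part, fills unanswered demographic items with ‘prefer not to say’, and shuffles the demographic rows so that they cannot be re-linked to the main answers by row order.

```python
import random
from collections import Counter

# Hypothetical names of the demographic items in the survey export.
DEMOGRAPHIC_FIELDS = {"age_group", "family_size"}


def detach_demographics(records, demographic_fields, seed=None):
    """Split responses into main answers and a shuffled demographics table."""
    main, demographics = [], []
    for record in records:
        # Main answers: everything except the demographic items.
        main.append({k: v for k, v in record.items()
                     if k not in demographic_fields})
        # Demographic items: unanswered ones count as an explicit opt-out.
        demographics.append({k: record.get(k, "prefer not to say")
                             for k in sorted(demographic_fields)})
    # Shuffle so that row position no longer links the two tables.
    random.Random(seed).shuffle(demographics)
    return main, demographics


# Hypothetical responses; note that no names or respondent IDs are stored.
records = [
    {"q1": 4, "q2": 2, "age_group": "31-40", "family_size": "3-4"},
    {"q1": 5, "q2": 1, "age_group": "51-60", "family_size": "1-2"},
    {"q1": 3, "q2": 3, "age_group": "prefer not to say"},
]

main, demographics = detach_demographics(records, DEMOGRAPHIC_FIELDS, seed=0)

# Summary description of the sample, independent of individual answers.
age_counts = Counter(row["age_group"] for row in demographics)
print(age_counts)
```

Note that this only works for descriptive reporting: once the two tables are separated, the demographic data can no longer be analysed in conjunction with the other answers, so the split should be made only when that is what you intend.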
In summary, demographic sections in questionnaires should be designed on a strict ‘need-to-know’ basis; alternative sources of data must be considered before personal or sensitive data are collected; and the format and sequencing of these sections need to be such that they do not affect other sections of the questionnaire.
This is the third in a series of five blog posts on designing more efficient questionnaire surveys. Previous posts looked into the wording and bias, and the structure and sequencing, of questionnaire items; in the posts that follow I will describe how scale items can be used to elicit information and give some tips on overall questionnaire layout. Till next time!