Q: Let’s first start with the basics: what is good- and bad-quality data?
Sasha: “We define ‘good quality’ as real data coming from real people, and serving a specific purpose. This data has four characteristics; it is accurate, complete, consistent, and valid.”
Alicia: “Historically, we have considered participants, and by extension their data, as either good or bad. However, bad-quality data varies in severity. For example, all participants can make mistakes in their responses, such as a typo here or a wrong click there, and this may reduce the quality of the data. We are not concerned about these minor data-quality issues. What keeps us up at night is severely bad-quality data, which can arise for one of two reasons: fraud and lack of engagement.”
Q: What is the difference between fraud and lack of engagement amongst survey respondents?
Sasha: “Fraudulent responses refer to the intentional falsification of data. One example is a bot completing a survey multiple times. These types of responses are highly undesirable since the data is false and often meaningless. On the other hand, our data can also be severely compromised when participants are disengaged – they complete the survey as quickly as possible with the least amount of effort. The resulting data tends to contain a minimal level of detail and very little variability, and although not deceptive, it does not contribute much to the overall insights.”
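The disengagement signals Sasha mentions (minimal detail and very little variability) lend themselves to automated screening. The sketch below is purely illustrative, not InSites’ actual process; the field names and thresholds are hypothetical, chosen only to show how such checks are commonly expressed.

```python
# Illustrative sketch: flagging potentially disengaged survey responses.
# Field names and thresholds are hypothetical examples, not a real pipeline.
from statistics import pstdev

def flag_disengaged(response, min_seconds=120, min_rating_stdev=0.5):
    """Return a list of quality flags raised by one survey response."""
    flags = []
    # Speeding: finishing far faster than a realistic reading pace allows.
    if response["duration_seconds"] < min_seconds:
        flags.append("speeder")
    # Straight-lining: near-identical answers across a battery of rating scales.
    ratings = response["ratings"]
    if len(ratings) > 1 and pstdev(ratings) < min_rating_stdev:
        flags.append("straight_lining")
    # Minimal detail: very short free-text answers carry few insights.
    if all(len(text.split()) < 3 for text in response["open_ends"]):
        flags.append("low_detail")
    return flags

example = {
    "duration_seconds": 95,
    "ratings": [4, 4, 4, 4, 4, 4],
    "open_ends": ["good", "ok"],
}
print(flag_disengaged(example))  # ['speeder', 'straight_lining', 'low_detail']
```

In practice such flags are usually reviewed by a researcher rather than applied automatically, since an occasional fast completion or terse answer can still be genuine.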
Alicia: “Exactly, so both motivations result in poor-quality data, but for completely different reasons. In the first instance, a purposefully deceptive participant is providing data through a bot or is untruthful about their experiences. In the second instance, the participant is providing poor-quality data because they are not motivated or engaged in the survey. We cannot influence the behaviour of participants who fall in the first category – but we have measures in place to avoid recruiting these types of participants. However, we can influence the behaviour of participants in the second category and reduce the likelihood of them becoming disengaged.”
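The fraud example given earlier, a bot completing a survey multiple times, is typically caught by deduplicating submissions. The following minimal sketch assumes a hypothetical fingerprint built from IP address and user agent; real panels combine many more signals, and these names are illustrative only.

```python
# Illustrative sketch: spotting repeated submissions that may indicate a bot.
# The fingerprint fields (ip, user_agent) are hypothetical stand-ins for the
# richer device and behavioural signals a real panel would combine.
from collections import Counter
import hashlib

def fingerprint(submission):
    """Derive a coarse identity hash from stable submission attributes."""
    raw = "|".join([submission["ip"], submission["user_agent"]])
    return hashlib.sha256(raw.encode()).hexdigest()

def flag_duplicates(submissions, max_per_fingerprint=1):
    """Return the ids of submissions sharing an over-represented fingerprint."""
    counts = Counter(fingerprint(s) for s in submissions)
    return [s["id"] for s in submissions
            if counts[fingerprint(s)] > max_per_fingerprint]

subs = [
    {"id": "r1", "ip": "203.0.113.7", "user_agent": "UA-A"},
    {"id": "r2", "ip": "203.0.113.7", "user_agent": "UA-A"},
    {"id": "r3", "ip": "198.51.100.4", "user_agent": "UA-B"},
]
print(flag_duplicates(subs))  # ['r1', 'r2']
```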
Q: Let’s dive into the concrete measures and processes later on. Can you first tell us a bit more about your team’s general approach to safeguarding data quality?
Sasha: “The Network Quality team supports all InSites projects by ensuring high-quality participation, and consequently high-quality data. We want to empower the consultants with reliable, valid data they can trust, so that they can deliver outstanding deliverables and insights.
The team is new – it was formed in September 2021 – and has adopted an ever-evolving approach: we are constantly studying, developing, and updating our knowledge and approaches so that we can keep up with current challenges and existing research. We also use a grassroots approach, by getting feedback from our colleagues about the challenges that they experience and the trends that they have noticed in their data, so that we can stay up to date.
Based on all these approaches, our knowledge, and our expertise, we have put into place extensive quality-check processes that follow every step of the research process, including project set-up, data collection, and post-data collection to safeguard the best possible data quality.”
Q: And specifically, what processes does your team follow to ensure superior quality data?
Sasha: “The first step focuses on a carefully crafted research design, tailored to our clients’ needs. Through careful source selection and innovative thinking, we ensure that questionnaires help us safeguard data quality.
A well-designed, well-phrased, and engaging questionnaire that is tailored to the project’s objectives ensures excellent quality data in at least two ways. First, a questionnaire that is properly designed and contains appropriate questions is more likely to yield data that answers the research question. Second, an engaging questionnaire is more likely to keep participants focused, making sure they complete all questions, and this subsequently results in better-quality data.”
Alicia: “So, as an example, assume that our client was interested in a sensitive topic, like contraceptive use and preferences, or personal saving habits. A poorly designed questionnaire is one that begins with these questions; participants might be put off by being asked something so personal right away, and might decide to terminate the survey.
It is important to remember that our participants are human; they have insights, experiences, and opinions that we are interested in, and that is why we want their input. However, participants can also get bored, have bad days, make mistakes, or feel rushed. Knowing this, we must consider how the research experience can affect our participants’ behaviour.
If surveys are too long, too boring, or too demanding, then participants will no longer provide good-quality data. Therefore, it becomes increasingly important to optimize the participants’ experience and show that we value their time and effort.”
Sasha: “There are many things that we can do as researchers to ensure the quality of the data that the participant provides. Everything can influence participants: deciding on the right question type (for example, rating scales to capture quantitative variability, and open questions for richer insights); preventing boredom by varying the types of questions asked; wording each question clearly so that the participant understands what is asked, avoiding jargon, overly formal language, and internet speak; ordering and numbering the answer options properly; and even the overall flow of the questions in the questionnaire.”