Predicting consumer behavior in online communities via Artificial Intelligence

Co-written with Steven Debaere, Data Scientist at InSites Consulting and PhD Candidate at IÉSEG School of Management.

If you have been active in the market research industry over the past ten years, you have probably witnessed the evolution of Market Research Online Communities (MROCs) or, as we like to call them, Consumer Consulting Boards. By going online, we could easily reach many consumers and switch between geographical areas. Going mobile allowed us to immerse ourselves more deeply in consumers’ daily lives. And going structural resulted in ongoing conversations with anyone, anytime and anywhere. Communities have helped market research step up to a higher level and are enormously popular in the industry.
The 2016 GRIT Report indicates that 58% of clients and 59% of suppliers have adopted communities within their business. This popularity is expected to keep growing. Industry watcher Ray Poynter expects that online communities, which took up only 5% of the market research budget in 2016, will grow to 70% by 2026.

Market Research Online Communities: The pursuit & challenge of sustainable success

But are we ready to guarantee this success in the future? Are we well prepared?
It’s important to mention that we can rely on industry experience and best practices to pursue community success. For example, we already know how to use different recruitment platforms to identify those members that are interested and interesting for our communities. Right now, we have expert knowledge on how to manage and moderate community dynamics to achieve favorable conditions to do market research. Moreover, we have done extensive research on gamification and engagement techniques to encourage participation. More info is available on
But can we use the same techniques and follow similar successful past practices in the future also?
The answer is “maybe not”, mainly due to two challenges that will only become more important as community adoption and volume increase. First, MROCs are in essence data-loaded environments that accumulate additional data every day. This big data characteristic puts pressure on the moderator’s resources to process and analyze community content. Second, member disengagement is a fundamental problem for healthy research communities. When members participate insufficiently in the topics posted in a community (low quantity), or when what they say does not contain anything valuable (low quality), the moderator may be unable to derive useful consumer insights from the community. Additionally, as more communities are organized, we may all end up drawing on the same pool of participants, putting pressure on members’ motivation to participate.
Therefore, it is important to explore new approaches on how to increase the participant’s long-term value and to effectively deal with the problem of member disengagement.

Proactive community management: Community Moderation 2.0

Proactive community management is a moderation practice to anticipate predicted member disengagement and take proactive actions to prevent disengagement behavior from negatively impacting the community. Proactive community management leverages the data-rich environment of the community and relies on technological innovations to support the moderator in managing the community more effectively. Proactive community management can be considered to be the real-life realization of the movie Minority Report in research communities and consists of a three-step approach: detect, predict and prevent.
In research communities, moderators are usually on their own and have to rely on their own efforts to manage the community and combat member disengagement. But this is rather crazy. On the one hand, communities are data-loaded environments, yet we use this data only in a limited way, to derive consumer insights, and do nothing extra with it. On the other hand, technologies that are already available, such as text mining, Natural Language Processing and behavioral analysis, make it possible to exploit data effectively and get more out of it. So why not adopt these techniques and use them on community data to identify community insights that could support the moderator in managing the community?
That is exactly what we did. In our research project, we explored 150,000 posts from 3 years of data, drawn from 10 communities for 7 brands; we applied text mining and behavioral analysis techniques to construct about 7M data points and unravel valuable community insights. We then used these variables to detect member disengagement and identify relevant predictors. Member disengagement can be measured along a quantity and a quality dimension: respectively, the percentage of community topics in which a member actively participated and the number of cognitive words a member uses per post. Cognitive words such as “because” and “think” reflect the effort that has been put into a post, and their number has been identified as a reliable indicator of the post’s quality.
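As an illustration only, the two measures could be computed along the following lines. This is a minimal sketch, not the study's actual pipeline: the word list `COGNITIVE_WORDS`, the tokenization, and the function names are all assumptions made for the example.

```python
import re

# Illustrative, incomplete word list (assumption: the study's lexicon is not given here).
COGNITIVE_WORDS = {"because", "think", "know", "consider", "reason", "understand"}


def quantity_score(topics_participated, topics_posted):
    """Share of posted community topics the member actively took part in (0.0 to 1.0)."""
    if topics_posted == 0:
        return 0.0
    return topics_participated / topics_posted


def quality_score(posts):
    """Average number of cognitive words per post across a member's posts."""
    if not posts:
        return 0.0
    total = 0
    for post in posts:
        # Lowercase and split into simple word tokens before matching.
        tokens = re.findall(r"[a-z']+", post.lower())
        total += sum(1 for token in tokens if token in COGNITIVE_WORDS)
    return total / len(posts)


member_quantity = quantity_score(topics_participated=3, topics_posted=4)
member_quality = quality_score(["I think so because it works", "Nice"])
```

Here the member took part in three of four topics, and the two posts contain two cognitive words in total, which averages to one per post.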
Now how can the detection of member disengagement be made practical for the moderator in a community context? By using a cut-off value to distinguish between high and low activation levels of participation and combining quantity and quality dimensions, we can come up with a four-quadrant framework to classify community members and identify four different community behavior profiles.
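The four-quadrant classification can be sketched as follows. The cut-off values and the profile labels are hypothetical; the article does not state the thresholds or the names used in the study.

```python
def classify_member(quantity, quality, quantity_cutoff=0.5, quality_cutoff=1.0):
    """Place a member in one of four quadrants.

    quantity: share of topics participated in (0.0 to 1.0).
    quality: average number of cognitive words per post.
    Both cut-offs are illustrative assumptions, not values from the study.
    """
    high_quantity = quantity >= quantity_cutoff
    high_quality = quality >= quality_cutoff
    if high_quantity and high_quality:
        return "high quantity / high quality"
    if high_quantity:
        return "high quantity / low quality"
    if high_quality:
        return "low quantity / high quality"
    return "low quantity / low quality"


profile = classify_member(quantity=0.75, quality=0.4)
```

A member who joins most topics but writes low-effort posts lands in the "high quantity / low quality" quadrant, which calls for a different response than a member who rarely shows up at all.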
Figure: Detect member disengagement
But why only detect and look at the past, when it’s possible to consider the future? Why only detect, when we can predict? This can be done in real life, as has been proven by many successful applications ranging from Facebook to the Obama campaign. Predictive analytics and Artificial Intelligence make it possible to predict future events. We can adopt this in a community context by creating prediction models for member disengagement, that is, for low-quantity and low-quality behavior. The output is a probability that reflects the risk that a member will demonstrate disengagement behavior in the future.
How can we make predictions? We leverage historical data and use machine learning techniques to identify patterns in past data that explain future behavior. The intuitive explanation is that we try to find habits that explain future disengagement behavior. Human behavior is very predictable; this also goes for the community context. We can then adopt the output of the two prediction models in our four-quadrant framework to give insights into the future behavior of each participant. The moderator can use the prediction models, historical data and this framework, in order to identify what each participant’s future profile will be.
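To make the idea concrete, here is a minimal sketch of such a prediction model: a small logistic regression trained with gradient descent on toy data. The article does not say which machine learning technique was used, so the choice of logistic regression, the two features and all data values are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Probability that the member shows disengagement behavior."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical features per member: [past participation rate, scaled time since last post].
# Label 1 means the member later showed low-quantity behavior.
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.1], [0.3, 0.8], [0.2, 0.9], [0.1, 0.7]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
risk_active = predict_risk(w, b, [0.85, 0.1])
risk_fading = predict_risk(w, b, [0.15, 0.9])
```

One such model would be trained per dimension (low quantity and low quality), and the two resulting probabilities then place each member in a predicted quadrant. A model of this kind also exposes its weights, so a moderator can see which habits drive the predicted risk.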
You may wonder whether these models are reliable. Evaluating our models on unseen data allows us to assess the quality of the predictions. We see that the accuracy is rather good: for low quantity, we make correct predictions in 78% of the cases, and for low quality, we make correct classifications 78% of the time. To put these numbers in perspective, randomly deciding between high and low activation levels corresponds to a 50% prediction accuracy, so our models clearly perform better than that. You may wonder how this compares to the prediction capability of the moderator. In fact, it does not really matter. When scaling to the whole community member population or making fast predictions, the moderator can never beat the model. So overall, it’s better to rely on the models for the prediction phase. The position of the moderator becomes more important in the third step.
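The evaluation described above can be sketched as a simple accuracy check on held-out members. The risks and labels below are made-up values for illustration, not results from the study, and the 0.5 decision threshold is an assumption.

```python
def accuracy(predicted_risks, actual_labels, threshold=0.5):
    """Share of held-out members whose predicted class matches what actually happened."""
    if len(predicted_risks) != len(actual_labels):
        raise ValueError("predictions and labels must have the same length")
    if not actual_labels:
        raise ValueError("need at least one held-out member")
    correct = sum(
        1
        for risk, label in zip(predicted_risks, actual_labels)
        if (risk >= threshold) == bool(label)
    )
    return correct / len(actual_labels)


# Made-up held-out data: model risk per member, and whether the member actually disengaged.
held_out_risks = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3, 0.75, 0.15]
held_out_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]

model_accuracy = accuracy(held_out_risks, held_out_labels)
RANDOM_BASELINE = 0.5  # the article's figure for randomly choosing high or low
lift_over_random = model_accuracy - RANDOM_BASELINE
```

In this toy set the model is right for eight of ten members. Comparing against the 50% baseline is only fair when high and low activation are roughly equally common; with skewed classes, always guessing the majority class is the tougher benchmark.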
Figure: Predict member disengagement
Now that we can detect and predict member disengagement, why not take actions on our predictions, so we can anticipate expected member disengagement to prevent negative community impact? We can follow a three-stage approach in particular, where we combine the strengths of the first two steps with those of the moderator in the third step:

  1. Identify: the prediction models predict each member’s future profile; the framework classifies each member into one of the four quadrants to identify their future participation behavior.
  2. Contextualize: historical community data and CRM info can be retrieved to provide the right context for the moderator. Moreover, actions that have worked successfully in the past could even be recommended to proactively correct disengagement behavior.
  3. Finalize: the moderator interprets all the information from the previous steps and uses their intuition and creativity to finalize the prevention campaign by deciding which action needs to be taken.
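The three stages above can be sketched as a small pipeline in which the model output and the context are prepared automatically and the final decision stays with the moderator. Everything here is hypothetical: the `PreventionCase` structure, the action catalogue, the risk cut-off and the example values are assumptions for illustration, not the authors' tooling.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical catalogue of actions that worked in the past, per predicted profile.
SUGGESTED_ACTIONS = {
    "low quantity / high quality": ["personal invitation to an open topic"],
    "high quantity / low quality": ["follow-up probing question"],
    "low quantity / low quality": ["one-to-one check-in message"],
    "high quantity / high quality": [],
}


@dataclass
class PreventionCase:
    member_id: str
    predicted_profile: str
    context: dict = field(default_factory=dict)
    suggested_actions: list = field(default_factory=list)
    final_action: Optional[str] = None


def identify(member_id, p_low_quantity, p_low_quality, risk_cutoff=0.5):
    """Stage 1: turn the two model probabilities into a predicted quadrant."""
    quantity = "low" if p_low_quantity >= risk_cutoff else "high"
    quality = "low" if p_low_quality >= risk_cutoff else "high"
    return PreventionCase(member_id, f"{quantity} quantity / {quality} quality")


def contextualize(case, history):
    """Stage 2: attach historical or CRM context and recommended actions."""
    case.context = dict(history)
    case.suggested_actions = list(SUGGESTED_ACTIONS.get(case.predicted_profile, []))
    return case


def finalize(case, moderator_choice):
    """Stage 3: the moderator, not the model, decides which action is taken."""
    case.final_action = moderator_choice
    return case


case = identify("member-17", p_low_quantity=0.82, p_low_quality=0.30)
case = contextualize(case, {"days_since_last_post": 21})
case = finalize(case, moderator_choice=case.suggested_actions[0])
```

Keeping `finalize` as an explicit human step reflects the division of labor in the list: the model scales the prediction, while the moderator supplies the judgment.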

You may wonder which corrective actions we should take. To answer that question, we can rely on industry experience on engagement actions. The only differences are that instead of using a one-size-fits-all approach for all the members and using it reactively when negative impact has already been recognized, in our approach we personalize the prevention action for the individual member and use it proactively to prevent destructive behavior from impacting the community.
Figure: Prevent member disengagement

“Yes, please” or “So what”?

What if you wish to adopt this approach in your community? It is important to take two aspects into account. First, favor white-box prediction models over black-box prediction models, and value understanding over predictive accuracy. You can have the best prediction model in the world, but if nobody understands how it works and what it does, it is unlikely ever to be used in the business. Second, it’s important to explore these approaches in a multi-disciplinary group. By considering the opinion of every stakeholder, everyone will become more aware of what it can actually mean for the business, which will ease the adoption process. And what if you choose not to adopt this approach in your community? That would be a lost opportunity to effectively personalize engagement actions and proactively manage the community.
This study received the Marketing Science Institute Research Accelerator Award and was nominated for the German Online Research (GOR) Best Practice Award 2017.
