Breaking Blue (a Cello Health Insight Company) – 20 March 2019
Source Breaking Blue

Automated analysis of open-ended text survey data is an appealing prospect. It offers benefits beyond cost savings and reduced lead times. Done successfully, it gives direct and in-depth access to participants’ views, expressed in their own words, without the intervention of an interviewer or a coder, which can introduce human error and variability.

Today, text analytics is a huge business and is among the most popular innovations within the current research landscape. However, within the research industry, there has been little change in usage in recent years. While just under half of research agencies report using, or considering using, text analytics, awareness of the possibilities it offers appears to be somewhat limited, with many agencies describing straightforward coding exercises or word clouds.

We identified a need to look more closely at the true strengths of different approaches, the main barriers to their adoption, and how these might be overcome. For our experiment, we decided to use two approaches that can be applied easily to any sample of text. Specifically, both use a lexicon-based method rather than machine learning. This means that we focused on methods that any research company could apply “off the shelf” rather than methods that would need developing or adapting to the task in hand.
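To illustrate what “lexicon-based” means in general terms, here is a minimal Python sketch. It is not the scoring used by either tool in this study: the word list, the weights, and the function name `lexicon_score` are all hypothetical, chosen only to show the principle of looking words up in a fixed dictionary rather than training a model.

```python
import re

# Hypothetical toy lexicon (word -> polarity weight); an assumption for
# illustration only. Real lexicons contain thousands of entries.
TOY_LEXICON = {
    "fun": 1.0,
    "relaxing": 1.0,
    "enjoyment": 1.0,
    "boring": -1.0,
    "stress": -1.0,
}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Return the average polarity of lexicon words found in the text.

    Words that are not in the lexicon are ignored; a response with no
    lexicon words scores 0.0.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[token] for token in tokens if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


mixed = lexicon_score("Work is boring but sometimes fun")
positive = lexicon_score("Relaxing and fun")
```

Because the score depends entirely on which words are in the dictionary and how they are weighted, two lexicon-based tools can plausibly score the same response differently.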

We also felt it would be useful to include more than one market. Leaving aside the issue of translation from other languages, we decided to include two countries that use different variants of English: the UK and India. Our technology clients have a specific interest in both of these markets, and we believed that understanding more about general lifestyles there would provide valuable background to the survey findings.

We chose as our topic something that everyone understands and has an opinion about: work and leisure (or play). In both India and the United Kingdom, we carried out more than 600 interviews via a short online survey with a sample of people aged between 18 and 65 years, including a spread of gender, age, and region in each market. The survey covered people’s attitudes to work and play, and the words they associate with each, alongside some contextual questions around demographics, Internet and app usage. It yielded a wealth of interesting data.

We then contrasted two tools for analysing the text responses: Q’s text analysis component and Google Cloud Natural Language. We chose these tools because each can be easily applied to survey data, but they are based on different analytic principles.

Analysing our text data yielded learnings both about the associations with work and leisure in two contrasting cultures and about the understanding that analytics tools can bring to those associations.

When it comes to work, some of the strong associations are not particularly surprising. When the associations are drawn as a network of linked words, the thickest line is between “money” and “time” because these are the two terms most often mentioned together. To many people, “work” means simply “time spent earning money.” However, other associations are less predictable. For example, “hard” is linked to “boring,” as we might expect, but also to “fun.” This may be initially surprising, but it indicates the pleasure that can come from the challenge of hard work. The word with the strongest relationship with work satisfaction is “boring”: those using this word report much lower than average satisfaction. Among the survey participants who do not use the word “boring,” the next most powerful word is “stress.” Those who mention “stress” but not “boring” report a higher level of happiness.
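The idea of link strength as “how often two words are mentioned together” can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used by either tool: the function name `cooccurrence_counts` and the example responses are made up, and we assume each response has already been reduced to a list of words.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(responses):
    """Count how many responses mention each unordered pair of words.

    `responses` is a list of word lists, one per participant. Each pair is
    stored as an alphabetically sorted tuple so ("time", "money") and
    ("money", "time") are counted as the same link.
    """
    pair_counts = Counter()
    for words in responses:
        unique_words = sorted({word.lower() for word in words})
        for pair in combinations(unique_words, 2):
            pair_counts[pair] += 1
    return pair_counts


# Made-up responses, for illustration only (not survey data).
example_responses = [
    ["money", "time"],
    ["money", "time", "hard"],
    ["hard", "fun"],
    ["hard", "boring"],
]

counts = cooccurrence_counts(example_responses)
strongest_pair, strongest_count = counts.most_common(1)[0]
```

In this toy data, the pair with the highest count would be drawn with the thickest line.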

We conducted a similar analysis around leisure time, relating the scalar score that participants gave for their happiness with the amount of leisure time available to them to their unprompted associations with leisure time. It is immediately apparent that there are far fewer frequent associations in the top 10: people are much more varied in what they do and, therefore, associate many more words with the concept of “leisure.” As we might expect, “relaxing” and “fun” are mentioned frequently. “Friends” and “family” have a similar frequency, but the associations are different. “Family” has very strong links to both “fun” and “relaxing,” whereas “friends” is more weakly associated with these key features of leisure time, perhaps the opposite of what we would expect. “Walking,” “reading,” and “TV” are all in the top 10 and have stronger links with each other than with “relaxing,” “fun,” or “enjoyment.” This exposes a difference in the way people respond: some provide emotions or feelings, and others specify activities.
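One simple way to relate a scalar score to unprompted words is to compare the average score of participants who mention a word with the average of those who do not. The Python sketch below shows that comparison under stated assumptions: the function name `mean_score_gap`, the record layout, and the example values are all hypothetical, and this is not necessarily the statistic used in the study.

```python
from statistics import mean


def mean_score_gap(records, word):
    """Mean score of those who mention `word` minus the mean of those who do not.

    `records` is a list of (score, set_of_words) tuples, one per participant.
    Returns None when either group is empty, since no comparison is possible.
    """
    with_word = [score for score, words in records if word in words]
    without_word = [score for score, words in records if word not in words]
    if not with_word or not without_word:
        return None
    return mean(with_word) - mean(without_word)


# Made-up records, for illustration only (not survey data).
example_records = [
    (9, {"family", "fun"}),
    (8, {"family", "relaxing"}),
    (5, {"tv"}),
    (6, {"reading", "tv"}),
]

family_gap = mean_score_gap(example_records, "family")
```

Repeating the comparison within the group that does not mention the strongest word gives the kind of “next most powerful word” split described for the work data.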

We also found some surprising differences between the output of the two tools and between the text analysis metrics and scalar data. For example, the two approaches give different results when applied to the open-ended text from the UK: Q gives “leisure” a higher score than “work,” whereas Google does the opposite. In India, the scores are closer together, which reflects the general learnings from the study: work is generally more positively viewed in India than in the UK.

We concluded by discussing some of the key contemporary themes in text analytics and the likely future role of this method within market research and insight. We believe there is a clear need for some consensus on the processes and features of an optimum text analytic tool for insight researchers. We have shown that different analytic tools can give different results when applied to the same data set, and we encourage researchers across the industry to do more testing and exploration, as we have in this study. We also concluded that there will almost certainly be some consolidation within the market of tools available: there is probably insufficient room for all of them, and more comparative evaluation should mean that the more sophisticated and accurate tools survive.

