We recently worked with a new entrant into the healthy snack foods business who wanted to understand the market they were getting into. Specifically, they wanted to know the following:
With these goals in mind, a survey was created using SurveyMonkey and distributed to the new entrant's target market via their newsletter (Protip: before you launch a product, build up a mailing list. It works to 1) validate your idea and 2) tighten the feedback loop on product decisions). A telemarketing service was also employed to call people and ask them the same questions. These responses were transcribed and sent to Repustate so the same analysis could be performed.
OK, so that's the easy part; thousands of responses were collected. But the responses were what the market research industry calls "open-ended", meaning they were free-form text as opposed to a multiple choice list. The reason was that this brand didn't want to introduce any bias into the survey by prompting the respondents with possible answers. For example, take question #2 from above. If presented with a list of snacks, the respondent might say to themselves "Oh yeah, I forgot about brand X" and check it off as a snack they think of, when in reality that brand's product had slipped off their radar. Open-ended responses test how deeply ingrained a particular idea, concept or, in this case, brand is within a person's consciousness.
But open-ended responses pose a challenge - how do you data mine them en masse and aggregate the results to come up with something meaningful? If you have a few hundred responses to read, maybe you hire a few interns. But what about when you have tens of thousands? That's where Repustate comes in.
Fortunately, SurveyMonkey has a pretty simple-to-use API. Combined with Repustate's even easier-to-use API, you can go from open-ended response to data mined text in seconds. Here's a code snippet that provides a good blueprint for how one can marry these two APIs together. While some details have been omitted, it should be relatively straightforward to adapt it to suit your needs:
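As a rough sketch of that pattern (an illustration, not the exact code used on this project), the function below takes the open-ended answers and runs each one through an entity extractor. Everything named here is an assumption: the response shape (`id` and `text` keys), the `extract_entities` callable, and the category labels in the toy extractor are hypothetical stand-ins. In a real pipeline, the responses would come from the SurveyMonkey API and `extract_entities` would wrap a call to Repustate's named entities API; check each vendor's documentation for the actual endpoints and parameters.

```python
from typing import Callable, Dict, Iterable, List

# Assumed shapes (hypothetical): a response is {"id": ..., "text": ...} and
# the extractor returns a mapping of entity -> category.
Response = Dict[str, str]
Entities = Dict[str, str]


def mine_responses(
    responses: Iterable[Response],
    extract_entities: Callable[[str], Entities],
) -> List[dict]:
    """Run every non-empty open-ended answer through an entity extractor."""
    results = []
    for response in responses:
        text = response.get("text", "").strip()
        if not text:
            continue  # blank answers carry no signal, so skip them
        results.append({
            "response_id": response["id"],
            "text": text,
            "entities": extract_entities(text),
        })
    return results


# Toy stand-in for the real API call: a keyword lookup with made-up
# category labels, used only so the sketch runs without network access.
KNOWN_TERMS = {"carrots": "vegetable", "celery": "vegetable", "hummus": "dip"}


def toy_extractor(text: str) -> Entities:
    lowered = text.lower()
    return {term: cat for term, cat in KNOWN_TERMS.items() if term in lowered}


if __name__ == "__main__":
    sample = [{"id": "r1", "text": "Carrots and some hummus"}]
    for row in mine_responses(sample, toy_extractor):
        print(row["response_id"], row["entities"])
```

Passing the extractor in as a callable keeps the loop independent of any one vendor's client library, so the same code can be exercised with a stub before any API keys are involved.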
So with very few lines of Python code, we've grabbed the open-ended responses, processed them through the named entities API call, and can store the results in our backend of choice. Let's take a look at a sample response and see how Repustate categorized it.
Q: If you were having friends or family over for a casual get together (summer BBQ, catching up etc.) what kinds of snacks would you serve?
A: I usually put veggies out, like carrots, celery, cucumbers etc. etc. and maybe some dip like hummus and crackers.
Running that response through the Repustate API yields this information:
Armed with this analysis, we then aggregated the results to see which categories of food and which brands were being mentioned most frequently. This helped our client understand who they were competing against.
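The aggregation step can be sketched with Python's `collections.Counter`. This is a minimal illustration that assumes each response has already been reduced to a mapping of entity to category; that shape, the function name `tally_mentions`, and the sample labels are hypothetical and not necessarily what the Repustate API returns.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tally_mentions(
    entity_maps: Iterable[Dict[str, str]],
) -> Tuple[Counter, Counter]:
    """Count how many responses mention each entity and each category."""
    entity_counts: Counter = Counter()
    category_counts: Counter = Counter()
    for entities in entity_maps:
        # Dict keys are unique, so each entity counts once per response.
        entity_counts.update(entities.keys())
        # Use a set so a category counts once per response, even when
        # several entities in that response share it.
        category_counts.update(set(entities.values()))
    return entity_counts, category_counts


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"carrots": "vegetable", "celery": "vegetable", "hummus": "dip"},
        {"carrots": "vegetable"},
    ]
    entity_counts, category_counts = tally_mentions(sample)
    print(entity_counts.most_common(3))
    print(category_counts.most_common(3))
```

Counting per response rather than per mention is a design choice: it answers "how many people brought this up" instead of rewarding a single respondent who lists five vegetables.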
As it turns out, it was plain old vegetables that were the biggest competition to this new entrant, which is a double-edged sword. On the one hand, it means they don't have to spend the marketing dollars to compete with an entrenched incumbent who dominates most of the shelf space in supermarkets. On the other hand, it's a troubling place to be in because vegetables are well known, cheap, and viewed as healthy (obviously).
We're fortunate to be living in a time when so much data is at our disposal, ready to be sliced & diced. We're also cursed because there's so much of it! We need the right tools and a clear mind to handle these sorts of problems, but it's possible.