Market research surveys are a great tool, but they pose a challenge: how do you data mine the responses? Multiple-choice questions are easy to analyze, but they lack the nuance and authenticity that often surface in open-ended responses.

Asking the right questions

Repustate worked with a new entrant into the healthy snack foods business who wanted to understand the market they were getting into. Specifically, they wanted to know the following:

  1. Which foods do people currently eat as their “healthy snack”?
  2. Which brands do consumers think of when they hear the word “snack”?
  3. Was there anything about the current selection of snack foods that consumers didn’t like?
  4. If you were having friends or family over for a casual get together (summer BBQ, catching up etc.) what kinds of snacks would you serve?

With these goals in mind, a survey was created using SurveyMonkey and distributed to the new entrant’s target market via their newsletter. A telemarketing service was also employed to call people and ask them the same questions. The phone responses were transcribed and sent to Repustate so the same analysis could be performed on them.

OK, so that’s the easy part; thousands of responses were collected. But the responses were what the market research industry calls “open ended”, meaning they were free-form text rather than selections from a multiple-choice list. The reason was that this brand didn’t want to introduce any bias into the survey by prompting respondents with possible answers. Take question #2 from above, for example. If presented with a list of brands, a respondent might think “Oh yeah, I forgot about brand X” and check it off as a brand they think of, when in reality that brand had slipped off their radar. Open-ended responses test how deeply ingrained a particular idea, concept or, in this case, brand is within a person’s consciousness.

But open-ended responses pose a challenge: how do you data mine them en masse and aggregate the results into something meaningful? If you have a few hundred responses to read, maybe you hire a few interns. But what about when you have tens of thousands? That’s where Repustate comes in.

Leveraging multiple APIs

Fortunately, SurveyMonkey has a pretty simple-to-use API. Combined with Repustate’s even easier-to-use API, Repustate could data mine the open-ended responses in seconds. Let’s take a look at a sample response and see how Repustate categorized it.

Q: If you were having friends or family over for a casual get together (summer BBQ, catching up etc.) what kinds of snacks would you serve?

A: I usually put veggies out, like carrots, celery, cucumbers etc. etc. and maybe some dip like hummus and crackers. If my sons friends come over, it's usually Doritos.

Running that response through the Repustate entity extraction API call yields the following (excerpted):

    "themes": [
    "entities": {
        "celery": "food.vegetable",
        "crackers": "food.other",
        "carrots": "food.vegetable",
        "cucumbers": "food.fruit",
        "hummus": "food.other",

Armed with this analysis, we then aggregated the results to see which categories of food and which brands were being mentioned most frequently. This helped our client understand who they were competing against.
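To make the aggregation step concrete, here is a minimal sketch in Python. It is not Repustate’s actual code; it assumes each response has already been reduced to an entity-to-category mapping shaped like the sample output above. The function name `aggregate_entities` and the second response in the sample data are made up for illustration.

```python
from collections import Counter


def aggregate_entities(responses):
    """Count how many responses mention each entity and each category.

    `responses` is a list of dicts mapping entity -> category, shaped
    like the "entities" object in the sample output (an assumption).
    """
    entity_counts = Counter()
    category_counts = Counter()
    for entities in responses:
        # Dict keys are unique, so each entity counts once per response.
        entity_counts.update(entities.keys())
        # Use a set so a category counts once per response, even when
        # several entities in that response share it.
        category_counts.update(set(entities.values()))
    return entity_counts, category_counts


responses = [
    {  # entities taken from the sample response above
        "celery": "food.vegetable",
        "crackers": "food.other",
        "carrots": "food.vegetable",
        "cucumbers": "food.fruit",
        "hummus": "food.other",
    },
    {  # made-up second response, for illustration only
        "carrots": "food.vegetable",
        "hummus": "food.other",
    },
]

entity_counts, category_counts = aggregate_entities(responses)
print(category_counts.most_common())  # categories, most mentioned first
print(entity_counts.most_common(3))   # top three individual entities
```

Counting each category once per response, rather than once per entity, keeps a single respondent who lists ten vegetables from outweighing ten respondents who each mention one.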


As it turns out, plain old vegetables were the biggest competition for this new entrant, which is a double-edged sword. On the one hand, it means they don’t have to spend marketing dollars competing with an entrenched incumbent that dominates most of the shelf space in supermarkets. On the other hand, it’s a troubling place to be, because vegetables are well known, cheap, and (obviously) viewed as healthy. But now the client knew where to position itself in the market and, specifically, how to direct its ad spend during product rollout.