Survey Best Practices

Once you know the business problems that need to be addressed in your survey, writing the questions can be challenging for even the most experienced researchers. We’ve compiled some of our best tips and tricks to help you write a successful survey that captures actionable insights for you and your stakeholders.

Use Randomization to Control for Order Bias

Order bias is the bias in research data that results from the tendency for respondents to click items shown at the top of a list, as these are the first items they see.

All survey questions are prone to the effects of order bias. This is especially true for long option lists of 8 or more items (7 or fewer is ideal), as respondents are unlikely to read through the entire list and select every option that applies to them.

  • Acquiescence bias can also magnify the effects of order bias when the positive options are fixed at the top of the list.

To control for order bias, the easiest solution is to randomize the list. When the order of the list is important, the list can be randomly reversed (so the list will randomly show in ascending/descending order).

  • However, it is important that this random reversal is applied consistently across all the reversed lists in the survey. Respondents can become confused if they see a positive-to-negative scale in one question and a negative-to-positive scale in the next. Maintaining a consistent direction makes it easy for respondents to understand each question and choose the best answer.
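As a minimal sketch of how synchronized reversal might be implemented (the helper function and the per-respondent coin flip are illustrative assumptions, not any particular survey platform's API):

```python
import random

def orient_scales(scales, rng=None):
    """Reverse ALL scale lists together, or none of them, so every
    scale a given respondent sees runs in the same direction."""
    rng = rng or random.Random()
    reverse = rng.random() < 0.5  # one coin flip per respondent, not per question
    return [list(reversed(s)) if reverse else list(s) for s in scales]

satisfaction = ["Very dissatisfied", "Dissatisfied", "Neutral",
                "Satisfied", "Very satisfied"]
agreement = ["Strongly disagree", "Disagree", "Neutral",
             "Agree", "Strongly agree"]

# Both scales are reversed together or shown as-is together, never mixed.
oriented = orient_scales([satisfaction, agreement])
```

The key design point is that the randomness is drawn once per respondent and reused for every scale, rather than drawn independently per question.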

However, there are always option lists where the list must remain static. In such situations, it is important when analyzing the data to remember that order bias can be an issue.

Here is an example of where randomizing can be helpful:

When you go shopping for shirts, what one color do you always look for?

  • Purple
  • Green
  • Black
  • White
  • Blue
  • Red
  • I don’t look for a particular color

The main list of colors should be randomized to control for order bias. However, “I don’t look for a particular color” should be anchored at the bottom and excluded from the randomization. Including it in the randomization can confuse respondents: those who know they don’t look for a particular color will not expect to find a relevant option mixed in among the colors. Because it is mutually exclusive with the rest of the list, it should be anchored at the bottom.
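This kind of randomization can be sketched in a few lines of Python. The helper below is an illustrative assumption, not a real survey platform's API; it shuffles the main options while holding any exclusive options fixed at the bottom:

```python
import random

def randomize_with_anchor(options, anchored, rng=None):
    """Shuffle the main option list, but keep exclusive options
    (e.g. "I don't look for a particular color") fixed at the bottom."""
    rng = rng or random.Random()
    body = [o for o in options if o not in anchored]
    rng.shuffle(body)
    return body + [o for o in options if o in anchored]

colors = ["Purple", "Green", "Black", "White", "Blue", "Red",
          "I don't look for a particular color"]
shown = randomize_with_anchor(colors, {"I don't look for a particular color"})
# The exclusive option is always last; the colors above it are shuffled.
```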

Use Randomization and Neutral Language to Control for People Pleasers

There is a known tendency for respondents to agree with research questions or statements, and the bias that results from this behavior is called acquiescence bias.

This can result in data that skews towards positive responses, so researchers will not have an accurate idea of what percentage of the population would select the neutral or negative responses. This can happen with any question that includes both positive and negative options.

Here is an example of two versions of the same question that can be impacted by acquiescence bias:

How likely would you be to buy a toothbrush that automatically dispenses toothpaste with the press of a button?

  • Very likely
  • Somewhat likely
  • Neutral
  • Somewhat unlikely
  • Very unlikely

Would you like to buy a toothbrush that automatically dispenses toothpaste with the press of a button?

  • Yes
  • No
  • Maybe

To combat this bias, it can be helpful to vary the order of the options. If the positive options are not always at the top, order bias has less ability to compound the effect of the acquiescence bias.

Another way to control for acquiescence bias is to keep the language as neutral as possible, so respondents do not feel influenced by the language to respond a particular way.

Watch for Questions that have More than One Question (and Answer)!

Despite growing needs for insights into consumer perceptions and behaviors, respondent attention spans have not grown to match; if anything, they have shrunk, and recommended survey lengths have shrunk with them. Ideally, a survey should last no more than 15 minutes.

With budgets and survey lengths shrinking, it may be tempting to include double-barreled questions: questions that ask for information on more than one question at once (or have answers that provide information on more than one question). However, such questions can confuse respondents, which results in poor data quality, so it is best to avoid double-barreled questions whenever possible.

Here is an example of a double-barreled question, with a triple-barreled list of options:

Do you have a big-screen TV, and what brand is it?

  • Yes, Samsung
  • Yes, LG
  • Yes, Toshiba
  • No
  • No, but I used to
  • No, and I don’t want one

The answer list provided is trying to capture 3 different answers: whether respondents have a big-screen TV, what brand the TV is, and why some respondents don’t have a big-screen TV.

This is confusing for respondents, as it’s difficult to identify the one answer combination that applies to them. Another difficulty with this question type is that the answer list does not capture all possible responses, so respondents could be forced to choose an answer that does not reflect their true feelings.

Thus, data from such questions is not going to be as high quality as it would be if respondents only provided one answer at a time to one question. Whenever possible, it is best to only ask one question, and only show the options that answer that one question. If logic is needed based on particular answers, such notes can be included in the survey design, and added when the survey is programmed.
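As a rough sketch of how such design notes might translate into skip logic when the survey is programmed, the double-barreled TV question can be split into single-answer questions. The dictionary-based question format below is an invented illustration, not any real survey platform's schema:

```python
# Each question asks exactly one thing; "skip" routes respondents to the
# appropriate follow-up based on their single answer.
questions = {
    "Q1": {
        "text": "Do you have a big-screen TV?",
        "options": ["Yes", "No"],
        "skip": {"Yes": "Q2", "No": "Q3"},
    },
    "Q2": {
        "text": "What brand is your big-screen TV?",
        "options": ["Samsung", "LG", "Toshiba", "Other (please specify)"],
    },
    "Q3": {
        "text": "Which best describes you?",
        "options": ["I used to have one", "I have never had one",
                    "I don't want one"],
    },
}

def next_question(qid, answer):
    """Return the id of the follow-up question, or None if there is none."""
    return questions[qid].get("skip", {}).get(answer)
```

Each respondent now answers one question at a time, and the routing captures the same information the triple-barreled list was straining for.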

Limit the Use of Free Response Questions

Unstructured questions (also known as free response questions) ask respondents to provide answers in their own words into a text box.

Since answers in respondents’ own words are qualitative in nature, and can come in many different forms, they are difficult to categorize and analyze in a quantitative manner. As such, this question type is best suited for qualitative research, where there is a moderator or interviewer to probe into participants’ initial statements to get at the core assumptions at the foundation of their ideas.

  • Qualitative research (such as focus groups, ethnographies, shop alongs, etc.) is used to explore initial reactions, ideas, and concepts associated with new products or services.
  • Quantitative research (such as online surveys, phone surveys, etc.) is best used after the qualitative research has established the foundational issues for the business problem at hand. It is meant to quantify these ideas and concepts on a macro level, to understand their prevalence within the consumer audience.

For quantitative research, it’s best to keep most questions closed-ended. Respondents tire of answering too many free-response questions, and “Why” questions in particular usually yield poor responses. Think about asking someone why they like the green version of a product over other colors, for example. You are likely to get responses like “I just do” or “I don’t know.” These responses are valid, but obviously not very fleshed out.

Now imagine asking the respondent another free-response question, such as “What else do you like about the green version?” With each additional free response, you will get lower-quality answers (e.g. “I don’t know”) or unusable ones (“Why are you asking”, “bbbbb”, or expletives).

A better approach is to provide a list and include an “Other, please specify” option. For example:

What do you like about the green version of the product? Please select all that apply:

  • It looks more flattering
  • It looks more professional
  • It reminds me more of nature
  • It has my favorite color
  • Other, please specify

This format makes writing in an answer voluntary rather than demanded, so you will get better-quality free responses. You are also prompting respondents to give more than one-word answers, again increasing the quality. Finally, you are making the answers easier to code, since most will fall into the listed categories anyway.
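As a small illustration of why this format eases coding (the sample responses below are invented), a simple tally can count the listed options directly while grouping all “Other” write-ins under one code, leaving the verbatim text available for qualitative review:

```python
from collections import Counter

# Invented sample data: each inner list is one respondent's select-all answers.
responses = [
    ["It has my favorite color", "It reminds me more of nature"],
    ["It looks more professional"],
    ["It has my favorite color", "Other: it matches my car"],
]

tally = Counter()
for answer_set in responses:
    for choice in answer_set:
        # Collapse every write-in under a single "Other" code for the
        # quantitative tally; the raw text can still be reviewed separately.
        key = "Other, please specify" if choice.startswith("Other:") else choice
        tally[key] += 1
```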

Design Rating Scales Based on Research Needs and Respondent Fatigue

Rating scales are designed to quantify the presence and magnitude of attitudes and emotions.

Likert scales are the most popular type of scale. In their most traditional form, Likert scales are a 5-point scale, where 1 is the lowest point on the scale, and 5 is the highest point on the scale. Traditionally, a Likert scale is bipolar, meaning that it captures both extremes of an attitude (strongly agree to strongly disagree), and it is an odd-point scale, meaning that it allows for neutral responses.

Below are some examples of various types of scales that measure satisfaction:

  • Odd-point (has 5 points, but could also have 3, 7, 9, 11 points), Bipolar Scale (Measures the strength of two variables, satisfaction and dissatisfaction). This is the traditional Likert scale:

    • Very dissatisfied
    • Dissatisfied
    • Neither satisfied nor dissatisfied
    • Satisfied
    • Very satisfied
  • Odd-point (has 5 points, but could also have 3, 7, 9, 11 points), Unipolar Scale (Measures the strength of one variable, satisfaction):

    • Not at all satisfied
    • Slightly satisfied
    • Moderately satisfied
    • Very satisfied
    • Extremely satisfied
  • Even-point (has 4 points, but could also have 6, 8, 10 points), Bipolar Scale (Measures the strength of two variables, satisfaction and dissatisfaction):

    • Very dissatisfied
    • Dissatisfied
    • Satisfied
    • Very satisfied
  • Even-point (has 4 points, but could also have 6, 8, 10 points), Unipolar Scale (Measures the strength of one variable, satisfaction):

    • Not at all satisfied
    • Slightly satisfied
    • Satisfied
    • Very satisfied

When choosing which scales to use, it can be helpful to consider these factors:

  • Even vs. Odd: If it’s possible that respondents could have a neutral opinion, then it’s best to use an odd-point scale. Forcing respondents to have an opinion when they don’t can be frustrating for them. However, if they’re unlikely to have a neutral opinion, or if it’s absolutely critical for your decision-making process that respondents have an opinion, then using an even-point scale would be best.
  • Bipolar vs. Unipolar: Unipolar scales should be used when you need to measure the strength of one variable (such as satisfaction, influence, importance, etc.). Bipolar scales should be used when you need to measure the strength of opposing ends of a variable (such as satisfaction vs. dissatisfaction, positive vs. negative, agree vs. disagree, etc.)
  • Number of scale points: The fewer the scale points, the easier it is for respondents to choose the point that most closely resembles their experience. However, it can be important to have more nuanced responses for analysis, such as knowing whether there are differences in responses between the top 2 points in the scale and the top 3 points in the scale. Larger scales (such as 7, 8, 9, 10, or 11 points) are best suited for situations where that extra detail is important for analysis. However, these scales are more fatiguing for respondents, and can create more work for researchers, so they should be used with caution. 5-point scales are popular because they are relatively easy for respondents to navigate and for researchers to re-code and analyze.
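The re-coding mentioned above is often done as a “top-2-box” score, the share of respondents choosing one of the top two scale points. A minimal sketch in Python, with invented sample data:

```python
# Traditional 5-point bipolar Likert scale, ordered low to high.
SCALE = ["Very dissatisfied", "Dissatisfied",
         "Neither satisfied nor dissatisfied", "Satisfied", "Very satisfied"]

def top_n_box(responses, n=2):
    """Share of respondents choosing one of the top n scale points."""
    top = set(SCALE[-n:])
    return sum(r in top for r in responses) / len(responses)

# Invented sample responses: 3 of the 4 fall in the top 2 boxes.
answers = ["Satisfied", "Very satisfied", "Dissatisfied", "Satisfied"]
share = top_n_box(answers)
```

Changing n lets the researcher compare top-2 against top-3 scores on the same data, which is exactly the nuance a larger scale is meant to support.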

Re-Read Questions to Catch Implicit Alternatives

An implicit alternative is an alternative alluded to, but not explicitly articulated or defined, in a question text or statement.

Here is an example of a question text that includes an implicit alternative:

Do you prefer to buy shoes at the mall?

  • Yes
  • No

The question asks for preferences for buying shoes, but it does not explicitly ask what activity or product is being passed over in favor of buying shoes. Respondents must imagine what comparison they should make here, so an answer of “Yes” or “No” could mean many different things for many different people.

  • Some people may say “No” thinking that they prefer to buy pants at the mall, or they prefer to buy shoes online.
  • Others may say “Yes” thinking that they prefer to buy shoes at the mall rather than at a stand-alone store, or they prefer to buy shoes at a mall and socks online.

Whenever possible, it is important to provide concrete examples and comparisons so respondents have the full context of what the researcher wants them to assess.

A shoe manufacturer might be more interested if the question changed to this:

When given a choice, where do you prefer to buy shoes?

  • Online
  • In the mall
  • At a stand-alone store
  • Other (please specify)

An executive for a mall might be more interested if the question changed to this:

When you’re at Valley View Mall, what items are you most likely to buy?

  • Dresses
  • Tops
  • Skirts
  • Pants
  • Shoes
  • Accessories
  • Other (please specify)

See the Survey from a Respondent Perspective

Throughout the survey design process, and especially when the survey design is considered complete, it is important to consider the respondent experience. The longer and more complex a survey is, the more it taxes respondents’ cognitive abilities. This mental fatigue that results from taking a survey is called survey fatigue.

There are many factors that can fatigue respondents. Grid questions and free-response questions (especially ones that require complex and detailed answers) are the most taxing question types. Surveys with multiple grids and/or free responses will have poorer quality answers as respondents progress through the survey. In fact, the quality of a free response decreases the closer it is to the end of the survey.

  • Wherever possible, it is best to provide lists of options for respondents to choose from, and when grids are needed, to keep the lists and columns short and easy to understand. The recommended maximum length for a list is 7 items.
  • When it comes to survey length, the recommended maximum is 15 minutes. Researchers and their clients may want to go up to 30, 40, or 50 minutes, but this places too much cognitive burden on respondents, whose ability to dedicate full attention and provide quality responses steadily diminishes over time.

When in doubt, researchers should consider whether they would want to take the survey. If a researcher reviews a survey and thinks she would drop out or start providing poor answers, she needs to revise the survey until it’s something she would want to take herself.