Weighting

Introduction

Weighting is a data processing technique that lets researchers change how much a particular respondent’s answers count relative to the rest of the responses in a dataset. Researchers use it to correct for sampling imbalances, adjusting the data so that it is more representative of the target population.

Examples:

This technique is commonly used to make a dataset more closely match Census figures. For example, suppose a polling company has just run a survey of n=2,000 registered voters and discovers some skew in the age groups:

[Figure: age-group percentages in the sample compared with the target Census percentages]

The company therefore needs to create a weight that increases the influence of the underrepresented age groups (like 35–44 year olds) and decreases the influence of the overrepresented age groups (like 18–24 year olds). This is the method they use to create their weight:

[Figure: weight calculation for each age group]
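As a rough illustration, one common way to build a weight like this is to divide each group's target share by its share of the sample. The sketch below assumes that approach; it is not necessarily the exact calculation the polling company used. The respondent counts and target shares are hypothetical (they are not actual Census figures), and `compute_weights` is a name made up for this example.

```python
# Hypothetical respondent counts per age group (n = 2,000 in total).
sample_counts = {
    "18-24": 300,
    "25-34": 360,
    "35-44": 280,
    "45-54": 340,
    "55-64": 340,
    "65+": 380,
}

# Hypothetical target shares; illustrative only, not actual Census data.
target_shares = {
    "18-24": 0.12,
    "25-34": 0.18,
    "35-44": 0.17,
    "45-54": 0.16,
    "55-64": 0.17,
    "65+": 0.20,
}


def compute_weights(sample_counts, target_shares):
    """Return one weight per group: target share divided by sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        weights[group] = target_shares[group] / sample_share
    return weights


weights = compute_weights(sample_counts, target_shares)
for group, weight in weights.items():
    print(f"{group}: {weight:.2f}")
```

With these made-up numbers, the overrepresented 18–24 group gets a weight below 1 and the underrepresented 35–44 group gets a weight above 1, which is the direction of adjustment described above.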

When they apply the weight in their data analysis tool and check the age groups, the percentage for each group now matches its target Census percentage. The weights in this example are well behaved because they stay close to 1.

Very large weights can skew the data, because a handful of respondents end up driving the results, so it is important to keep track of how large your weights are. Weights can become exceptionally large when several variables are combined into a single weight (for example, weighting on age, gender, region, and ethnicity at once). The more variables a weight accounts for, the greater the chance that some weights become too large.
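Both points can be checked with a short sketch. The first part applies target-share-over-sample-share weights and confirms that the weighted percentages land on the targets. The second part shows how weights can compound, under the simplifying assumption that a respondent's overall weight is the product of one adjustment ratio per variable; real analysis tools may combine variables differently. All counts, shares, and ratios are hypothetical.

```python
from math import prod

# Hypothetical counts and target shares (not actual Census data).
sample_counts = {"18-24": 300, "25-34": 360, "35-44": 280,
                 "45-54": 340, "55-64": 340, "65+": 380}
target_shares = {"18-24": 0.12, "25-34": 0.18, "35-44": 0.17,
                 "45-54": 0.16, "55-64": 0.17, "65+": 0.20}

total = sum(sample_counts.values())

# Weight per group: target share divided by sample share.
weights = {g: target_shares[g] / (sample_counts[g] / total)
           for g in sample_counts}

# Weighted count per group, then each group's share of the weighted total.
weighted_counts = {g: weights[g] * sample_counts[g] for g in sample_counts}
weighted_total = sum(weighted_counts.values())
weighted_shares = {g: weighted_counts[g] / weighted_total
                   for g in sample_counts}

for g in sample_counts:
    print(f"{g}: weighted {weighted_shares[g]:.1%}, "
          f"target {target_shares[g]:.1%}")

# Simplifying assumption: the overall weight is the product of one
# hypothetical adjustment ratio per weighting variable.
ratios = {"age": 1.2, "gender": 1.3, "region": 1.5, "ethnicity": 2.0}
combined_weight = prod(ratios.values())
print(f"combined weight: {combined_weight:.2f}")
```

Each individual ratio here is modest, yet the combined weight is larger than any one of them, which is why adding variables tends to push some weights further from 1.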

A recent example is highlighted in an article describing how one man, because of his particular demographic grouping and voting preferences, was “weighted as much as 30 times more than the average respondent, and as much as 300 times more than the least-weighted respondent.”

Thus, it is important to always review any weighting variables before moving forward with analysis, and to stay alert to the effects of weighting when reviewing and analyzing data.
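A review like this can be as simple as summarizing the weights and flagging any that look unusually large. The sketch below is one minimal way to do that. The respondent-level weights are hypothetical, the helper names `summarize_weights` and `flag_large_weights` are invented for this example, and the cutoff of 5.0 is an arbitrary illustrative value rather than a recommended standard.

```python
def summarize_weights(weights):
    """Return basic diagnostics for a list of respondent weights."""
    mean_weight = sum(weights) / len(weights)
    return {
        "min": min(weights),
        "max": max(weights),
        "mean": mean_weight,
        # How many times larger the biggest weight is than the average.
        "max_to_mean": max(weights) / mean_weight,
        # How many times larger the biggest weight is than the smallest.
        "max_to_min": max(weights) / min(weights),
    }


def flag_large_weights(weights, threshold):
    """Return the positions of weights above the chosen threshold."""
    return [i for i, w in enumerate(weights) if w > threshold]


# Hypothetical respondent-level weights; the last one is an outlier.
respondent_weights = [0.5, 0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 6.5]

summary = summarize_weights(respondent_weights)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# 5.0 is an arbitrary cutoff chosen only for illustration.
flagged = flag_large_weights(respondent_weights, threshold=5.0)
print("respondents to review:", flagged)
```

The two ratios mirror the comparisons in the quoted example: the largest weight against the average respondent, and the largest weight against the least-weighted respondent.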