This article describes how to create weights from variable(s) in Q. Weighting is a technique which adjusts the results of a survey to bring them into line with some known characteristics of the population. For example, if a sample contains 40% males and the population contains 49% males, weighting can be used to adjust the data to correct for this discrepancy.
- Additional techniques
- Combining categories in an adjustment variable
- Creating two-dimensional weights
- Recomputing weights for each category in a variable (e.g. wave-based weighting)
- Making weights sum to sample size or target values
- Selecting a design weight variable
- Specifying upper and lower bounds for weights & specifying alternative weighting algorithms
- Saving a weight variable
- Applying weights
- Troubleshooting errors in weights
- One or more variables in your data set that you want to use as adjustment variables. These variables can be categorical or numeric.
- The targets for each category in the adjustment variable(s). These can be percentages or population counts.
- Pick One or Number questions that can be used as adjustment variables. You cannot use multi-choice questions.
- Familiarity with weighting survey data. See: How to Weight a Survey
Specifying a single adjustment variable
- Go to the Variables and Questions tab, right-click, and select Insert Variable(s) > Weight
- Use the Adjustment Variable(s) drop-down menu to select the categorical or numeric variable you want to weight by. Then specify whether your targets are Percentages or Population/Count (see Making weights sum to sample size or target values).
If your weight targets are Percentages, they must sum to 100. Selecting Population/Count indicates that the target numbers are populations (also known as counts). If Population/Count targets are used across multiple adjustment variables, every adjustment variable must have the same total population/count.
- Click the button once it becomes enabled to add the new weight to your data set.
- If the button is greyed out or disabled, check the Weight diagnostics at the bottom right for any errors or issues.
Using targets based on percentages
If your desired targets are 60% Male and 40% Female, you would specify 60 for "Male" and 40 for "Female".
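To make the effect of percentage targets concrete, the following is a minimal Python sketch of the arithmetic behind a single-variable weight: each category's weight is its target share divided by its share of the sample. The sample counts and the function name `percentage_weights` are assumptions made for illustration; this is not Q's internal code.

```python
# Hypothetical sample of 100 respondents: 40 male, 60 female (assumed counts).
sample_counts = {"Male": 40, "Female": 60}
# Targets from the example above, entered as percentages.
target_percent = {"Male": 60.0, "Female": 40.0}


def percentage_weights(counts, targets):
    """Return one weight per category: target share / sample share."""
    if abs(sum(targets.values()) - 100.0) > 1e-9:
        raise ValueError("Percentage targets must sum to 100")
    n = sum(counts.values())
    weights = {}
    for category, count in counts.items():
        if count == 0:
            raise ValueError(f"No respondents in category {category!r}")
        weights[category] = (targets[category] / 100.0) / (count / n)
    return weights


weights = percentage_weights(sample_counts, target_percent)
# Weighted count per category; the weighted total equals the sample size.
weighted_counts = {c: weights[c] * sample_counts[c] for c in sample_counts}
print(weights)
print(weighted_counts)
```

In this sketch the under-represented group (males) receives a weight above 1 and the over-represented group a weight below 1.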
Using targets based on population / count
According to U.S. population estimates, in 2020 there were 162.83 million males and 166.24 million females, so you would enter 162,830,000 for "Male" and 166,240,000 for "Female".
Specifying additional adjustment variable(s)
If you want to weight by more than one variable, click the Additional adjustment variable set button and select the variable you want. When you add a second adjustment variable, Q switches to rim weighting (also known as raking). For example, to weight not only by Gender but also by how frequently respondents drink coffee, do the following:
- Add the desired variable, which in this case is Coffee:
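For readers who want to see what rim weighting (raking) does with two adjustment variables, here is a minimal Python sketch of the general technique: the weights are repeatedly scaled so that each variable's weighted distribution matches its targets, until the adjustments become negligible. The respondent data, the Coffee categories and targets, and the function names are hypothetical, and Q's own implementation may differ in its details.

```python
def rake(rows, targets, max_iter=500, tol=1e-8):
    """Rake weights so each variable's weighted shares match its targets.

    rows: list of dicts mapping variable name -> category.
    targets: {variable: {category: target proportion}}.
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, var_targets in targets.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for row, w in zip(rows, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            grand = sum(totals.values())
            # Scale each category so it hits its target share.
            factors = {c: var_targets[c] * grand / totals[c] for c in totals}
            max_change = max(max_change,
                             max(abs(f - 1.0) for f in factors.values()))
            weights = [w * factors[row[var]] for row, w in zip(rows, weights)]
        if max_change < tol:
            return weights
    raise RuntimeError("Raking did not converge")


def weighted_share(rows, weights, var, category):
    """Weighted proportion of respondents in one category."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Hypothetical sample of 10 respondents.
rows = (
    [{"Gender": "Male", "Coffee": "Daily"}] * 3
    + [{"Gender": "Male", "Coffee": "Less often"}] * 1
    + [{"Gender": "Female", "Coffee": "Daily"}] * 2
    + [{"Gender": "Female", "Coffee": "Less often"}] * 4
)
# Assumed targets, expressed as proportions.
targets = {
    "Gender": {"Male": 0.6, "Female": 0.4},
    "Coffee": {"Daily": 0.5, "Less often": 0.5},
}
weights = rake(rows, targets)
print(weighted_share(rows, weights, "Gender", "Male"))
print(weighted_share(rows, weights, "Coffee", "Daily"))
```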
In the lower-right of the window, additional information about the weight is displayed next to the button:
These values are as follows:
- Minimum: Minimum weight value
- Maximum: Maximum weight value
- Maximum / Minimum: the ratio between the maximum and the minimum
- Effective Sample Size: Effective Sample Size (ESS) is an estimate of the size of a simple random sample that would achieve the same level of precision as the weighted sample.
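The diagnostics listed above can be reproduced from any list of weights. The sketch below uses Kish's approximation for the effective sample size, (sum of weights)² / (sum of squared weights). That Q uses exactly this formula is an assumption, and the example weights are made up.

```python
def weight_diagnostics(weights):
    """Summarise a list of weights with the four diagnostics described above."""
    minimum = min(weights)
    maximum = max(weights)
    total = sum(weights)
    total_squared = sum(w * w for w in weights)
    return {
        "minimum": minimum,
        "maximum": maximum,
        "max_over_min": maximum / minimum,
        # Kish's approximation (assumed; Q's exact formula may differ).
        "effective_sample_size": total * total / total_squared,
    }


example_weights = [0.5, 0.5, 1.0, 2.0]  # hypothetical weights
diagnostics = weight_diagnostics(example_weights)
print(diagnostics)

# With equal weights the effective sample size equals the actual sample size.
equal_diagnostics = weight_diagnostics([1.0] * 50)
```

The more the weights vary, the further the effective sample size falls below the actual sample size.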
Combining categories in an adjustment variable
You can drag and drop the categories of a variable together to combine them within the weight tool. For example, in the Coffee variable, if you wanted to combine 4 to 5 days a week with Every or nearly every day, simply:
- Click the Every or nearly every day category
- Drag and drop it on the 4 to 5 days a week category.
Note: your percentages will revert to 0 when you drag and drop one category onto another, so you will have to specify them again.
Creating two-dimensional weights
Within an adjustment variable set in the weighting tool, you can add further variables to create two-dimensional weights. Use the Select Adjustment Variable(s) drop-down to the right of where you selected your adjustment variable.
For example, if you wanted to weight your data according to Income by Gender, you would select both variables in the same adjustment variable set and enter a target for each combination of Income and Gender categories.
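As an illustration of what a two-dimensional weight means, the Python sketch below builds one cell per Income by Gender combination and gives each cell its own target. The income categories, respondent data, and targets are all hypothetical.

```python
from collections import Counter

# Hypothetical respondent data: one entry per respondent in each list.
income = ["Low", "Low", "Low", "Low", "Low",
          "High", "High", "High", "High", "High"]
gender = ["Male", "Male", "Female", "Female", "Female",
          "Male", "Female", "Female", "Female", "Female"]

# One target (in percent) for every Income x Gender combination.
cell_targets = {
    ("Low", "Male"): 30.0,
    ("Low", "Female"): 20.0,
    ("High", "Male"): 30.0,
    ("High", "Female"): 20.0,
}

cells = list(zip(income, gender))   # the combined cell for each respondent
cell_counts = Counter(cells)
n = len(cells)

# Weight for a cell = target share / sample share of that cell.
cell_weights = {
    cell: (pct / 100.0) / (cell_counts[cell] / n)
    for cell, pct in cell_targets.items()
}
respondent_weights = [cell_weights[cell] for cell in cells]
print(cell_weights)
```

Unlike weighting by Income and Gender as two separate adjustment variables, this approach needs a target, and some sample, in every combination of categories.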
Recomputing weights for each category in a variable (e.g. wave-based weighting)
In some cases, you want to calculate a weight based on group membership or time period. For example, in longitudinal or tracking studies when the data is collected in waves, it may be necessary to recompute the weight for each date period (week, month, etc.). That way each wave uses the same weighting targets, but each wave's weight is calculated separately.
By default, weights in Q are computed for the entire data set. Using the Recompute feature, you can enter the targets that will be used for each wave, rather than the total sample. To do this, use a Pick One question that has each "wave" as a category.
As with all weighting in Q, the weights are recalculated whenever new data is imported, so this option enables the weighting tool to automatically compute new weights for new waves as they appear in the data.
To recompute weights by category:
- From the Recompute weights for menu, select the grouping variable. In this example, the variable is called Interview Date
The categories of the variable selected in that box will be used to divide up the cases, and the weight will be calculated separately for respondents in each of those categories.
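The idea of recomputing a weight within each wave can be sketched in Python as follows: the same targets are applied to every wave, but the sample shares are taken from each wave separately. The wave labels, respondent data, and function name are hypothetical.

```python
from collections import Counter


def weights_by_wave(rows, targets):
    """Compute a single-variable weight separately within each wave.

    rows: list of (wave, category) tuples, one per respondent.
    targets: {category: target proportion}, shared by all waves.
    """
    wave_sizes = Counter(wave for wave, _ in rows)
    cell_sizes = Counter(rows)
    # Within a wave: weight = target share / that wave's sample share.
    return [
        targets[category] * wave_sizes[wave] / cell_sizes[(wave, category)]
        for wave, category in rows
    ]


# Hypothetical data: two waves with very different gender splits.
rows = (
    [("January", "Male")] * 3 + [("January", "Female")] * 1
    + [("February", "Male")] * 1 + [("February", "Female")] * 3
)
targets = {"Male": 0.6, "Female": 0.4}
weights = weights_by_wave(rows, targets)


def male_share(wave):
    """Weighted share of males within one wave."""
    in_wave = [(c, w) for (v, c), w in zip(rows, weights) if v == wave]
    return sum(w for c, w in in_wave if c == "Male") / sum(w for _, w in in_wave)


print(male_share("January"), male_share("February"))
```

Both waves end up with a 60% weighted male share, which would not be guaranteed if a single weight were computed across the whole data set.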
Making weights sum to sample size or target values
When sample size is selected in this drop-down, Q will ensure that the final weight produces Population sizes that sum to the data file's sample size. When target values is selected, the numbers entered as targets are used to produce the final weight, so the total Population on tables will equal the sum of the population targets entered in the weighting tool.
In this example, the unweighted sample size is 327 but the weight has targets of 162,830,000 for men and 166,240,000 for women, so the sum of the targets in the weighting tool is 329,070,000.
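The difference between the two options is only a matter of scale, as the following Python sketch shows. The sample size and targets come from the example above; the split of the 327 respondents into men and women is an assumption made for illustration.

```python
sample_size = 327
targets = {"Male": 162_830_000, "Female": 166_240_000}
# Hypothetical split of the 327 respondents (not given in the article).
sample_counts = {"Male": 150, "Female": 177}

total_target = sum(targets.values())  # 329,070,000

# "Target values": each respondent stands for target / count people.
target_value_weights = {c: targets[c] / sample_counts[c] for c in targets}

# "Sample size": the same weights, rescaled to sum to the sample size.
sample_size_weights = {
    c: w * sample_size / total_target
    for c, w in target_value_weights.items()
}


def weighted_total(weights):
    """Sum of the weights over all respondents."""
    return sum(weights[c] * sample_counts[c] for c in sample_counts)


print(weighted_total(target_value_weights))  # approximately 329,070,000
print(weighted_total(sample_size_weights))   # approximately 327
```

Percentages computed with either weight are identical, because one is a constant multiple of the other; only the Population totals differ.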
Selecting a design weight variable
Design weights are used to compensate for non-proportional stratification. For example, in a population containing 100,000,000 men and 100,000,000 women, if you used quotas or stratification to achieve a sample of 80 men and 20 women, then a design weight is created to take this non-proportional stratification into account. To learn more about Design Weights, click here.
To use a design weight in Q, use the Select Design Weight Variable drop-down box to select the design weight variable in your data set.
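As a rough guide to how a design weight for the example above could be computed before it is brought into Q, the Python sketch below uses the inverse of each group's sampling fraction and then rescales the result to average 1. The rescaling step and the variable names are assumptions; a design weight supplied with your data may be scaled differently.

```python
# Figures from the example above.
population = {"Men": 100_000_000, "Women": 100_000_000}
sample = {"Men": 80, "Women": 20}

# Raw design weight: how many people each respondent represents.
raw_design_weight = {g: population[g] / sample[g] for g in sample}

# Rescale so the weights average 1 across respondents (assumed convention).
n = sum(sample.values())
mean_raw = sum(raw_design_weight[g] * sample[g] for g in sample) / n
design_weight = {g: raw_design_weight[g] / mean_raw for g in sample}

print(raw_design_weight)
print(design_weight)
```

In this sketch each man is weighted down and each woman weighted up, so that the 80/20 sample is restored to the 50/50 split in the population.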
Specifying upper and lower bounds for weights & specifying alternative weighting algorithms
On the lower left, the Minimum weight and Maximum weight boxes allow you to specify lower and upper bounds for your weights. This technique is called trimming and is designed to ensure that the weight factors are calculated to within certain bounds. For example, if you don't want weights larger than 3, set the maximum to 3 and the minimum to 0. If the bounds you set are not possible, you will get an error message to that effect.
You can also change the algorithm. The choices are Linear, Raking and Logit. The default is Raking. Click here to learn more about how to use trimming and the different algorithms.
By default the controls in this section are greyed out and disabled. To enable them, either select two or more adjustment variables, which enables rim weighting/raking, or select a design weight variable.
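To illustrate what bounds on weights mean, the Python sketch below simply clips a set of weights to the range 0 to 3. This naive clipping is not the algorithm Q uses: Q calculates the weights within the bounds while still meeting the targets, whereas clipping after the fact changes the weighted totals. The example weights are made up.

```python
def clip_weights(weights, minimum, maximum):
    """Naively force every weight into [minimum, maximum]."""
    if minimum > maximum:
        raise ValueError("Minimum weight cannot exceed maximum weight")
    return [min(max(w, minimum), maximum) for w in weights]


raw_weights = [0.2, 0.8, 1.0, 4.0]   # hypothetical untrimmed weights
trimmed = clip_weights(raw_weights, minimum=0.0, maximum=3.0)

print(trimmed)
# The totals differ, which is why clipping alone no longer meets the targets.
print(sum(raw_weights), sum(trimmed))
```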
Saving a weight variable
Once you have specified an appropriate weight, the button on the bottom right becomes active. Pressing this button takes you back to the main interface, where you will see a new weight variable in the data set.
Using weights created outside of Q
In addition to building weights in Q, you can use weights calculated outside of Q. See: How to Make Variables Available as Weights for more information.
To apply a variable as a weight:
- Select the output(s) you want to weight on the Outputs tab
- Then choose the appropriate weight variable from the Weight drop-down box at the bottom of the screen.
As with other Q-constructed variables, the weight variable will be automatically updated if you import a new data file. If any of the target categories change in the new data file or new categories are added, the variable's status will become INVALID until the weight is edited and any issues fixed.
Troubleshooting errors in weights
When creating a weight, there can be a mismatch between the specified targets and the actual sample sizes in the data set that prevents the weight from being calculated. This section describes a common situation in which a mismatch occurs and the steps that can be taken to allow the weight to be calculated.
Problems with weights often occur when the samples in the categories used by the weight are small or empty. Errors that are encountered when constructing a weight can indicate that there are problems with the sample, or that the weight scheme is too complicated. The general approach to solving problems with weighting is to simplify the weighting scheme by either reducing the number of questions that are being used, or by consolidating categories within the input questions.
For example, let's say you attempt to create a weight with the following variable. Notice that the first category has zero cases.
Now, suppose you attempt to create the following weight which includes the empty categories:
Notice the error message at the bottom along with a suggestion about how to correct it, which in this case is to set the targets for the empty categories to 0.
Another solution is to merge the empty categories with other categories that have sample. Merging is done by dragging-and-dropping categories in the table where weight targets are entered.
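Both fixes can be checked before opening the weighting tool. The Python sketch below flags categories that have a non-zero target but no respondents, and then shows the effect of merging an empty category into a neighbour. The age categories, counts, targets, and function names are hypothetical.

```python
def empty_categories_with_targets(sample_counts, targets):
    """Return categories that have a positive target but no respondents."""
    return [
        category
        for category, target in targets.items()
        if target > 0 and sample_counts.get(category, 0) == 0
    ]


def merge_categories(values, first, second, merged_name):
    """Combine two categories of a {category: number} mapping into one."""
    merged = {k: v for k, v in values.items() if k not in (first, second)}
    merged[merged_name] = values[first] + values[second]
    return merged


# Hypothetical adjustment variable whose first category is empty.
sample_counts = {"18 to 24": 0, "25 to 34": 50, "35 or more": 50}
targets = {"18 to 24": 20.0, "25 to 34": 40.0, "35 or more": 40.0}

problems = empty_categories_with_targets(sample_counts, targets)
print(problems)

# Fix by merging the empty category with its neighbour.
merged_counts = merge_categories(sample_counts, "18 to 24", "25 to 34", "18 to 34")
merged_targets = merge_categories(targets, "18 to 24", "25 to 34", "18 to 34")
remaining = empty_categories_with_targets(merged_counts, merged_targets)
```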
Rim weighting does not converge
When the weighting involves multiple adjustment variables, Q uses an algorithm called rim weighting (or raking) to estimate the weight for each respondent so that each of the different sets of targets is achieved. Because each set of targets is independent of the others, one set of weight targets can contradict another by implying two different weighted sample sizes for the same group of respondents. When a contradiction like this occurs, it is logically impossible to calculate an appropriate weight.
Diagnosing and Solving the Problem
If you have used more than two adjustment variables then the first step to solving the problem is to identify which pair of questions is in conflict. It is possible that three or more questions can be in conflict with one another. The following process will allow you to identify which adjustment variables are contributing to the conflict:
- Use Insert Variable(s) > Weight to create your weight. Include as many adjustment variable sets in the weight as possible without getting an error in the Diagnostics Report at the lower right.
- Save the variable by clicking the New Weight button at the lower right.
- For each remaining adjustment variable:
  - Right-click the weight variable in the Variables and Questions tab and select Edit Weight.
  - Complete the weight, ensuring that all targets are entered for all adjustment variables.
  - Remove the current adjustment variable.
  - Check the Diagnostics Report at the bottom right. If an error is still generated, then you know that the adjustment variable that you removed is not the source of the conflict in the weight. If an error is not generated, then the adjustment variable that has been removed is contributing to the conflict.
- Repeat these steps as needed to identify the source of the conflict.
The final step is to try to identify why the questions identified above are in conflict with the weight targets. There is no general solution, and the problem can be tricky to identify. Some trial and error is required. One approach is to:
- Create a cross-tab between each pair of questions identified above, and show the n in the Statistics - Cells.
- Examine the sample sizes in the tables and try to identify those which are particularly small (e.g. n = 1 or n = 2) as these are most likely to present problems.
- Create your weight again.
- In the Edit Weight window, merge the category with the small sample size with another category, enter a target for the combined category and then check the Diagnostics Report. If the same error still occurs, then a different combination of categories is required, and the process should be repeated. The choice of category to merge depends both on which combination makes the most sense for your analysis and on which combination solves the issue.
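The cross-tab check in the steps above can also be approximated outside Q. The Python sketch below counts respondents in every combination of two variables and lists the cells at or below a chosen size, including combinations with no respondents at all. The variables, data, and the threshold of 2 are assumptions made for illustration.

```python
from collections import Counter


def small_cells(var_a, var_b, threshold=2):
    """Return {(category_a, category_b): n} for cells with n <= threshold."""
    counts = Counter(zip(var_a, var_b))
    # Enumerate every possible combination so empty cells are reported too.
    all_cells = [(a, b) for a in sorted(set(var_a)) for b in sorted(set(var_b))]
    return {cell: counts[cell] for cell in all_cells if counts[cell] <= threshold}


# Hypothetical data for two adjustment variables, one entry per respondent.
gender = ["Male", "Male", "Male", "Male",
          "Female", "Female", "Female", "Female"]
region = ["North", "North", "North", "South",
          "North", "North", "North", "North"]

suspects = small_cells(gender, region)
print(suspects)
```

Cells reported by a check like this are the first candidates for merging in the weighting tool.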