It’s tempting to start writing or scripting a questionnaire by thinking of all of the questions you want to ask and then just adding them without considering what the data would look like once it has been collected. Still, it’s generally more helpful to think of what your deliverables are and what your data should look like to easily help you achieve them, and then work backward to your questionnaire. So, before creating your script, it is important to Create an Analysis Plan.
The ways we sometimes program questions in a questionnaire can make it difficult for Q to recognize variables and structure them correctly. In this article, we will not only show you how to script/program your questionnaire but also how the data should be stored once the data is exported from your data collection platform to help Q better structure your data. We will cover some common pitfalls that usually make it difficult for Q to correctly structure your questions.
Before continuing with this article you should have a sound understanding of the different Variable Set Structures & Question Types available in Q.
Let's go...
Single Response Questions (Pick One)
Multi-Response Questions (Pick Any)
Multi-Response Questions (Pick Any - Grid)
Single Response Questions (Pick One-Multi)
Single Response Questions (Pick One)
What does a single-response question typically look like in a questionnaire?
A single-response question allows the respondent to select only one answer in the questionnaire (it is typically denoted with a radio button in the data collection platform). In the example below, Question d1 of the questionnaire asks respondents how old they are. Each respondent had to select only one of the 9 Age categories:
How should the data be stored?
The data should be stored as a single column in the data file. If we look at the data for the Age question, we will note that it is stored as a single column of data that contains all the respondents' answers. In the example below, respondent number 1 selected 25-29 (the Label), which relates to category 27 (the Value) in the question above.
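As a rough illustration only, the single-column layout can be sketched in Python. The name AGE_LABELS, the sample answers, and the 30-34 category are assumptions made for this sketch; only the pairings 25-29 = 27 and 18-24 = 21 come from this article.

```python
# Hypothetical value labels for question d1 (Age).
# Only 27 -> "25-29" and 21 -> "18-24" appear in the article; 32 is assumed.
AGE_LABELS = {21: "18-24", 27: "25-29", 32: "30-34"}

# One column, one value per respondent (None = blank / missing).
d1 = [27, 21, 32, None, 27]


def is_single_response(column, labels):
    """True if every non-missing entry is exactly one known Value."""
    return all(value is None or value in labels for value in column)


# Translate the stored Values back into their Labels for reading.
decoded = [AGE_LABELS.get(value) for value in d1]
```

Here respondent number 1 holds the Value 27, which decodes to the Label 25-29.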
What Question Type will Q assign to this question?
If the data is stored correctly, as per the example above, Q will assign the Question Type as a Pick One.
Multi-Response Questions (Pick Any)
What does a multi-response question typically look like in a questionnaire?
A multi-response question presents respondents with a list of options and permits them to choose multiple answer options from the list (it is typically denoted with a square check box in the data collection platform). In the example below, Question 1b of the questionnaire asks respondents which cola brands they have ever heard of. Each respondent was permitted to select multiple cola brands:
How should the data be stored?
The data should be stored as multiple adjacent columns (also known as Variables), one for each possible answer option in the questionnaire. In the example of q1b, we would, therefore, have 5 adjacent columns (one for each brand the respondent could select).
Here are some key considerations when setting up a multi-response question in your data collection platform (see also Checking Multi Response Questions):
- In most situations, multi-response questions should be set up in a binary format. A binary question contains exactly two unique Values and should ideally be denoted with the values 0 and 1.
- If a respondent skips one of the choices, you can also have a blank or missing value.
- When setting up the values and labels for these binary variables, it is important that the same Label is used for all options. In particular, the value label should not contain the name of the option being evaluated. In our q1b example the label should not be Coca-Cola or Diet Coke etc. but rather be 1 = Aware or 0 = Not Aware, 1 = Selected or 0 = Not Selected.
q1b. Which of the following cola brands have you ever heard of?

| Values | Labels |
|---|---|
| SYSMIS (MISSING DATA) | Option not shown |
| 0 | Not Aware |
| 1 | Aware |
- The data for the same question should be stored in columns adjacent to each other.
- Ensure Variable Names are unique for each Question (e.g. q1b) and ensure there are no duplicate names across different questions.
- The variables that are part of the same question should follow a predictable, simple pattern: q1b_1, q1b_2, q1b_3, etc.
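The considerations above can be sketched in Python as follows. This is a minimal sketch, not how any particular data collection platform exports data: the function to_binary_columns and the brands "Brand C" to "Brand E" are hypothetical placeholders, since the article only names Coca-Cola and Diet Coke.

```python
# Hypothetical brand list for q1b (five options, as in the article).
BRANDS = ["Coca-Cola", "Diet Coke", "Brand C", "Brand D", "Brand E"]


def to_binary_columns(selected, shown=None, prefix="q1b"):
    """Turn one respondent's selections into q1b_1 ... q1b_5 columns.

    1 = Aware, 0 = Not Aware, None = option not shown (missing).
    """
    shown = BRANDS if shown is None else shown
    row = {}
    for index, brand in enumerate(BRANDS, start=1):
        name = f"{prefix}_{index}"
        if brand not in shown:
            row[name] = None          # blank, not 0
        else:
            row[name] = 1 if brand in selected else 0
    return row


# A respondent who saw the first four brands and picked two of them.
row = to_binary_columns({"Coca-Cola", "Brand D"}, shown=BRANDS[:4])
```

Every column shares the same two Values (0 and 1), the names follow the q1b_1, q1b_2 pattern, and the option that was not shown stays blank.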
What Question Type will Q assign to this question?
If the data is stored correctly, as per the example above, Q will assign the Question Type as a Pick Any.
Multi-Response Questions (Pick Any - Grid)
What does a multi-response grid question typically look like in a questionnaire?
A multi-response grid question presents respondents with a list of options, typically in a grid, and permits them to choose multiple answer options for each row in the grid (it is typically denoted with a square check box in the data collection platform). You can also think of it as a series of multi-response questions.
In the question below, respondents were asked to select which brands they felt best fit with each statement. Each respondent was permitted to select multiple cola brands per statement.
These questions could typically take two shapes in a questionnaire and can either be asked as a grid, as per the example below:
Or it can be asked as a looped question (i.e., the same question gets asked but loops through the different statements, each on its own screen).
How should the data be stored?
The data should be stored as multiple adjacent columns (also known as Variables), one for each combination of answers in the questionnaire. In the example of q5, we would, therefore, have 18 adjacent columns (one for each statement - brand combination the respondent could select).
Here are some other key considerations to keep in mind before exporting your data from your data collection platform. This will help Q correctly store your question as a Pick Any - Grid.
- Each combination of answers should be stored as its own column of data. In the example above the first column is q5a1 which represents feminine - Coke, the next column q5a2 is feminine - Diet Coke, etc.
- Ensure you set up response variables with labels that read “Question Text - Choice Text”, e.g. feminine - Coke, feminine - Diet Coke, etc. If possible, arrange these labels from General (the Question Text) to Specific (the Choice Text).
- The data for the same question should be stored in columns adjacent to each other.
- Ensure Variable Names are unique for each Question (e.g. Q5) and ensure there are no duplicate names across different questions.
- The variables that are part of the same question should follow a predictable, simple pattern: Q5a1, Q5a2, Q5b1, Q5b2. Here, the a and b denote the General Question Text i.e. a will be all the feminine answers and b will be the health-conscious answers. The numbers represent the specific Choice Text, where 1 will represent Coca-Cola and 2 represent Diet Coke etc.
- Some survey platforms cut off labels over a certain length. Brief, unique variable labels make it more likely the full label will come through when saving the file
- As per the Pick Any questions, the data should be stored in binary format with 1 (value) representing yes/selected (label) and 0 representing no/not selected. It is important to ensure that the labels are consistent for every variable in the question. So, maintain the labels as yes/no for the whole question.
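To make the naming and labelling pattern concrete, here is a small Python sketch. It assumes only the two statements and two brands named in this article (the real q5 has 18 columns), and grid_variables is a hypothetical helper, not a Q function.

```python
import string

STATEMENTS = ["feminine", "health-conscious"]  # General: Question Text
CHOICES = ["Coke", "Diet Coke"]                # Specific: Choice Text


def grid_variables(prefix, statements, choices):
    """Return (name, label) pairs such as ("Q5a1", "feminine - Coke")."""
    pairs = []
    for s_index, statement in enumerate(statements):
        letter = string.ascii_lowercase[s_index]   # a, b, c, ...
        for c_index, choice in enumerate(choices, start=1):
            name = f"{prefix}{letter}{c_index}"
            label = f"{statement} - {choice}"
            pairs.append((name, label))
    return pairs


variables = grid_variables("Q5", STATEMENTS, CHOICES)
```

The pairs come out in adjacent order (Q5a1, Q5a2, Q5b1, Q5b2), with the letter tracking the statement and the number tracking the brand.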
What Question Type will Q assign to this question?
If the data is stored correctly, as per the example above, Q will assign the Question Type as a Pick Any - Grid:
This article gives some more insight on How to set Value Attributes for a Pick Any and Pick Any-Grid in Q.
Single Response Questions (Pick One-Multi)
What does a Pick One - Multi question typically look like in a questionnaire?
A Pick One - Multi presents respondents with a list of options, typically in a grid, and permits them to choose only a single answer option for each row in the grid (it is typically denoted with a radio button in the data collection platform). In the example below, we asked respondents to tell us how they feel about different Cola brands. Each respondent had to provide one answer on the scale (Hate to Love) for each of the brands listed.
These questions could typically take two shapes in a questionnaire and can either be asked as a grid, as per the example below:
Or it can be asked as a looped question (i.e. the same question gets asked but loops through the different statements, each on its own screen).
How should the data be stored?
At their core, Pick One - Multi questions are a series of Pick One questions with the data stored in multiple adjacent columns. Each column represents one category respondents were asked about, e.g. in our example above Q4a will be Coca-Cola, Q4b will be Diet Coke, etc. Each column shares exactly the same Values and Labels.
Here are some other key considerations to keep in mind before exporting your data from your data collection platform. This will help Q correctly store your question as a Pick One - Multi.
- Each category should be stored in its own column
- Each column should have exactly the same Values and Labels, and they should follow the same order. See the example below:
q4. How do you feel about the following Cola Brands?

| Values | Labels (Coca-Cola) | Labels (Diet Coke) |
|---|---|---|
| -2 | Hate | Hate |
| -1 | Dislike | Dislike |
| 0 | Neither like or dislike | Neither like or dislike |
| 1 | Like | Like |
| 2 | Love | Love |
- The data for the same question should be stored in columns adjacent to each other.
- Ensure you set up response variables with labels that read “Question Text - Choice Text”, e.g. Brand Attitude - Coke, Brand Attitude - Diet Coke, etc. If possible, arrange these labels from General (the Question Text) to Specific (the Choice Text).
- Ensure Variable Names are unique for each Question (e.g. Q4) and ensure there are no duplicate names across different questions.
- The variables that are part of the same question should follow a predictable, simple pattern: Q4a, Q4b, etc.
- Some survey platforms cut off labels over a certain length. Brief, unique variable labels make it more likely the full label will come through when saving the file.
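One way to check the "same Values and Labels, in the same order" rule before exporting is sketched below in Python. The value_labels dictionary stands in for whatever metadata your platform exports, Q4c is a deliberately mis-coded, made-up variable, and inconsistent_variables is a hypothetical helper.

```python
# The q4 scale from the example above.
SCALE = {-2: "Hate", -1: "Dislike", 0: "Neither like or dislike",
         1: "Like", 2: "Love"}

# Hypothetical metadata: variable name -> {Value: Label}.
value_labels = {
    "Q4a": dict(SCALE),
    "Q4b": dict(SCALE),
    # Assumed bad example: same Labels but different Values.
    "Q4c": {1: "Hate", 2: "Dislike", 3: "Neither like or dislike",
            4: "Like", 5: "Love"},
}


def inconsistent_variables(value_labels):
    """Names whose Value/Label pairs, or their order, differ from the first variable."""
    names = list(value_labels)
    if not names:
        return []
    reference = list(value_labels[names[0]].items())
    return [name for name in names[1:]
            if list(value_labels[name].items()) != reference]


problems = inconsistent_variables(value_labels)
```

In this sketch only Q4c is flagged, because its Values run from 1 to 5 while the other columns run from -2 to 2.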
What Question Type will Q assign to this question?
If the data is stored correctly, as per the example above, Q will assign the Question Type as a Pick One - Multi:
There you go. If you follow these simple rules, you should be on your way to having a well-structured data set when you import it into Q.
Finally, here are some additional tips to keep in mind when scripting your survey:
Missing Values
- Mark missing values (skipped/didn’t see the question) with a Blank value.
- Do not mark it with a 0 - reserve 0s to indicate “Seen but didn’t select the choice.” Otherwise, when you pull it into Q, those values will be counted as part of the base, which may or may not be correct, especially if you want the base to be only brand users, for example.
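The effect on the base can be illustrated with a few lines of Python. This is a simplified sketch of the arithmetic, not Q's actual computation, and the five sample respondents are made up.

```python
def base_and_percent(column):
    """Base = respondents with a non-missing value; percent = share of the base coded 1."""
    answered = [value for value in column if value is not None]
    base = len(answered)
    percent = 100 * sum(answered) / base if base else None
    return base, percent


# Two of five respondents never saw the question (e.g. not brand users).
blank_for_missing = [1, 0, 1, None, None]   # recommended coding
zero_for_missing = [1, 0, 1, 0, 0]          # skipped respondents coded as 0

blank_base, blank_percent = base_and_percent(blank_for_missing)
zero_base, zero_percent = base_and_percent(zero_for_missing)
```

With blanks the base is the 3 respondents who saw the question; with 0s the base grows to all 5 and the percentage selecting the option drops to 40%.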
Other Specify answer options
- Include 'Other specify' as part of the answer options, and when a respondent selects it, open up a text box where they can type their answers.
- Including it as a category will help accurately set the base of the question and aid in back coding.
- The text responses should be stored as a separate text variable.
Scale Questions
- Ensure your scale is always coded the "right way" around (even if it is not the order in which the respondent sees the scale), e.g. Love = 5, Hate = 1, so that the highest value represents the highest score and the lowest value represents the lowest score.
- This will ensure that transformation scripts you run in Q, such as T2B, always compute correctly (see How to Compute Top 2 Box Scores).
- This also helps if doing advanced analyses and avoids the need for recoding.
- If you can, script the midpoint of each Age category into its Value, so that any average age you compute is correct. The same goes for any question where you want to calculate averages during analysis. In other words, if a respondent gives their age as the range 18-24, the Value for this answer option should, if possible, be 21 instead of 1. This will make certain calculations easier during analysis.
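Both scale tips can be checked with a short Python sketch. It illustrates the arithmetic only and is not the script Q runs; top_2_box and the sample answers are assumptions made for this example.

```python
from statistics import mean


def top_2_box(scores, scale_max=5):
    """Percentage of non-missing scores in the two highest scale points."""
    answered = [score for score in scores if score is not None]
    if not answered:
        return None
    top = [score for score in answered if score >= scale_max - 1]
    return 100 * len(top) / len(answered)


# Scale coded the "right way" around: 1 = Hate ... 5 = Love.
attitude = [5, 4, 2, 1, None, 4]
t2b = top_2_box(attitude)

# Ages stored as category midpoints: 18-24 -> 21, 25-29 -> 27.
ages = [21, 27, 27, 21]
average_age = mean(ages)
```

Because Love carries the highest Value, the top two codes are the positive answers, and because the Age Values are midpoints, the mean is an age (24 here) rather than an average of category codes.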
Create Hidden/Dummy Variables
- If you can, create hidden variables that you can reuse or copy. They speed up analysis later and eliminate the need to manually create NETs.
- Say you ask people where they live in the UK using granular regions, but later in the survey you only want to ask a set of questions of people who live in Wales.
- Keep a hidden variable that marks a respondent as Welsh if they said they live in Cardiff.
- These hidden variables are great to have as you do not have to spend time grouping them in Q.
- The same goes for any special groupings that will be important in your analysis. If it’s easy to set up a calculation as a hidden variable that you’re going to use across many surveys, that may save time creating variables during analysis in Q, as it will be pre-loaded with your data set.
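A hidden variable of this kind boils down to a lookup, sketched here in Python. The REGION_TO_NATION table and the lives_in_wales helper are hypothetical; in practice you would define the equivalent logic in your data collection platform's own scripting tools.

```python
# Hypothetical lookup; only "Cardiff" as a Welsh answer comes from the article.
REGION_TO_NATION = {
    "Cardiff": "Wales",
    "Swansea": "Wales",
    "Leeds": "England",
    "Glasgow": "Scotland",
}


def lives_in_wales(region):
    """Hidden 0/1 flag; None (blank) when the region question was not answered."""
    if region is None:
        return None
    return 1 if REGION_TO_NATION.get(region) == "Wales" else 0


regions = ["Cardiff", "Leeds", None, "Swansea"]
hidden_wales = [lives_in_wales(region) for region in regions]
```

The flag is stored alongside the granular region, so the Wales grouping arrives in Q ready to use, and a respondent with no region answer stays blank rather than being coded 0.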