When analyzing data, it often becomes necessary to create new variables based on a logical expression, or on one or more existing variables in your file. For example, you might want to create a new variable called AgeGroup based on a variable already in your file called Age.
This article discusses the most common ways to add new variables to your file:
Duplicating and modifying variables and questions
Binary - Complicated variables
QScripts in the Online Library
Creating variables by mathematical expression
Creating new variables by writing your own QScript
Adding variables created in other programs
Modifying existing data
Modifying the Question Type of data (see How to Setup and Change Question Types). In Q it is possible to achieve many results that are ordinarily computed by creating variables in other programs by:
- Changing Value Attributes.
- Merging categories on tables.
- Manually edited on the Data tab. For example, individual values can be changed and observations can be deleted. This does not cause the raw data to change. However, any changes made on the Data tab flow through to all analyses that use the data.
For example, this is usually the fastest way to compute things like:
- Top 2 Box Scores
- Shares from numeric data (see How to Compute Share of Wallet, Spend, Mouth, etc. (i.e., Volumetric Analysis))
- NPS Scores
- NETs
Duplicating and modifying variables and questions
Whenever you make a modification to a question's Question Type, row/column order, labels, category nets and merges, and other things that are not done through a Rule on a table, these modifications will carry through to all outputs using the question. This is handy because you only need to "fix" your question once and it will be shown as you like it in all outputs. However, sometimes you may want to show the question data in different ways within the same report. In this situation:
- Duplicate the existing variable or set of variables (see How to Copy and Paste Variables and Questions).
- Modify the duplicated data.
Binary - Complicated variables
When a Filter is created, Q automatically creates a binary variable. This is usually the best way to construct variables requiring boolean logic. See How to Create Filters and How to Create Binary Variables.
Ready-Made Formula(s)
The Ready-Made Formula(s) Menu contains various automated routines for:
- Computing the most common mathematical functions: Sum, Minimum, Maximum, Average, Standard Deviation, Variance, Count, Any of and None of. Select the data in the Variables and Questions tab and select Insert Ready-Made Formula(s) > Mathematical Functions (by Case).
- Automatically combining different questions, such as merging them and creating new questions with all combinations of the input questions.
- Restructuring data (e.g., converting multiple binary variables into a categorical variable).
Text Coding
Text Coding or categorization is often the most efficient way to create new variables where the variables need to reflect a categorization of a large number of categories. It is principally designed for text data. See: Text Analysis
Logic variables
Logic Variables are a relatively simple type of variable used when creating custom categories in a Banner, such as segments, life stage, or other categories that are constructed using one or more inputs.
QScripts in the Online Library
Many standard problems can be solved using Automate > Online Library and searching for specific QScripts. For example: And, they can be created using the Online Library (Tools > QScripts: Online Library and locate the script in the menu):
- Create New Variables - Top 2 Category Variable(s) (Top 2 Boxes)
- Create New Variables - Recode Net Promoter Score (NPS) Variable(s)
- Create New Variables - Numeric Variable(s) from Code/Category Midpoints
- Filtering - Filters from Selected Data
Creating variables by mathematical expression
There are three ways of creating custom calculated variables from scratch (i.e., by typing an expression, such as q1 + q5):
1. A JavaScript variable, see: How to Create Text Variables Using JavaScript and How to Work with Conditional JavaScript Formulas - this is preferable to doing simple calculations and recoding of variables
2. An R variable, see: How to Create a Custom R Variable - this is preferable when needing more complex calculations and calculations aggregate data across respondents
3. Less flexible, is to use Excel-Style Formulas, see: How to Create Excel-Style Formulas
Creating new variables by writing your own QScript
You can also do the entire process of creating variables by writing code (i.e., QScript). Please keep in mind that while in many products this is the smart way of working, in Q this approach is only best practice when you are wanting to automate the process. For example, if you need to automatically create 1,000 new variables, then this approach may make sense, although there will often be faster and easier options.
Adding variables created in other programs
See How to Merge Data Files and How to Paste Data (e.g. From Excel)).