Binary variables are variables that take only two values, such as Male or Female, True or False, or Yes or No. While many variables and questions are naturally binary, it is often useful to construct binary variables from other types of data, for example by turning age into two groups: less than 35, and 35 or more. Constructing binary variables is also known as quantizing or dichotomizing.
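As a rough illustration of the age example above, the following plain JavaScript sketch dichotomizes a list of ages at 35. The function name dichotomizeAge, the cutoff parameter and the sample ages are hypothetical, and the choice of coding "35 or more" as 1 is an assumption made only for this example.

```javascript
// Hypothetical helper: returns 1 for ages of 35 or more, 0 for ages under 35,
// and NaN when the age is missing or not a number.
function dichotomizeAge(age, cutoff = 35) {
  if (typeof age !== "number" || Number.isNaN(age)) return NaN;
  return age >= cutoff ? 1 : 0;
}

// Made-up sample data; NaN stands in for a missing answer.
const ages = [22, 35, 61, NaN, 34];
const is35OrMore = ages.map((age) => dichotomizeAge(age));
console.log(is35OrMore); // [0, 1, 1, NaN, 0]
```

The missing age is passed through as NaN rather than being forced into one of the two groups.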
Variables or values you want to turn into two groups.
Any of the ways for creating numeric variables can be used to create binary variables. However, the main approaches are:
Creating a Binary - Complicated Filter
- Right-click on the Variables and Questions tab
- Select Insert Variable(s) > Binary - Complicated Filter.
- Any Filters that are created (from a cell in a table, for example) are binary variables. See: How to Create Filters
Creating a binary question by changing Question Type
- When you use Set Question to create a Pick Any or Pick Any – Grid question, each of the variables in the question becomes binary. See: How to Combine Variables into a Single Question
- Binary variables can be constructed by editing the Values in the Value Attributes so that they all take only two values (generally, 0 and 1 are most appropriate).
- Banner questions are intrinsically binary (i.e., they have the Question Type of Pick Any). See: How to Create a Banner
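The idea of editing values so that a variable takes only two values can be sketched outside Q as a simple lookup. The example below is an illustration only, not Q's Value Attributes mechanism: the table recodeTable, the function recodeToBinary and the 5-point scale are all hypothetical.

```javascript
// Hypothetical recoding table for a 5-point scale: codes 4 and 5 become 1,
// codes 1 to 3 become 0.
const recodeTable = new Map([
  [1, 0],
  [2, 0],
  [3, 0],
  [4, 1],
  [5, 1],
]);

// Codes that are not in the table are treated as missing (NaN).
function recodeToBinary(value, table = recodeTable) {
  return table.has(value) ? table.get(value) : NaN;
}

// Made-up ratings; 99 is an unlisted code such as "Don't know".
const ratings = [5, 3, 4, 1, 99];
const topTwoBox = ratings.map((v) => recodeToBinary(v));
console.log(topTwoBox); // [1, 0, 1, 0, NaN]
```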
The following code returns a value of 1 if the observation has a value of 1 for any of the three input variables (d1, d2 and d3), and otherwise returns 0:
if (d1 == 1 || d2 == 1 || d3 == 1) 1; else 0
This code returns a value of 1 if the observation has a value of 1 for all of the input variables (d1, d2 and d3), and otherwise returns 0:
if (d1 == 1 && d2 == 1 && d3 == 1) 1; else 0
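Note here that && denotes the "AND" condition, just as || denotes the "OR" condition in the previous example. The same result can be obtained by writing the condition on its own:
d1 == 1 && d2 == 1 && d3 == 1
If the statement is true, a 1 is returned; if it is false, a 0 is returned.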
Additionally, if we know that the input variables are themselves binary, taking only values of 1 and 0 (with no missing values), we could write either:
d1 && d2 && d3
or:
d1 * d2 * d3
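The equivalence of these forms for strictly binary inputs can be checked with a short plain JavaScript sketch. The function names below are hypothetical wrappers around the expressions above, and the check assumes inputs that are exactly 0 or 1 with no missing values.

```javascript
// Three ways of writing "all of d1, d2 and d3 equal 1".
function allExplicit(d1, d2, d3) {
  if (d1 == 1 && d2 == 1 && d3 == 1) return 1;
  else return 0;
}

function allLogical(d1, d2, d3) {
  // && returns one of its operands, which here is always 0 or 1.
  return d1 && d2 && d3;
}

function allProduct(d1, d2, d3) {
  // The product is 1 only when every input is 1.
  return d1 * d2 * d3;
}

// Compare the three forms over all eight combinations of 0/1 inputs.
let formsAgree = true;
for (const d1 of [0, 1]) {
  for (const d2 of [0, 1]) {
    for (const d3 of [0, 1]) {
      const expected = allExplicit(d1, d2, d3);
      if (allLogical(d1, d2, d3) !== expected || allProduct(d1, d2, d3) !== expected) {
        formsAgree = false;
      }
    }
  }
}
console.log(formsAgree); // true
```

The shortcut forms rely on the inputs being 0 or 1. With other codes (for example 1 and 2) or with missing values, they no longer match the explicit comparison.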
Missing values in binary variables
In theory, binary variables should have only two values. In practice, it is often useful for them to also have missing values, in which case the Binary - Complicated Filter and the methods based on creating Filters tend not to be useful.
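One way to keep missing values when constructing a binary variable in code is to test for them explicitly before applying the condition. The sketch below is plain JavaScript with a hypothetical function name, anyOrMissing. It assumes that missing values are represented as NaN and that a missing input should make the result missing, which is only one of several reasonable rules.

```javascript
// Returns NaN if any input is missing (NaN); otherwise returns 1 if any
// input equals 1, and 0 if none does.
function anyOrMissing(...inputs) {
  if (inputs.some((v) => Number.isNaN(v))) return NaN;
  return inputs.some((v) => v === 1) ? 1 : 0;
}

console.log(anyOrMissing(0, 1, 0)); // 1
console.log(anyOrMissing(0, 0, 0)); // 0
console.log(anyOrMissing(0, NaN, 1)); // NaN
```

An alternative rule would return 1 whenever any non-missing input is 1 and reserve NaN for cases where the answer cannot be determined; which rule is appropriate depends on the analysis.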
Other Notes on Binary Variables
To most people, averages and percentages are quite different concepts: averages apply to numeric data (e.g., number of pizzas eaten in a week) and percentages to categories (e.g., favorite brand of pizza). From a computational perspective, however, averages and proportions are very closely related, and this relationship can be exploited in Q to save time. If you have used SPSS, you likely already understand the basic principles demonstrated in this section; if not, they may seem a bit strange at first.
If you construct binary variables by recoding or constructing numeric values so that they take only the values 0, 1 and NaN, any computed averages will also be proportions. If, for example, you have a sample with 56% males, and you recode the gender variable so that males have a value of 1 and females 0, and convert its Question Type to either Number or Number – Multi, the average will be 0.56. The main benefit of binary variable “maths” is that, while a variable that Q knows is binary will always have a NET, a numeric variable instead has a SUM. If, for example, the question measured the brands that the consumer would consider buying, the SUM would measure the consideration set size (whereas with a traditional binary variable, the NET would indicate the proportion of the sample that considers one or more brands).
Why does this work? Because any binary variable is, by definition, implicitly also a Numeric Variable. Numeric Variables are variables that can take any numeric value; binary variables take only the values 0 and 1, and are thus Numeric Variables.
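The relationship between averages and proportions, and between SUM and NET, can be illustrated with a small plain JavaScript sketch. The data below are made up, the function and variable names are hypothetical, and the calculation only mimics what is described above rather than reproducing how Q computes its statistics.

```javascript
// Mean of the non-missing values; NaN entries are skipped.
function mean(values) {
  const valid = values.filter((v) => !Number.isNaN(v));
  if (valid.length === 0) return NaN;
  return valid.reduce((total, v) => total + v, 0) / valid.length;
}

// Hypothetical gender variable recoded so that 1 = male, 0 = female.
const male = [1, 0, 1, NaN, 0];
console.log(mean(male)); // 0.5, the proportion of males among valid cases

// Hypothetical brand consideration data: one row per respondent,
// one 0/1 column per brand.
const considered = [
  [1, 1, 0],
  [0, 0, 0],
  [1, 0, 0],
  [1, 1, 1],
];

// SUM per respondent: the consideration set size.
const setSize = considered.map((row) => row.reduce((t, v) => t + v, 0));
// NET per respondent: 1 if one or more brands are considered.
const net = considered.map((row) => (row.some((v) => v === 1) ? 1 : 0));

console.log(mean(setSize)); // 1.5, the average consideration set size
console.log(mean(net)); // 0.75, the proportion considering one or more brands
```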