How to Create Binary Variables

Binary variables are variables that take only two values, for example Male or Female, True or False, and Yes or No. While many variables and questions are naturally binary, it is often useful to construct binary variables from other types of data, for example by turning age into two groups: less than 35, and 35 or more. Constructing binary variables is also known as quantizing and dichotomizing.
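As a minimal sketch of the age example, the following plain JavaScript splits ages at 35. The function name toUnder35 and the sample ages are hypothetical, chosen only for illustration, and passing missing ages through as NaN is an assumption rather than a rule stated by this article:

```javascript
// Hypothetical helper: 1 if the respondent is under 35, 0 if 35 or more.
// A missing age (NaN) is passed through as NaN rather than forced into a group.
function toUnder35(age) {
  if (Number.isNaN(age)) return NaN;
  return age < 35 ? 1 : 0;
}

// Illustrative ages, not real data.
const ages = [22, 35, 61, 34];
const under35 = ages.map(toUnder35);
console.log(under35); // [1, 0, 0, 1]
```

Note that 35 itself falls in the "35 or more" group because the comparison is strictly less than.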

Requirements

Variables or values you want to turn into two groups.

Methods

Any of the ways for creating numeric variables can be used to create binary variables. However, the main approaches are:

Creating a Binary - Complicated Filter

Right-click on the Variables and Questions tab
Select Insert Variable(s) > Binary - Complicated Filter.

To learn more about how the Binary - Complicated Filter creator works, see: Binary Variable

Creating filters

Any Filters that are created (from a cell in a table, for example) are binary variables. See: How to Create Filters

Creating a binary question by changing Question Type

When you use Set Question to create a Pick Any or Pick Any – Grid question, each of the variables in the question becomes binary. See: How to Combine Variables into a Single Question

Recoding

Binary variables can be constructed by editing the Values in the Value Attributes so that they all take only two values (generally, 0 and 1 are most appropriate).
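The effect of such a recode can be sketched in plain JavaScript. The five-point scale and the mapping (4 and 5 become 1, everything else becomes 0) are assumptions made for illustration; in Q itself the recode is made by editing the Value Attributes, not by writing code.

```javascript
// Hypothetical recode table for a five-point scale: original value -> new value.
const recodeMap = new Map([[1, 0], [2, 0], [3, 0], [4, 1], [5, 1]]);

// Values without an entry in the table (for example, missing data) become NaN.
function recode(value) {
  return recodeMap.has(value) ? recodeMap.get(value) : NaN;
}

// Illustrative ratings, not real data.
const ratings = [5, 3, 4, 1, NaN];
const topTwoBox = ratings.map(recode);
console.log(topTwoBox); // [1, 0, 1, 0, NaN]
```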

Creating banner questions

Banner questions are intrinsically binary (i.e., they have the Question Type of Pick Any). See: How to Create a Banner

Custom calculated variables using JavaScript

Simple syntax

The following code computes a value of 1 if the observation has a value of 1 for any of the three input variables (d1, d2 and d3), otherwise returns a 0:

if (d1 == 1 || d2 == 1 || d3 == 1) 1;
else 0

This code computes a value of 1 if the observation has a value of 1 for all of the input variables (d1, d2 and d3), otherwise returns a 0:

if (d1 == 1 && d2 == 1 && d3 == 1) 1;
else 0

Note that || denotes the "OR" condition in the first example, and && denotes the "AND" condition in the second.
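To try the same two rules outside of Q, they can be wrapped in ordinary JavaScript functions. This is only a sketch: the function names anyOf and allOf are hypothetical, and the explicit return statements are there because standalone JavaScript requires them.

```javascript
// Hypothetical helper: 1 if any of the three inputs equals 1, otherwise 0.
function anyOf(d1, d2, d3) {
  if (d1 == 1 || d2 == 1 || d3 == 1) return 1;
  else return 0;
}

// Hypothetical helper: 1 only if all three inputs equal 1, otherwise 0.
function allOf(d1, d2, d3) {
  if (d1 == 1 && d2 == 1 && d3 == 1) return 1;
  else return 0;
}

console.log(anyOf(0, 1, 0)); // 1
console.log(allOf(0, 1, 0)); // 0
console.log(allOf(1, 1, 1)); // 1
```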

Shorter syntax

Those new to JavaScript may be a little surprised by these expressions, as they often expect to be able to write the symbols only once (e.g., writing d1 = 1 & d2 = 1 instead of d1 == 1 && d2 == 1). There are good but technical reasons why JavaScript does not work this way (e.g., a single = is used to assign a value to a variable). However, there are a number of ways to make the code much shorter if that is desired. There is no need for an if statement, because the result of a logical expression (true or false) is automatically treated as a 1 or 0, and thus we can write:

d1 == 1 && d2 == 1 && d3 == 1

If the statement is true, a 1 will be returned and if false a 0 will be returned.
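Q turns the result of this expression into a 1 or 0. In standalone JavaScript the same expression yields true or false, so a sketch that runs outside Q needs an explicit conversion. Number(...) is standard JavaScript; the function name allOnes is hypothetical.

```javascript
// Hypothetical helper: the short form of the "all inputs equal 1" rule.
// Number(true) is 1 and Number(false) is 0.
function allOnes(d1, d2, d3) {
  return Number(d1 == 1 && d2 == 1 && d3 == 1);
}

console.log(allOnes(1, 1, 1)); // 1
console.log(allOnes(1, 0, 1)); // 0

// Without the conversion, plain JavaScript reports a boolean.
console.log(1 == 1 && 0 == 1); // false
```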

Additionally, if we know that the input variables are themselves binary, only taking values of 1 and 0 (and with no missing values), we could write:

d1 && d2 && d3

or, equivalently:

d1 * d2 * d3
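The condition about missing values matters. The following sketch, in plain JavaScript, checks that the two forms agree for every combination of 0 and 1, and then shows one case where they differ once a missing value is involved. Representing a missing value as NaN is an assumption made for this illustration.

```javascript
// Enumerate every combination of three 0/1 inputs and compare the two forms.
let allAgree = true;
for (const d1 of [0, 1]) {
  for (const d2 of [0, 1]) {
    for (const d3 of [0, 1]) {
      if ((d1 && d2 && d3) !== d1 * d2 * d3) allAgree = false;
    }
  }
}
console.log(allAgree); // true

// With a missing value (assumed here to be NaN) the two forms can disagree:
// && stops at the first 0, while multiplication propagates NaN.
const andWithMissing = 0 && NaN;
const productWithMissing = 0 * NaN;
console.log(andWithMissing);     // 0
console.log(productWithMissing); // NaN
```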

Automation

If there is a need to create many related binary variables using JavaScript, this can be done using either QScript or the Use as a Template for Replication script.
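One part of such automation, building the expression text for each new variable, can be sketched in plain JavaScript. Everything here is hypothetical: the group labels, the variable names and the helper anyExpression are invented for illustration. How the resulting text is attached to new variables in Q is deliberately not shown, because that depends on Q's own scripting interface.

```javascript
// Hypothetical groups of input variable names, one group per binary variable.
const groups = {
  anyBrandA: ["a1", "a2", "a3"],
  anyBrandB: ["b1", "b2"],
};

// Build an "any input equals 1" expression for a list of variable names.
function anyExpression(names) {
  return names.map((name) => name + " == 1").join(" || ");
}

// One expression string per group, keyed by the group label.
const expressions = {};
for (const [label, names] of Object.entries(groups)) {
  expressions[label] = anyExpression(names);
}

console.log(expressions.anyBrandA); // a1 == 1 || a2 == 1 || a3 == 1
console.log(expressions.anyBrandB); // b1 == 1 || b2 == 1
```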

Missing values in binary variables

In theory, binary variables should only have two values. In practice, it is often useful if they can also have missing values, in which case the Binary - Complicated Filter and the methods based around creating Filters tend not to be useful.
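When the binary variable is computed with JavaScript instead, missing values can be carried through explicitly. The following is a minimal sketch in plain JavaScript, assuming missing values are represented as NaN. The function name anyOfOrMissing is hypothetical, and treating the result as missing whenever any input is missing is just one possible rule.

```javascript
// Hypothetical helper: NaN if any input is missing, otherwise 1 if any input
// equals 1 and 0 if none does.
function anyOfOrMissing(d1, d2, d3) {
  if (Number.isNaN(d1) || Number.isNaN(d2) || Number.isNaN(d3)) return NaN;
  return d1 == 1 || d2 == 1 || d3 == 1 ? 1 : 0;
}

console.log(anyOfOrMissing(0, 1, 0));   // 1
console.log(anyOfOrMissing(0, 0, 0));   // 0
console.log(anyOfOrMissing(1, NaN, 0)); // NaN
```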

Other Notes on Binary Variables

To most people, averages and percentages are quite different concepts, with averages applying to numeric data (e.g., number of pizzas eaten in a week) and percentages relating to categories (e.g., favorite brand of pizza). From a computational perspective, averages and proportions are very closely related, and this interrelationship can be exploited in Q to save time. If you have used SPSS, you probably already understand the basic principles demonstrated in this section; if not, they may seem a bit strange at first.

If you construct binary variables by recoding or constructing numeric values to only take values of 0, 1 and NaN, any computed averages will also be proportions. If, for example, you have a sample with 56% males, and you recode the gender variable so that males have a value of 1 and females 0, and convert its Question Type to either Number or Number – Multi, the average will be 0.56. The main benefit of binary variable “maths” is that while a variable that Q knows is binary will always have a NET, a numeric variable instead has a SUM. If, for example, the question was measuring brands that the consumer would consider buying, the SUM would then measure the consideration set size (whereas with a traditional binary variable, the NET would indicate the proportion of the sample that considers one or more brands).
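The arithmetic can be sketched in plain JavaScript. The data below are invented for illustration, the helper meanIgnoringMissing is hypothetical, and skipping NaN values when averaging is an assumption about how missing values are handled.

```javascript
// Mean of a 0/1 variable, skipping missing values (NaN).
function meanIgnoringMissing(values) {
  const valid = values.filter((v) => !Number.isNaN(v));
  return valid.reduce((total, v) => total + v, 0) / valid.length;
}

// Illustrative gender variable: 1 = male, 0 = female, NaN = missing.
const male = [1, 0, 1, 1, NaN, 0];
console.log(meanIgnoringMissing(male)); // 0.6, i.e. 60% of valid cases are male

// Illustrative consideration data: one row per respondent, one 0/1 column per brand.
const considered = [
  [1, 0, 1],
  [0, 0, 0],
  [1, 1, 1],
];

// SUM per respondent: the size of the consideration set.
const setSize = considered.map((row) => row.reduce((t, v) => t + v, 0));

// NET per respondent: 1 if at least one brand is considered, otherwise 0.
const net = considered.map((row) => (row.some((v) => v === 1) ? 1 : 0));

console.log(setSize); // [2, 0, 3]
console.log(net);     // [1, 0, 1]
```

Averaging the per-respondent NET values gives the proportion of the sample considering one or more brands, while averaging the SUM values gives the mean consideration set size.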

Why does this work? It is because any binary variable, by definition, is implicitly also a Numeric Variable. That is, Numeric Variables are variables that can take any value, and binary variables take values of 0 and 1 and are thus Numeric Variables.

Next

How to Create New Variables

How To Create New Variables With Multiple Categories
