## Introduction

Binary variables are variables that take only two values, such as Male or Female, True or False, and Yes or No. While many variables and questions are naturally binary, it is often useful to construct binary variables from other types of data, for example by turning age into two groups: less than 35, and 35 or more. Constructing binary variables is also known as *quantizing* and *dichotomizing*.
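As a minimal sketch of the age example, the plain JavaScript below splits an age into the two groups. The function name `dichotomizeAge`, its `cutoff` parameter and the sample ages are hypothetical and chosen only for illustration; missing ages are assumed to be represented as NaN.

```javascript
// Hypothetical helper: split a numeric age into two groups.
// Returns 1 for ages at or above the cut-off, 0 for ages below it,
// and NaN when the age is missing (assumed here to be NaN or non-numeric).
function dichotomizeAge(age, cutoff = 35) {
  if (typeof age !== "number" || Number.isNaN(age)) return NaN;
  return age >= cutoff ? 1 : 0;
}

// Illustrative data only.
const ages = [22, 35, 61, NaN];
const groups = ages.map((a) => dichotomizeAge(a));
console.log(groups); // 0, 1, 1, NaN
```

The cut-off is inclusive on the upper group, matching "35 or more" in the example above.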

## Requirements

Variables or values you want to turn into two groups.

## Methods

Any of the ways for creating numeric variables can be used to create binary variables. However, the main approaches are:

### Creating a Binary - Complicated filter

- Right-click on the **Variables and Questions** tab
- Select **Insert Variable(s) > Binary - Complicated Filter**.

### Creating filters

- Any Filters that are created (from a cell in a table, for example) are binary variables. See: How to Create Filters

### Creating a binary question by changing Question Type

- When you use Set Question to create a Pick Any or Pick Any – Grid question, each of the variables in the question becomes binary. See: How to Combine Variables into a Single Question

### Recoding

- Binary variables can be constructed by editing the **Values** in the Value Attributes so that they all take only two values (generally, 0 and 1 are most appropriate).

- Banner questions are intrinsically binary (i.e., they have the Question Type of Pick Any). See: How to Create a Banner

### Custom calculated variables using JavaScript

*Simple syntax*

The following code computes a value of 1 if the observation has a value of 1 for any of the three input variables (d1, d2 and d3), otherwise returns a 0:

```
if (d1 == 1 || d2 == 1 || d3 == 1) 1;
else 0
```

This code computes a value of 1 if the observation has given a value of 1 for all of the input variables (d1, d2 and d3), otherwise returns a 0:

```
if (d1 == 1 && d2 == 1 && d3 == 1) 1;
else 0
```

Note that `||` in the first example denotes the "OR" condition, whereas `&&` here denotes the "AND" condition.

*Shorter syntax*

Those new to JavaScript may be a little surprised by these expressions, as it can seem more natural not to repeat the symbols (e.g., writing `d1 = 1 & d2 = 1` instead of `d1 == 1 && d2 == 1`). There are good-but-technical reasons why JavaScript does not work this way (e.g., a single `=` assigns a value to a variable rather than testing equality). However, there are a number of ways to make the code much shorter if that is desired. There is no need for an *if statement*, as any logical expression in JavaScript evaluates to true or false, which is treated as a 1 or 0, and thus we can write:

`d1 == 1 && d2 == 1 && d3 == 1`

If the statement is true, a 1 will be returned and if false a 0 will be returned.

Additionally, if we know that the input variables are themselves binary, only taking values of 1 and 0 (and with no missing values), we could write:

`d1 && d2 && d3`

or

`d1 * d2 * d3`
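The equivalence of these forms can be checked outside Q with a short sketch in plain JavaScript. The function names are hypothetical, and the `Number(...)` wrapper is an assumption added so that the comparison form yields 1 or 0 rather than true or false when run on its own.

```javascript
// Three ways of writing "all inputs equal 1", valid only when
// every input is strictly 0 or 1 (no missing values).
const viaComparison = (d1, d2, d3) => Number(d1 == 1 && d2 == 1 && d3 == 1);
const viaLogicalAnd = (d1, d2, d3) => d1 && d2 && d3;
const viaProduct = (d1, d2, d3) => d1 * d2 * d3;

// Compare the three forms over every combination of 0/1 inputs.
let allAgree = true;
for (const d1 of [0, 1]) {
  for (const d2 of [0, 1]) {
    for (const d3 of [0, 1]) {
      const expected = viaComparison(d1, d2, d3);
      if (viaLogicalAnd(d1, d2, d3) !== expected ||
          viaProduct(d1, d2, d3) !== expected) {
        allAgree = false;
      }
    }
  }
}
console.log(allAgree); // true
```

The forms stop agreeing once an input can be NaN or a value other than 0 and 1, which is why the shorter versions are only safe for clean binary inputs.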

*Automation*

If there is a need to create many related binary variables using JavaScript, this can be done using either QScript or the **Use as a Template for Replication** script.

## Missing values in binary variables

In theory, binary variables should only have two values. In practice, it is often useful if they can also have missing values, in which case the Binary - Complicated Filter and the methods based around creating Filters tend not to be useful.
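Where missing values need to be kept, a calculated variable can pass them through explicitly. The sketch below is one possible rule rather than a prescribed one: it assumes missing values appear as NaN, and it treats the result as missing whenever any input is missing. The function name `anyWithMissing` is hypothetical.

```javascript
// Hypothetical helper: 1 if any input equals 1, 0 if none do,
// and NaN (missing) if any input is missing.
// Assumption: missing values are represented as NaN.
function anyWithMissing(...inputs) {
  if (inputs.some((v) => Number.isNaN(v))) return NaN;
  return inputs.some((v) => v === 1) ? 1 : 0;
}

console.log(anyWithMissing(0, 1, 0));   // 1
console.log(anyWithMissing(0, 0, 0));   // 0
console.log(anyWithMissing(1, NaN, 0)); // NaN
```

A different rule, such as returning 1 as soon as any observed input is 1, may suit some analyses better; the choice depends on how missing data should be interpreted.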

## Other Notes on Binary Variables

To most people, averages and percentages are quite different concepts, with averages applying to numeric data (e.g., number of pizzas eaten in a week) and percentages relating to categories (e.g., favorite brand of pizza). From a computational perspective, averages and proportions are very closely related, and this interrelationship can be exploited in Q to save time. If you have used SPSS, it is likely that you already understand the basic principles demonstrated in this section; if not, it may seem a bit strange at first.

If you construct binary variables by recoding or constructing numeric values to only take values of 0, 1 and NaN, any computed averages will also be proportions. If, for example, you have a sample with 56% males, and you recode the gender variable so that males have a value of 1 and females 0, and convert its Question Type to either Number or Number – Multi, the average will be 0.56. The main benefit of binary variable “maths” is that while a variable that Q knows is binary will always have a NET, a numeric variable instead has a SUM. If, for example, the question was measuring brands that the consumer would consider buying, the SUM would then measure the consideration set size (whereas with a traditional binary variable, the NET would indicate the proportion of the sample that considers one or more brands).
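The arithmetic can be illustrated with a small sketch in plain JavaScript, run outside Q. The data and the names `mean`, `male`, `considered`, `setSizes` and `net` are made up for illustration, and the NET is approximated here as "at least one brand selected".

```javascript
// Mean of a numeric variable, ignoring missing values (NaN).
function mean(values) {
  const valid = values.filter((v) => !Number.isNaN(v));
  return valid.reduce((total, v) => total + v, 0) / valid.length;
}

// Illustrative data: 1 = male, 0 = female, NaN = missing.
const male = [1, 0, 1, 1, 0, NaN, 1, 0, 1, 0, 1];
console.log(mean(male)); // 0.6, the proportion of males among valid cases

// Illustrative data: one row per respondent, one 0/1 column per brand considered.
const considered = [
  [1, 0, 1],
  [0, 0, 0],
  [1, 1, 1],
];

// SUM per respondent: the consideration set size.
const setSizes = considered.map((row) => row.reduce((t, v) => t + v, 0));
// NET per respondent: 1 if at least one brand is considered, otherwise 0.
const net = considered.map((row) => (row.some((v) => v === 1) ? 1 : 0));

console.log(setSizes);       // 2, 0, 3
console.log(net);            // 1, 0, 1
console.log(mean(setSizes)); // average consideration set size
console.log(mean(net));      // proportion considering one or more brands
```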

Why does this work? It is because any binary variable, by definition, is implicitly also a Numeric Variable. That is, Numeric Variables are variables that can take any value, and binary variables take values of 0 and 1 and are thus Numeric Variables.

## Next

How To Create New Variables With Multiple Categories