Introduction
Knowing the type of data you are working with in R is useful because certain functions require specific data types for their inputs and outputs. For example, you can't perform mathematical operations on numbers that are stored with a character data type. Below is a list of the various data types and examples of what you can do with each.
Note, a few functions are used throughout:
- head() shows just the top part of the data.
- cbind() combines data by columns.
- rbind() combines data by rows.
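As a quick illustration of these three helper functions, here is a minimal sketch in plain R. The vectors x and y are hypothetical example data, not variables from a Q project.

```r
# Hypothetical example data: two numeric vectors of equal length
x <- 1:10
y <- (1:10) * 2

top <- head(x)          # first six elements by default: 1 2 3 4 5 6
by_col <- cbind(x, y)   # 10 rows, 2 columns named x and y
by_row <- rbind(x, y)   # 2 rows, 10 columns

dim(by_col)             # 10 2
dim(by_row)             # 2 10
```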
Method
1. Logical
A logical data type will always return TRUE or FALSE. This can be created by a condition (logical test) and some functions. In Q, logical variables can also be used as binary filters.
The following are examples that return a logical result:
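A minimal sketch of such expressions in plain R is shown here. The variable age and its values are hypothetical and stand in for a variable from your own data.

```r
age <- c(25, 40, 17)              # hypothetical data

adults <- age >= 18               # a condition: TRUE TRUE FALSE
missing <- is.na(age)             # a function returning logicals: FALSE FALSE FALSE
found <- "a" %in% c("a", "b")     # TRUE

class(adults)                     # "logical"
```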
When the final result is a list of T/F results, Displayr shows them as Xs (false) and check marks (true).
You can use logical results to build a series of if ... else statements.
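One way this might look is sketched below. The names score, grade, scores and grades, and the cut-off values, are hypothetical. The second part uses ifelse(), which applies the same kind of test to every element of a vector.

```r
score <- 72  # hypothetical single value

# Each condition returns TRUE or FALSE; the first TRUE branch is run
if (score >= 80) {
  grade <- "High"
} else if (score >= 50) {
  grade <- "Medium"
} else {
  grade <- "Low"
}
grade  # "Medium"

# Vectorized version for a whole variable
scores <- c(85, 72, 30)
grades <- ifelse(scores >= 80, "High",
                 ifelse(scores >= 50, "Medium", "Low"))
grades  # "High" "Medium" "Low"
```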
You can use conditions to subset data in R. When you put a condition inside square brackets (as described in How to Work with Data in R), the TRUE/FALSE (T/F) results are used to select the data for which the condition is TRUE. For example, take a vector of data called a that is equal to 1, 2, 3. In the expression a[a > 2], the condition a > 2 returns F, F, T inside the brackets, which selects the number 3 in a. Assigning 10 to that selection changes the 3 to a 10, which is then displayed in the final result.
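The subsetting example described here can be reconstructed as the following short sketch. It is a reconstruction from the description, not a copy of the original code.

```r
# Create a vector of data called a
a <- c(1, 2, 3)
# a > 2 returns FALSE FALSE TRUE, so only the 3 is selected and replaced
a[a > 2] <- 10
a  # 1 2 10
```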
2. Numeric
A numeric data type is a number that is not enclosed in quotation marks ("") when Show raw R output is checked:
Here, the first 3 numbers are numeric, while the last 3 are text. Note that numeric calculations will only work on data with a numeric data type. In Q, only numeric and binary data will allow you to adjust decimal places.
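To make the difference concrete, here is a small sketch with hypothetical vectors nums and txt. It shows that a calculation works on the numeric values but needs a conversion with as.numeric() before it works on the text values.

```r
nums <- c(1, 2, 3)          # numeric
txt  <- c("1", "2", "3")    # character: note the quotation marks

is.numeric(nums)            # TRUE
is.numeric(txt)             # FALSE

total <- sum(nums)          # 6
# sum(txt) would stop with an error because txt is character

converted <- as.numeric(txt)
sum(converted)              # 6
```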
3. Character
A character is text, also known as a string. It is always enclosed in quotation marks ("") when Show raw R output is checked. When Q sees "z", it treats it as text. However, writing z without quotation marks in your code tells Q that you are referencing an R object called z.
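The distinction between the text "z" and an object called z can be seen in this minimal sketch. The object z and its value are hypothetical.

```r
z <- 5                        # an R object named z

quoted_class <- class("z")    # "character": the text z
object_class <- class(z)      # "numeric": the object z, which holds 5

combined <- paste("z", z)     # "z 5": the text followed by the object's value
```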
4. Date
A date can be either a Date or Date/Time object.
as.Date("2021-05-20") # Date
as.POSIXct("2021-05-20") # Date/time
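As a rough sketch of how the two classes behave in plain R, the example below uses hypothetical objects d and dt. The tz = "UTC" argument is an assumption made here so that the result does not depend on the computer's time zone.

```r
d  <- as.Date("2021-05-20")
dt <- as.POSIXct("2021-05-20 14:30:00", tz = "UTC")

class(d)                  # "Date"
class(dt)                 # "POSIXct" "POSIXt"

next_week <- d + 7        # Date arithmetic is in days: "2021-05-27"
date_only <- as.Date(dt)  # drops the time of day: "2021-05-20"
format(d, "%Y-%m")        # "2021-05"
```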
Date/Time variables in Q are POSIXct, so they can store either a date or a date and time. R will pull in the raw dates, but you can view aggregated dates by using the following code:
attr(yourdate,"QDate")
5. Factor
A factor in R is equivalent to a category in data, and corresponds to Nominal or Ordinal variables in Q. A factor contains both a value and a label. The labels are called levels, which can be referenced using levels(x). Levels can also be viewed in a raw output:
The below examples produce the data as labels and values separately:
as.character(Gender) # label
as.numeric(Gender) # value
Note, if your data is ordinal, it will appear as an ordered factor.
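The following sketch builds small factors by hand to show levels, labels and values. Gender and Rating here are hypothetical stand-ins for variables that would normally come from your data set. In plain R, as.numeric() returns the position of each label within the levels.

```r
# Hypothetical nominal variable
Gender <- factor(c("Male", "Female", "Female"), levels = c("Male", "Female"))

levels(Gender)          # "Male" "Female"
as.character(Gender)    # labels: "Male" "Female" "Female"
as.numeric(Gender)      # values: 1 2 2

# Hypothetical ordinal variable, stored as an ordered factor
Rating <- factor(c("Low", "High", "Medium"),
                 levels = c("Low", "Medium", "High"), ordered = TRUE)
is.ordered(Rating)      # TRUE
is.ordered(Gender)      # FALSE
```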
Particulars of using factors in your code:
- When using factors in code you can treat them as character variables:
- However, when using factors in certain functions like cbind and rbind, you'll need to convert them to character first:
- Sorting and ordering will use the sequence of the levels, not the labels:
- To access the underlying coded values, use the attr() function:
- Certain mathematical functions like max can only be run using Ordinal variables because you can't take a max of unordered categories (like male, female).
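Several of these points can be illustrated with a short sketch in plain R. Gender and Rating are hypothetical factors created by hand. The attr() approach is left out because the attribute name is not given here.

```r
Gender <- factor(c("Male", "Female", "Female"), levels = c("Male", "Female"))
Rating <- factor(c("Low", "High", "Medium"),
                 levels = c("Low", "Medium", "High"), ordered = TRUE)

# Factors can be compared with text as if they were character variables
is_female <- Gender == "Female"               # FALSE TRUE TRUE

# cbind() on factors keeps only the integer codes, so convert to character first
codes  <- cbind(Gender, Gender)               # integer matrix of 1s and 2s
labels <- cbind(as.character(Gender), as.character(Rating))

# Sorting follows the order of the levels (Male before Female), not the alphabet
sorted <- sort(Gender)                        # Male Female Female

# max() works on an ordered factor only
highest <- max(Rating)                        # High
# max(Gender) would stop with an error because Gender is unordered
```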
Get a copy of the examples above in your account by clicking HERE.
See Also
How R Works Differently in Q Compared to Other Programs
How to Reference Different Items in Your Project in R
How to Work with Conditional R Formulas
How to Extract Information from an R Object
How to Extract Data from a Multiple Column Table with Nested Data