How to approach using R to add a data set
When writing the R code for an R Data Set, you cannot refer to other outputs or variables that exist in your Q Project. This differs from R Outputs and R Variables, where the underlying technology allows you to refer to variables and outputs throughout your project.
As a result, the R code that you use to create a data set needs to both bring in a source of data and process it into the rows and columns that you want to appear as your cases and variables in Q.
The data source can be:
- A file on your local machine. You can use functions like read.spss() or read.csv(). In the file path, you must use either double backslashes (\\) or single forward slashes (/). For example:
    location = "C:\\Users\\Chris\\Desktop\\Cola Tracking - January to September 2017.sav"
    library(foreign)
    datafile = read.spss(location, use.value.labels = FALSE, to.data.frame = TRUE)
- A file located at a URL, again using read.csv(), read.spss(), or similar. For example:
    location = "https://wiki.q-researchsoftware.com/images/3/35/Technology_2018.sav"
    library(foreign)
    datafile = read.spss(location, use.value.labels = FALSE, to.data.frame = TRUE)
- Data obtained using an API of some kind (e.g. Google Analytics, Twitter, or just about anything else enabled by R).
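The two requirements described above (bring in a source of data, then shape it into cases and variables) can be sketched in a few lines of R. This is a minimal illustration rather than a prescribed method: the file name, the column names, and the derived over40 variable are all hypothetical, and it assumes that Q uses the data frame the code produces, as in the examples above. The file is written to a temporary folder first only so that the sketch runs on any machine.

```r
# Self-contained sketch: write a small CSV to a temporary folder so the
# example runs anywhere. In practice this file would already exist.
location <- file.path(tempdir(), "example_survey.csv")  # hypothetical file name
write.csv(
  data.frame(id = 1:3, age = c(34, 51, 27), brand = c("Coke", "Pepsi", "Coke")),
  location,
  row.names = FALSE
)

# Step 1: bring in the source of data.
raw <- read.csv(location, stringsAsFactors = FALSE)

# Step 2: process it into the rows (cases) and columns (variables) wanted in Q.
datafile <- data.frame(
  id = raw$id,
  age = raw$age,
  over40 = as.integer(raw$age > 40),  # hypothetical derived variable
  brand = factor(raw$brand)
)
```

Note that file.path() builds the path with forward slashes, which avoids the need to escape backslashes by hand.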
How to use R to add data sets
- Select File > Data Sets > Add to Project > From R.
- Enter the code for creating the Data Set into the R CODE box. For example, an SPSS data file can be read in using the R function read.spss from the foreign package.
- Press F5. This will generate a preview of the Data Set's contents.
- Enter a Name (at the bottom of the screen).
- Press Add Data Set. Q will then take you through the normal process for setting up data, as if you had imported a data file.
When to use R Data Sets
In general, R Data Sets should not be used for importing data files that can be imported into Q via File > Data Sets > Add to Project > From File, as doing so will bypass a host of tools designed to check and correct problems in data.
The main use cases for R Data Sets are:
- Importing data from unusual sources (e.g., data obtained by web scraping).
- Manipulating data prior to creating a data file (e.g., merging lots of data files).
- Having data sets that automatically update in your project.
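As an illustration of the second use case, merging many data files before they reach Q might look something like the sketch below. It assumes that every file is a CSV with the same columns. The folder name, file names, and the source_file column are hypothetical, and the three small files are created in a temporary folder only so that the sketch runs on any machine.

```r
# Self-contained sketch: create three small monthly files in a temporary
# folder so the example runs anywhere. In practice, point to a real folder.
folder <- file.path(tempdir(), "monthly_waves")  # hypothetical folder name
dir.create(folder, showWarnings = FALSE)
for (month in c("jan", "feb", "mar")) {
  wave <- data.frame(id = 1:2, score = c(7, 9))
  write.csv(wave, file.path(folder, paste0(month, ".csv")), row.names = FALSE)
}

# Find every CSV file in the folder and read each one into a data frame,
# recording which file each row came from.
files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
waves <- lapply(files, function(f) {
  d <- read.csv(f, stringsAsFactors = FALSE)
  d$source_file <- basename(f)
  d
})

# Stack the files into one data frame (this assumes identical columns).
datafile <- do.call(rbind, waves)
```

If the files do not share the same columns, rbind() will stop with an error, so in that situation the columns would need to be aligned before stacking.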