This article outlines the various methods of accessing and using R in Q. Analyses can be conducted in Q using the R Language. While we have attempted to make Q and R feel like one and the same, they are in reality completely different programs that "talk" to each other; see How R Works Differently in Q Compared to Other Programs.
Creating and Modifying
There are a number of ways to access R from within Q:
- Entering R code directly into an R Output.
- Creating an R Variable.
- Creating a new Data Set using R.
- Accessing pre-written R code using menus and forms. This is how most advanced analyses are conducted in Q (e.g., regression, principal components analysis). This is referred to as Standard R.
- Including R code in QScripts. QScript is Q's automation language.
Automatic updating - Any R code that is created via R Outputs, R Variables, Standard R, or QScripts is set to automatically update when the inputs change (e.g., if the input data changes, if a new data file is created, a variable is recoded or changed, or if other options are changed).
1. R Outputs
An R Output is an output in a Project, created as follows:
- In the Report tree, right click and select Add R Output.
- In the Object Inspector, enter code written in the R Language under Properties > R CODE. This code can be used to create tables, visualizations or any other R Outputs.
- Press the Calculate button. This sends the instructions over a secure internet connection to a computer in the cloud. The result is sent back to Q, and shown on your screen (i.e., the result is the R Output). The output will typically be a table, chart, text string, or an error message.
- If you would like to see the terminal view of the R CODE you are running, check the Properties > OUTPUT > Show raw R output checkbox. This can be very useful when troubleshooting your code; see How to Troubleshoot R Code.
See How to Reference Different Items in Your Project in R for more detail on how to use other data in your project in R. You can also Hide R Outputs so that they are not exported to the final Report.
In the example below, a histogram is created of 9 numbers. (If you are not familiar with the R Language, refer to Learning the R language.)
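A minimal sketch of the kind of code that could be entered under Properties > R CODE for such a histogram. The nine values are made up for illustration and are not taken from the original example:

```r
# Nine illustrative values (made up for this sketch)
x <- c(2, 3, 3, 4, 5, 5, 5, 6, 8)

# Draw a histogram of the values
hist(x)
```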
When multiple R Outputs are selected, a table displaying the status of the selected R Items will be shown:
R Items that require updating will be greyed out. There are two buttons above the table:
- Update all these items updates all the selected R Items, regardless of whether they require updating.
- Update grey items updates only the selected R Items that are greyed out.
2. R Variables
An R Variable is a constructed variable in the dataset. You can think of it as inserting one or more columns of calculations, as you would in Excel. You can create an R Variable as follows:
- On the Variables and Questions tab, right click on a variable and select Insert Variable(s) > R Variable.
- Under Properties > R CODE, enter code written in the R Language.
- Click the play button (or press F5) to run the code.
- Preview your results in the left pane, where the new variables are shaded blue and the input data used is shaded grey. If you provide variable or column names, these will be the Labels for the variables when they are created.
- Set the Variable Base Name. Where your code only creates a single variable, this will be the Name of that variable. Otherwise, the new variable names will be whatever you enter here, followed by an underscore and a number (e.g., dog_2).
- Set Question Name at the bottom of the window and click Add R Variable.
To troubleshoot any errors that pop up when creating a variable, copy the code into an R Output in your Report and run it there, so that you can access the raw R output and troubleshoot your code.
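As a rough sketch of R Variable code that creates more than one variable, the example below returns a data frame with named columns. The variable `age` and its values are hypothetical stand-ins defined here so the code runs on its own; in Q you would reference an existing variable instead of defining it:

```r
# Stand-in for a numeric variable from the data set (hypothetical name and values).
# In Q you would reference an existing variable by name instead of creating it here.
age <- c(23, 35, 47, 61, NA, 19)

# Two derived columns; the column names are what would become the Labels
result <- data.frame(
  "Age in decades" = age / 10,
  "Aged 40 or older" = as.numeric(age >= 40),
  check.names = FALSE  # keep the spaces in the column names
)

# The last expression is the value returned by the code
result
```

Because this code creates two variables, the Variable Base Name rule described above would apply: each new variable name would be the base name followed by an underscore and a number.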
3. R Data Sets
An R Data Set can be added to a project by doing the following:
- Select File > Data Sets > Add to Project > From R.
- Paste in R code to read in and reformat (if necessary) your data. See R Data Sets for more information.
- Click the play button (or press F5) to run the code and preview your dataset on the left pane.
- Give the data set a Name.
- Click Add Dataset.
To troubleshoot any errors that pop up when creating a data set, copy the code into an R Output in your Report and run it there, so that you can access the raw R output and troubleshoot your code.
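A minimal sketch of code that reads in and reformats data. The inline CSV text and all column names are hypothetical and stand in for a real file or URL. The sketch also assumes that the data frame returned by the last expression is what becomes the data set:

```r
# Inline CSV text stands in for a real file or URL (hypothetical data)
csv_text <- "id,q1_a,q1_b
1,5,3
2,4,NA
3,2,1"

# Read the data into a data frame
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Reformat: give the columns clearer names and add a derived column
dat <- data.frame(
  id = raw$id,
  rating_a = raw$q1_a,
  rating_b = raw$q1_b,
  rating_mean = rowMeans(raw[, c("q1_a", "q1_b")], na.rm = TRUE)
)

# Assumption: the data frame returned here is used as the data set
dat
```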
4. Standard R
It is possible to do just about any form of data analysis using R by writing code. Where we think analyses are likely to be used by many of our users, we have made them available via a graphical user interface (i.e., menus, buttons, and the like, without needing to write code). We refer to the analyses that we have made available in this way as Standard R. The R logo is used to mark menu items that use Standard R. See Standard R for more information about how Standard R items work and are created.
Note that R code in a Standard R object can be further edited via Properties > R CODE. For example, the following pie chart was created by clicking Visualization > Pie > Pie, but you can view and modify the R code that creates it in Properties > R CODE. Note that Standard R references fields on the Inputs and Chart tabs, so if you overwrite those settings in the code, those fields will not be taken into account.
Updating
R code is automatically re-run whenever:
- Data or outputs that are inputs into R calculations are changed (unless Properties > R CODE > Automatic is un-checked in the Object Inspector). Examples include: a question used was recoded or the Data Set is updated.
- The R code contains instructions for updating at a specified frequency, per How to Automatically Update Calculations, Variables and Data Sets Using R, or you insert Anything > Report > Automatic Updating per How to Automatically Update a Page.
R code is part of a document's dependency graph of modifications/calculations, which can impact speed on large documents.
Helpful highlighting
When you create R code in Q the code is colored and highlighted in a way to make it easier to see what items are being referenced.
- Tables and other R Outputs will highlight in blue.
- Variables will highlight in yellow.
Hovering over the name of an input will show you a preview of the data. When referencing other variables, any categories that have been merged in the related table will flow through to R, and any that have been removed will be excluded. By clicking the name in the preview, Q will then take you to the object, for example to a Page where a table exists or to the Data Sets tree where the variable exists.
- Strings will be colored red.
- Some functions and other constructs will be colored blue. You can also hover over an R function to see the relevant Help documentation.
- Comments - which are used to help document code and are not processed by R - are colored green.
Tips for writing R code
- R is case-sensitive so make sure you have matched the case in your referencing.
- If adding data using R or creating an R variable, try the code out in an R Output first to make sure it returns what you expect. Then copy it into your R variable.
- Use # in your R code to add comments so it's easy to see what each section of code does. This will help you remember the intent of the code, and will help the Q Support team and others in your organization understand it if any collaboration is needed:
    #this is a comment about the following code
    x = 1 #this is also a comment about this line of code
- In many instances it's useful to simply look at some of your variables/tables in the code to spot unexpected inputs or results. To do so, hover over the highlighted name in your code. For instance, the code below won't return the number of people who prefer Diet Coke (coded as 1 in the data) because the data in R is the value label and not the underlying numeric value:
- When working with R Outputs, check Properties > OUTPUT > Show raw R output to see exactly which line of your code is causing an error. For example, which line of the code is actually returning the error below?
  After checking Show raw R output, you'll see that the 4th line is returning the error because the syntax within the brackets is incorrect (the [,1] is not needed in this instance because totalcoke is only one column).
- Use Show raw R output to check variables and results while you are writing code, to be sure your code is doing what you expect. For example, the code below calculates the ratio of Coke drunk per week to the maximum response, but the result is showing up as missing.
  By adding a line to view the maximum, you'll see it is coming up as NA because we have forgotten to add na.rm=T to our max() function.
- Many errors and warnings sound generic and abstract, but you can google them to see examples from other people who have received the same error, which may tip you off to the issue with your code. For example, the "incorrect number of dimensions" error below:
  when googled, suggests that the indexing in the code doesn't fit the structure of the data:
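The three pitfalls described in the tips above can be reproduced in plain R. This is only a sketch: `preferred`, `coke_per_week`, and the data values are hypothetical stand-ins, and `totalcoke` is defined here as a plain vector rather than referenced from a project:

```r
# 1. Labels versus values: the data seen by R holds the value labels
preferred <- factor(c("Diet Coke", "Coke", "Diet Coke", "Pepsi"))
sum(preferred == 1)            # counts nothing, because no label equals "1"
sum(preferred == "Diet Coke")  # compares against the label instead

# 2. Missing values: max() returns NA unless na.rm = TRUE is supplied
coke_per_week <- c(3, 0, NA, 7)
max(coke_per_week)             # NA, because one response is missing
ratio <- coke_per_week / max(coke_per_week, na.rm = TRUE)

# 3. Dimensions: a single column is a vector, so [, 1] is invalid
totalcoke <- c(3, 0, 7)
msg <- tryCatch(totalcoke[, 1], error = function(e) conditionMessage(e))
msg                            # the error message that was caught
totalcoke[1]                   # a single subscript works on a vector
```

Wrapping the failing line in `tryCatch()` is used here only so the sketch runs to completion; in an R Output, the error would simply be reported against that line.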
R Limitations
- When referencing default tables, R can only pull in statistics from Inputs > STATISTICS > Cells. The Right and Below statistics are not recognized, but you can add the equivalent as Cells statistics in order to reference these.
- When referencing default tables, R is unable to bring in significance results from the Arrows and Font colors setting. For a workaround for this see How to Add Statistical Significance to a Table Using the Formattable R Package and How to Add Statistical Significance to AutoFit Tables.
- When referencing default tables that include nested span labels, such as banner tables, R will by default only bring in the bottom-most labels. See How to Work with Nested Banners and Spans in R Tables for more information.
- When importing R data sets, R is unable to bring in variable labels as R does not have the concept of both Name and Label. This means each variable will be labeled using the variable name.
- R Outputs and R data sets are by default restricted to 67MB.
- R code is allowed to create temporary files, though they will be automatically deleted when it finishes. Temporary files cannot exceed 500MB.
- There are certain function names used by R that you should not use as names for your variables, calculations, and data sets. Using one may conflict with some of our automations and cause unintended results. These names include: list, length, exists, assign, names, rep, any, stop, min, max, matrix, attr, sum.
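One way to screen a set of planned names against the list in the last limitation is sketched below. `find_clashes` is a hypothetical helper written for this illustration, not a function provided by Q or R, and the candidate names are made up:

```r
# Names listed in the limitation as ones to avoid
reserved_names <- c("list", "length", "exists", "assign", "names", "rep",
                    "any", "stop", "min", "max", "matrix", "attr", "sum")

# Hypothetical helper: returns the candidate names that appear in the list
find_clashes <- function(candidates) {
  candidates[candidates %in% reserved_names]
}

# Check some planned names before creating variables or calculations
find_clashes(c("sales", "max", "region", "sum"))
```

The comparison is an exact, case-sensitive match against the names given in the article. The article says the names "include" these, so the list may not be exhaustive.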
R vs QScript
QScript is Q's macro language, which is used for automation. Many of the menu items in Q are written in QScript. Users can write their own automations using QScript. The key distinctions between QScript and R are:
- QScript can be used for manipulating the user interface (e.g., creating dialog boxes). R cannot.
- QScript can be used to automatically create and modify charts, tables, variables, and questions. By contrast, if you wish to create an R Variable, R Output, or R Data Set, you need to either create it manually from the menus or create it via QScript. For an example, see Regression - Diagnostic - Prediction-Accuracy Table. Note that within R Outputs you still have all the R functions for creating R data types, such as variables, vectors, and data frames. The distinction being discussed here relates to the ability to control data as shown in the Variables and Questions tab (i.e., a Data Set).
- QScript is generally faster than R (e.g., it is better to create lots of variables in QScript than R).
- It is much easier and faster for users to write R code than QScript. R is specifically designed for data analysis, whereas JavaScript, the language that QScript is written in, is designed to be used for many different applications. As a consequence, it can be quite unwieldy for data analysis (i.e., to use JavaScript you need more advanced coding skills and will generally need to write many more lines of code than if trying to achieve the same thing in R).
Next
How to Use Different Types of Data in R
How to Reference Different Items in Your Project in R
How to Work with Conditional R Formulas
How to Add a Custom R Output to your Report
How to Create a Custom R Variable