This article describes useful methods for troubleshooting custom R code.
- An R variable, R output or data set.
- A basic understanding of how to use R in Q; see How to Learn R.
- R is case-sensitive, so make sure the case of every name you reference matches exactly.
- If adding data using R or creating an R variable, try the code out in an R Output first to make sure it returns what you expect. Then copy it into your R variable.
- Use # in your R code to add comments so it's easy to see what each section of code does. This will help you remember the intent of the code, and will help the Q Support team and others in your organization understand it if any collaboration is needed:
#this is a comment about the following code
x = 1 #this is also a comment about this line of code
- In many instances it's useful to simply look at some of your variables/tables in the code to spot unexpected inputs or results. To do so, hover over the highlighted name in your code. For instance, the code below won't return the number of people who prefer Diet Coke (coded as 1 in the data) because the data in R is the value label and not the underlying numeric value:
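As a minimal sketch of this label-versus-value problem, the block below uses a small hypothetical factor called cola as a stand-in for the survey variable. The name and the data are assumptions for illustration and are not taken from the original example.

```r
# Hypothetical stand-in for a "Preferred cola" variable whose values are
# stored as labels (a factor), as described above.
cola <- factor(c("Diet Coke", "Coca-Cola", "Diet Coke", "Pepsi"))

# Comparing against the numeric code matches nothing, because R compares
# the labels (as text) with "1".
n_by_code <- sum(cola == 1)

# Comparing against the label counts the respondents as intended.
n_by_label <- sum(cola == "Diet Coke")

print(n_by_code)   # 0
print(n_by_label)  # 2
```

If the first count is unexpectedly 0, that is usually a hint that the variable holds labels rather than numbers.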
- When working with R Outputs, check Properties > OUTPUT > Show raw R output to see exactly which line of your code is causing the error. For example, which line of the code is actually returning the error below?
After checking Show raw R output, you'll see that the 4th line is returning the error because we have the incorrect syntax within the brackets (the [,1] is not needed in this instance because totalcoke is only one column).
- Use Show raw R output to check variables and results while you are writing code, to be sure your code is doing what you expect. For example, the code below calculates the ratio of Coke drunk per week to the max response, but the result is showing up as missing.
By adding in a line to view what the max is, you'll see it is coming up as NA because we have forgotten to add in na.rm=T to our max() function.
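The missing-value behaviour described here can be sketched as follows. The vector coke_per_week and its values are hypothetical; the sketch uses na.rm = TRUE, which is equivalent to na.rm=T but does not depend on T keeping its default value.

```r
# Hypothetical weekly Coke consumption with one missing response.
coke_per_week <- c(3, 7, NA, 2, 10)

# Inspect the intermediate value first: max() returns NA if any input is NA.
print(max(coke_per_week))  # NA

# Dropping missing values gives a usable maximum.
max_coke <- max(coke_per_week, na.rm = TRUE)
print(max_coke)            # 10

# The ratio is now missing only for the respondent with no answer.
ratio <- coke_per_week / max_coke
print(ratio)
```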
- Many errors and warnings sound generic and abstract, but you can google them to see examples from other people who have received the same error, which may tip you off to the issue with your code. For example, the "incorrect number of dimensions" error below:
suggests, when you google it, that the indexing in your code doesn't fit the structure of your data:
2. Helpful Functions
There are a number of functions which can be used to help work out where an issue in your code exists.
In this example I have defined my table as x in my code and have ticked Show raw R output:
x = table.Preferred.cola
- head() - a quick way to inspect your table/variable in your code to make sure it's set up as you expect.
head(x) will display the top 6 rows by default and
head(x,10) will display the top 10.
- str() - describes the data and structure of the variable/calculation.
str(x) will return the structure of your data, including names, statistics and data types.
- class() - tells you what R class property your table/variable belongs to.
class(x) in this example will display matrix and array.
- length() or dim() - tells you the size of your variable.
length(x) in this case will return 18 (the total number of table cells) and
dim(x) will return 9 and 2 to represent the number of rows and columns respectively.
- setdiff() - shows you items in one list that are not in the other.
setdiff(rownames(x),c("Coca-Cola","Diet Coke","Coke Zero")) will compare the table row labels to these three Coke brands and return only the Pepsi and 'unengaged' categories.
- sum(condition) - see how many times a condition holds true using sum to make sure your condition/logical test is set up correctly.
sum(x[,"n"] > 20) will return the number of rows whose value in the n column is over 20.
- any(condition) or all(condition) - check to see if any or all of the items you're testing come back true.
any(x[,"n"] < 20) will tell you if any of your n rows are less than 20, and
all(x[,"n"] > 0) will tell you if all your n rows are greater than 0.
The functions above result in the following raw R output when added to code:
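To try these functions outside Q, the sketch below builds a hypothetical 9 x 2 matrix in place of table.Preferred.cola. The row labels other than the three Coke brands and all of the figures are invented for illustration, so only the shape of the output is comparable with the real table.

```r
# Hypothetical stand-in for table.Preferred.cola: 9 rows, columns "%" and "n".
rows <- c("Coca-Cola", "Diet Coke", "Coke Zero", "Pepsi", "Diet Pepsi",
          "Pepsi Max", "Dislike all cola", "Don't care", "NET")
n <- c(130, 35, 60, 20, 6, 45, 8, 23, 327)
pct <- round(100 * n / 327, 1)
x <- matrix(c(pct, n), ncol = 2, dimnames = list(rows, c("%", "n")))

print(head(x))      # first 6 rows
print(head(x, 10))  # up to 10 rows (all 9 here)
str(x)              # structure: type, dimensions and dimnames
print(class(x))     # "matrix" "array" (R 4.0 or later)
print(length(x))    # 18
print(dim(x))       # 9 2

# Row labels that are not one of the three Coke brands.
others <- setdiff(rownames(x), c("Coca-Cola", "Diet Coke", "Coke Zero"))
print(others)

# Count of rows with n over 20, then two logical checks.
rows_over_20 <- sum(x[, "n"] > 20)
print(rows_over_20)         # 6
print(any(x[, "n"] < 20))   # TRUE
print(all(x[, "n"] > 0))    # TRUE
```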
3. Common Errors
Incorrect number of dimensions / subscript out of bounds
The code below will produce an incorrect number of dimensions error due to the last line:
x = table.Gender
#return Female value
If we add dim(x) to the code, we will see that it returns only a 3, that is, 3 rows. There is no second value because there is no additional column dimension. The solution therefore is to change the last line so that it indexes x by row only.
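The following sketch reproduces this error on a hypothetical one-dimensional table. The array stands in for table.Gender, and the row labels, the figures and the exact indexing expressions are assumptions, since the original last line is not shown here.

```r
# Hypothetical stand-in for table.Gender: a one-dimensional table with 3 rows.
x <- array(c(48, 52, 100), dim = 3,
           dimnames = list(c("Male", "Female", "NET")))

print(dim(x))  # 3: a single dimension, so there is no column to index

# Indexing by row and column fails on a one-dimensional object.
# tryCatch() captures the message so the rest of the sketch still runs.
msg <- tryCatch(x["Female", 1], error = function(e) conditionMessage(e))
print(msg)     # "incorrect number of dimensions"

# Indexing by row only works.
female <- x["Female"]
print(female)
```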
Not equal to array extent / number of items to replace is not a multiple of replacement length
The code below will produce a length of 'dimnames' [1] not equal to array extent error due to the last line:
x = table.Living.arrangements
#new row names
newrows = c("Parents","Alone","Partner","Children","Partner + Children","Sharing","Other")
rownames(x) = newrows
If we compare the number of rows in x with length(newrows), we will see that one returns 8 and the other 7. We can only replace the row names with exactly the same number of items. The solution is to either add "NET" to the end of the newrows object or remove the NET row from the source table via right-click > Hide.
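A minimal sketch of the length check and of the first fix (appending "NET") is shown below. The matrix is a hypothetical stand-in for table.Living.arrangements with placeholder labels and invented figures, and nrow() is one assumed way of counting its rows.

```r
# Hypothetical stand-in for table.Living.arrangements: 7 categories plus NET.
x <- matrix(c(12, 9, 30, 5, 28, 10, 6, 100), ncol = 1,
            dimnames = list(paste("Category", 1:8), "%"))

newrows <- c("Parents", "Alone", "Partner", "Children",
             "Partner + Children", "Sharing", "Other")

print(nrow(x))          # 8
print(length(newrows))  # 7

# Only extend the labels when exactly one label (the NET row) is missing.
if (nrow(x) == length(newrows) + 1) {
  rownames(x) <- c(newrows, "NET")
}
print(rownames(x))
```

Checking the two lengths before assigning makes the mismatch visible without having to interpret the error message.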
Missing / duplicated rows or columns
The code below attempts to merge these two tables together but doesn't return the correct result:
x = table.Preferred.cola.2
y = table.Q4.Brand.Attitude.T2B
If we add head(y), we can view the figures as they should align. However, this doesn't solve the immediate problem. We can also compare the row names of the two tables with setdiff() to see which labels don't match.
What we can see here is that the Pepsi label has a trailing space in one table which causes the merged table not to align properly. The solution is to right-click > Rename this label in the preference table and remove the space.
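The label comparison can be sketched as below, using two small hypothetical matrices in place of the real tables (labels and figures are invented). Trimming the labels in code with trimws() is an alternative to renaming the label in the table, not the method described above.

```r
# Hypothetical stand-ins for the two tables; note the trailing space in "Pepsi ".
x <- matrix(c(40, 25, 35), ncol = 1,
            dimnames = list(c("Coca-Cola", "Diet Coke", "Pepsi "), "Preferred"))
y <- matrix(c(70, 55, 60), ncol = 1,
            dimnames = list(c("Coca-Cola", "Diet Coke", "Pepsi"), "T2B"))

# Compare in both directions; brackets make stray whitespace visible.
only_in_x <- setdiff(rownames(x), rownames(y))
only_in_y <- setdiff(rownames(y), rownames(x))
print(sprintf("[%s]", only_in_x))  # "[Pepsi ]"
print(sprintf("[%s]", only_in_y))  # "[Pepsi]"

# Alternative fix in code: strip leading and trailing whitespace from labels.
rownames(x) <- trimws(rownames(x))
print(setdiff(rownames(x), rownames(y)))  # character(0)
```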