How to Merge Tables Using R

Introduction

This article describes how to merge two tables with differing dimensions into a combined R table. We include examples of merging by row and by column.

Requirements

An R Output.
Two tables to merge.

Method 1 - Merge by Rows

In this example, we have the following two tables:

We wish to merge the first table into the second as rows but only keep the common columns:

1. rbind

When you wish to combine only common row categories between tables and your data is in the same format, you can use the rbind function:

Prefer = table.Gender.by.Preferred.Cola
Attitudes = table.Brand.Attitude
rbind(Prefer[,1:6],Attitudes)

This function requires your tables to have the same number of columns and the same column names in the same order. This means we need to filter the preference table to only the first 6 rows. Note, you can merge more than two tables at a time.

2. Rbind

If, however, you wish to include all unique column categories, including Dislike all cola and Don't care, you can instead use our Rbind function from the flipTables package:

flipTables::Rbind(Prefer,Attitudes)

By default, this function will keep all non-matching columns, but you can prevent this by using the keep.all=F argument:

flipTables::Rbind(Prefer,Attitudes, keep.all=F)

This function matches on column name so does not require your tables to have the same number of columns or the same column names in the same order. Note, you can merge more than two tables at a time.

Method 2 - Merge by Columns

In this example, we have the following two tables:

We wish to merge the first table into the second as a column but only keep the common rows:

1. cbind

When you wish to combine only common row categories between tables and your data is in the same format, you can use the cbind function:

Prefer = table.Preferred.Cola
Attitudes = table.Brand.Attitude
cbind(Prefer,Attitudes)

This function requires your tables to have the same number of rows and the same row names in the same order. Note, you can merge more than two tables at a time.

2. Cbind

If, however, you wish to include all unique row categories, including Dislike all cola and Don't care, you can instead use our Cbind function from the flipTables package:

flipTables::Cbind(Prefer,Attitudes)

By default, this function will keep all non-matching rows, but you can prevent this by using the keep.all=F argument:

flipTables::Cbind(Prefer,Attitudes, keep.all=F)

This function matches on row name so does not require your tables to have the same number of rows or the same row names in the same order. Note, you can merge more than two tables at a time.

3. data.frame

If the data type between the two tables is different then you have the option of combining them as a data.frame:

data.frame(Prefer=Prefer[1:6],Attitudes,check.names=F)

This function requires your tables to have the same number of rows and will join the rows in order as they appear in the tables and does not force row names to match. This means we need to filter the preference table to only the first 6 rows and ensure the rows are consistent/as we'd like between tables.

The check.names=F argument is useful here as setting this to False will prevent R automatically replacing spaces in the column names with dots.

4. merge

A more flexible function is merge() which can deal with further scenarios. The equivalent code for our example is below:

#set tables as data.frames
Prefer = data.frame(Prefer=table.Preferred.Cola, check.names=F)
Attitudes = data.frame(table.Brand.Attitude, check.names=F)
#merge based on the row names
mytable = merge(Prefer, Attitudes, by=0)
#set first column as row names
rownames(mytable) = mytable[,1]
#return final result without the first column
mytable = mytable[,-1]

We begin by converting both tables to data.frames and setting the column name for the preference table (since it has only one column).
We then use the merge function to match by the common row names (the b=0 part of the code).
Finally, we update the row names with the brands that have been created in the first column by this function, and remove it from the final output. This is optional.

The merge function offers the options of matching by:

Row name: by=0
Specifying a common field: by="Brands"
Mapping different fields across the tables: by.x = "Brand", by.y = "Brands"
Mapping on multiple fields across the tables:
by.x = c("id","Brand"), by.y = c("ID","Brands")

And joining by:

Outer join - keep all rows in both tables: all=T
Left join - keep all rows in the first table and merge in matching from the second: all.x=T
Right join - keep all rows in the second table and merge in matching from the first: all.y=T

Note, you can only merge two tables at a time.

Articles in this section

Introduction

Requirements

Method 1 - Merge by Rows

1. rbind

2. Rbind

Method 2 - Merge by Columns

1. cbind

2. Cbind

3. data.frame

4. merge

See Also

Articles in this section

Introduction

Requirements

Method 1 - Merge by Rows

1. rbind

2. Rbind

Method 2 - Merge by Columns

1. cbind

2. Cbind

3. data.frame

4. merge

See Also

Related articles