data science tutorials and snippets prepared by tomis9
reshape2
is an R package that let’s you change the shape of any dataframe, i.e. to pivot and to “unpivot”.
Keep in mind that if your favourite R package for dataframes manipulation is data.table, functions dcast and melt are already in this package and work exactly the same as those in reshape2
.
In fact there are only two functions worth mentioning: dcast, which is equivalent to MS Excel pivot table, and melt, which does the opposite or unpivots a table.
An example dataframe:
d <- data.frame(
account_no = paste(rep(7, 5), 1:5, sep=""),
Jan = rnorm(5, 10, 1),
Feb = rnorm(5, 10, 2),
Mar = rnorm(5, 10, 3)
)
print(d)
## account_no Jan Feb Mar
## 1 71 8.849874 9.052444 5.995447
## 2 72 8.153276 10.052375 9.850486
## 3 73 11.067459 12.403149 9.730900
## 4 74 11.506958 10.222173 14.384810
## 5 75 10.826462 9.549385 11.896931
Transormation into a normalized table (unpivot):
dn <- reshape2::melt(
data = d,
id.vars = "account_no",
variable.name = "month",
value.name = "revenue"
)
print(dn)
## account_no month revenue
## 1 71 Jan 8.849874
## 2 72 Jan 8.153276
## 3 73 Jan 11.067459
## 4 74 Jan 11.506958
## 5 75 Jan 10.826462
## 6 71 Feb 9.052444
## 7 72 Feb 10.052375
## 8 73 Feb 12.403149
## 9 74 Feb 10.222173
## 10 75 Feb 9.549385
## 11 71 Mar 5.995447
## 12 72 Mar 9.850486
## 13 73 Mar 9.730900
## 14 74 Mar 14.384810
## 15 75 Mar 11.896931
And back to the previous format using a pivot:
reshape2::dcast(
data = dn,
formula = account_no ~ month,
value.var = "revenue"
)
## account_no Jan Feb Mar
## 1 71 8.849874 9.052444 5.995447
## 2 72 8.153276 10.052375 9.850486
## 3 73 11.067459 12.403149 9.730900
## 4 74 11.506958 10.222173 14.384810
## 5 75 10.826462 9.549385 11.896931
A pretty nice and much longer tutorial is available here.