data science tutorials and snippets prepared by tomis9
reshape2 is an R package that lets you change the shape of a data frame, i.e. pivot it (long to wide) and “unpivot” it (wide to long).
Keep in mind that if your favourite R package for data frame manipulation is data.table, it already provides its own dcast and melt functions, which work on data.table objects in much the same way as those in reshape2.
In fact, there are only two functions worth mentioning: dcast, which is the equivalent of an MS Excel pivot table, and melt, which does the opposite, i.e. unpivots a table.
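For readers who use data.table, the remark above can be sketched as follows. This is only a minimal illustration, assuming the data.table package is installed; the object names (dt, dt_long, dt_wide) and the seed are made up for this example.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# A small wide table: one row per account, one column per month
dt <- data.table(
  account_no = paste0("7", 1:5),
  Jan = rnorm(5, 10, 1),
  Feb = rnorm(5, 10, 2),
  Mar = rnorm(5, 10, 3)
)

# Wide -> long, using data.table's own melt method
dt_long <- melt(
  dt,
  id.vars = "account_no",
  variable.name = "month",
  value.name = "revenue"
)

# Long -> wide, using data.table's own dcast method
dt_wide <- dcast(dt_long, account_no ~ month, value.var = "revenue")

print(dt_wide)
```

The arguments are the same as in the reshape2 calls used in the rest of this post; the main difference is that the result is a data.table rather than a plain data frame.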
An example dataframe:
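# Values are drawn at random, so your numbers will differ from the output below
d <- data.frame(
  account_no = paste(rep(7, 5), 1:5, sep=""),  # "71", "72", ..., "75"
  Jan = rnorm(5, 10, 1),
  Feb = rnorm(5, 10, 2),
  Mar = rnorm(5, 10, 3)
)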
print(d)
##   account_no       Jan       Feb       Mar
## 1         71  8.849874  9.052444  5.995447
## 2         72  8.153276 10.052375  9.850486
## 3         73 11.067459 12.403149  9.730900
## 4         74 11.506958 10.222173 14.384810
## 5         75 10.826462  9.549385 11.896931
Transformation into a normalized (long) table, i.e. an unpivot:
dn <- reshape2::melt(
  data = d,
  id.vars = "account_no",
  variable.name = "month",
  value.name = "revenue"
)
print(dn)
##    account_no month   revenue
## 1          71   Jan  8.849874
## 2          72   Jan  8.153276
## 3          73   Jan 11.067459
## 4          74   Jan 11.506958
## 5          75   Jan 10.826462
## 6          71   Feb  9.052444
## 7          72   Feb 10.052375
## 8          73   Feb 12.403149
## 9          74   Feb 10.222173
## 10         75   Feb  9.549385
## 11         71   Mar  5.995447
## 12         72   Mar  9.850486
## 13         73   Mar  9.730900
## 14         74   Mar 14.384810
## 15         75   Mar 11.896931
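By default melt treats every column not listed in id.vars as a measured variable. If you only want some of the columns unpivoted, you can also pass measure.vars. The sketch below uses a small hand-made data frame (hypothetical values, chosen so the result is easy to check) rather than the random one above.

```r
library(reshape2)

# Hypothetical wide table with fixed values
d_small <- data.frame(
  account_no = paste0("7", 1:3),
  Jan = c(10, 11, 12),
  Feb = c(20, 21, 22),
  Mar = c(30, 31, 32)
)

# Unpivot only Jan and Feb. Mar is neither an id column nor a
# measured column here, so it does not appear in the result.
dn_small <- melt(
  data = d_small,
  id.vars = "account_no",
  measure.vars = c("Jan", "Feb"),
  variable.name = "month",
  value.name = "revenue"
)

print(dn_small)
```

The result has one row per account and selected month, so 3 accounts times 2 months gives 6 rows.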
And back to the original wide format using a pivot:
reshape2::dcast(
  data = dn,
  formula = account_no ~ month,
  value.var = "revenue"
)
##   account_no       Jan       Feb       Mar
## 1         71  8.849874  9.052444  5.995447
## 2         72  8.153276 10.052375  9.850486
## 3         73 11.067459 12.403149  9.730900
## 4         74 11.506958 10.222173 14.384810
## 5         75 10.826462  9.549385 11.896931
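The round trip above works because each account and month pair occurs exactly once. Like an Excel pivot table, dcast can also aggregate when several rows fall into the same cell, via its fun.aggregate argument. Here is a minimal sketch with a hypothetical long table (dn_dup, made up for this example) in which account 71 has two January entries.

```r
library(reshape2)

# Hypothetical long table: account 71 has two revenue entries for Jan
dn_dup <- data.frame(
  account_no = c("71", "71", "71", "72", "72"),
  month      = c("Jan", "Jan", "Feb", "Jan", "Feb"),
  revenue    = c(5, 7, 10, 8, 9)
)

# Sum the revenue within each account/month cell
totals <- dcast(
  data = dn_dup,
  formula = account_no ~ month,
  value.var = "revenue",
  fun.aggregate = sum
)

print(totals)
```

If fun.aggregate is left out in a case like this, dcast falls back to counting rows with length and prints a message saying so, which is rarely what you want. Note also that month is plain text here, so the resulting columns are ordered alphabetically; in the melt output above month is a factor, which is why the calendar order was preserved there.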
A pretty nice and much longer tutorial is available here.