The table goes through each variable, for each activity and each subject, and gives us the value.
Dcast to Calculate Mean
The dcast function recasts the data set into a particular shape (a wide data frame).
NOTE: dcast lets us make calculations vertically. In other words, it splits the rows into groups and applies a function (in this case mean) down each column within each group.
Case Study
Measure the mean (average) value of each variable for each subject while performing each activity.
Remember there are 30 subjects and 6 activities,
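so we should have 180 rows (30 × 6) and 81 columns: activity, subject, and 79 columns of mean values (meltdata's 813621 rows divide evenly into 79 variables).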
As you’ll see below, dcast scans through all 30 subjects, starting with the first activity, “laying”, and calculates the function (mean) for each column.
dcast & melt
The mean of each measurement/variable for each subject while performing each activity, calculated from the melted data.
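As a rough illustration of the melt-then-dcast pattern, here is a minimal sketch on a made-up toy data set. It assumes melt() and dcast() come from the reshape2 package (data.table offers similarly named functions); the names toy, var_a and var_b are hypothetical stand-ins, not part of the real data.

```r
library(reshape2)  # assumption: melt() and dcast() are taken from reshape2

# Toy stand-in for the extracted data: 2 activities x 2 subjects, 2 observations each
toy <- data.frame(
  activity = rep(c("laying", "walking"), each = 4),
  subject  = rep(c(1, 1, 2, 2), times = 2),
  var_a    = c(1, 3, 5, 7, 2, 4, 6, 8),
  var_b    = c(10, 30, 50, 70, 20, 40, 60, 80)
)

# Narrow and long: one row per (activity, subject, variable, value)
toy_melt <- melt(toy, id.vars = c("activity", "subject"))

# Wide again: one row per activity/subject pair, one column of means per variable
toy_means <- dcast(toy_melt, activity + subject ~ variable,
                   fun.aggregate = mean, value.var = "value")
print(toy_means)
```

With 2 activities and 2 subjects the result has 4 rows, in the same way that 30 subjects and 6 activities give 180 rows in the case study.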
In the section above we used dcast & melt to calculate the mean of each variable vertically per group, and we had 79 variables. But if we want to calculate the mean of all the variables together for each group, we need to calculate it on the melted data before using dcast (because dcast recasts the data from narrow and long to wide).
Remember what meltdata looks like (narrow and long):
gt(meltdata[1:5, ]) |> opt_table_outline()
| activity | subject | variable | value |
|---|---|---|---|
| standing | 2 | tBodyAcc-mean()-X | 0.2571778 |
| standing | 2 | tBodyAcc-mean()-X | 0.2860267 |
| standing | 2 | tBodyAcc-mean()-X | 0.2754848 |
| standing | 2 | tBodyAcc-mean()-X | 0.2702982 |
| standing | 2 | tBodyAcc-mean()-X | 0.2748330 |
Case Study
Using meltdata, which is 813621 × 4, calculate the mean of all variables for each activity.
aggregate & melt 1 group
The mean of all variables together for each activity, using the melted data.
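A minimal sketch of this step in base R, on a hypothetical toy version of the melted data (toy_melt and its values are made up for illustration). The formula value ~ activity pools every variable and every subject within each activity:

```r
# Toy stand-in for meltdata (hypothetical values): one row per observation
toy_melt <- data.frame(
  activity = c("laying", "laying", "laying", "walking", "walking", "walking"),
  subject  = c(1, 1, 2, 1, 2, 2),
  variable = c("var_a", "var_b", "var_a", "var_a", "var_a", "var_b"),
  value    = c(1, 3, 5, 2, 4, 6)
)

# One mean per activity, ignoring which variable or subject a value came from
act_means <- aggregate(value ~ activity, data = toy_melt, FUN = mean)
print(act_means)
```

Because the melted data keeps every measurement in the single value column, no reshaping is needed before aggregating.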
What if we just want the mean of all variables/measurements for each activity for only one subject, #6?
We already calculated the mean of all variables for each activity, across all subjects, above in aggregate & melt 1 group.
Let’s look at it again:
Mean of all vars/activity

| activity | value |
|---|---|
| laying | -0.6010010 |
| sitting | -0.6028592 |
| standing | -0.6139060 |
| walking | -0.2344983 |
| walking_downstairs | -0.1476207 |
| walking_upstairs | -0.2904126 |
aggregate & filter
Mean of all vars/activity for Subject #6
subset(meltdata, subject == "6") |>
  aggregate(value ~ activity, mean) |>
  gt() |>
  opt_table_outline() |>
  tab_header(title = "Mean of all vars/activity", subtitle = "for Subject #6")
Mean of all vars/activity, for Subject #6

| activity | value |
|---|---|
| laying | -0.582493068 |
| sitting | -0.591247924 |
| standing | -0.608446225 |
| walking | -0.189527910 |
| walking_downstairs | -0.001814892 |
| walking_upstairs | -0.186050139 |
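To check the filter-then-aggregate logic by hand, here is a small sketch on hypothetical toy data (toy_melt and its values are made up). It calls aggregate with an explicit data argument rather than through the pipe, which is just one way to write the same step:

```r
# Toy stand-in for meltdata: two subjects (6 and 7), two activities, two variables
toy_melt <- data.frame(
  activity = rep(c("laying", "walking"), times = 4),
  subject  = rep(c(6, 6, 7, 7), times = 2),
  variable = rep(c("var_a", "var_b"), each = 4),
  value    = c(1, 2, 10, 20, 3, 4, 30, 40)
)

# Keep only subject 6, then average all remaining values per activity
subj6 <- subset(toy_melt, subject == 6)
subj6_means <- aggregate(value ~ activity, data = subj6, FUN = mean)
print(subj6_means)
```

Only the rows for subject 6 contribute, so the values of subject 7 (10 to 40) do not affect the result.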
Split
Case Study
Using meltdata, which is 813621 × 4, calculate the mean of all variables for each activity.
Don’t use melt or aggregate; use these instead:
split
lapply
colMeans
Split the data by activity.
Calculate the mean of each of the 79 measurement columns (columns 3 to 81) using lapply & colMeans.
Now you’ll have 6 vectors of 79 means each (6 × 79), one mean for each variable per activity.
Use lapply & mean to calculate the mean of each of those vectors.
Now you have a list of 6 means, one for each of the 6 activities.
Convert the list to a data frame so we can display it with gt.
# s is a list of 6 data frames, one for each activity
s <- split(extracteddata, extracteddata$activity)
# result is a list of 6 named numeric vectors: the column means of columns 3 to 81
result <- lapply(s, function(df) colMeans(df[, 3:81]))
# act_mean is a list holding one overall mean per activity
act_mean <- lapply(result, mean)
# convert the list to a data frame and render the table
data.frame(act_mean) |> gt() |> opt_table_outline()
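The split, lapply and colMeans steps can be traced on a small toy data frame. This is only a sketch: toy, var_a and var_b are hypothetical, and sapply is used in place of the final lapply so the result is a named vector. It also checks that the mean of the column means matches the mean of all pooled values, which holds here because every column in a group has the same number of rows and there are no missing values.

```r
# Toy stand-in for extracteddata: id columns first, measurement columns after
toy <- data.frame(
  activity = rep(c("laying", "walking"), each = 3),
  subject  = c(1, 1, 2, 1, 2, 2),
  var_a    = c(1, 2, 3, 4, 5, 6),
  var_b    = c(3, 4, 5, 6, 7, 8)
)

# One data frame per activity
s <- split(toy, toy$activity)

# Per activity: a named vector with the mean of each measurement column
col_means <- lapply(s, function(df) colMeans(df[, 3:4]))

# Per activity: the mean of those column means
act_mean <- sapply(col_means, mean)

# Cross-check: mean of every measurement value pooled together, per activity
pooled <- sapply(s, function(df) mean(as.matrix(df[, 3:4])))

print(act_mean)
print(isTRUE(all.equal(act_mean, pooled)))
```

If the columns had different numbers of non-missing values, the mean of column means and the pooled mean could differ, so the shortcut is worth checking on real data.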