actigraph.sleepr
Cole Kripke 10-second and 30-second variants: Does actigraph.sleepr facilitate aggregation based on the maximum per minute?
Thanks for sharing your code.
I see you use Cole Kripke 60 seconds as the default, but you also implemented the Cole Kripke 10- and 30-second variants in the function apply_cole_kripke.
Do I understand correctly that, to use those variants, I would first have to pre-process my data to find the maximum 10- or 30-second epoch per minute and then use those values as input to the apply_cole_kripke function? At least, that is what Cole, Kripke et al. did in their paper. I have been trying to find out whether you already wrote such a pre-processing function, but couldn't find it.
If you could clarify that would be much appreciated.
Hello
As best as I can remember, I implemented the 10-second and 30-second versions from the equations in the Cole, Kripke et al. article, without any testing. That's why these versions are not exposed by the apply_cole_kripke function.
And, yes, I haven't implemented a pre-processing function to find the maximum 10s or 30s epoch per minute.
Now I'm wondering how to implement this pre-processing step.
Can you verify that I understand correctly what "the maximum 10sec/30sec nonoverlapping epoch of activity per minute" means?
library("lubridate")
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library("tidyverse")
# Create a minute of data for illustration
input <-
  tibble::tribble(
    ~timestamp, ~count,
    "2012-06-27 10:54:00", 377L,
    "2012-06-27 10:54:10", 465L,
    "2012-06-27 10:54:20", 505L,
    "2012-06-27 10:54:30", 73L,
    "2012-06-27 10:54:40", 45L,
    "2012-06-27 10:54:50", 0L
  ) %>%
  mutate(
    # Parse as date-times, not dates: as.Date would strip the time of day
    # and collapse every row into the same minute
    across(timestamp, ymd_hms)
  )
input
#> # A tibble: 6 × 2
#>   timestamp           count
#>   <dttm>              <int>
#> 1 2012-06-27 10:54:00   377
#> 2 2012-06-27 10:54:10   465
#> 3 2012-06-27 10:54:20   505
#> 4 2012-06-27 10:54:30    73
#> 5 2012-06-27 10:54:40    45
#> 6 2012-06-27 10:54:50     0
# Find the maximum 10-second nonoverlapping epoch per minute
input %>%
  # The data is already at 10sec frequency,
  # so just find the 10sec window with the largest count
  group_by(
    timestamp = floor_date(timestamp, unit = "minute")
  ) %>%
  slice_max(
    count
  )
#> # A tibble: 1 × 2
#> # Groups:   timestamp [1]
#>   timestamp           count
#>   <dttm>              <int>
#> 1 2012-06-27 10:54:00   505
# Find the maximum 30-second nonoverlapping epoch per minute
input %>%
  # The data is at 10sec frequency,
  # so first we aggregate by summing the counts within each 30sec window
  group_by(
    timestamp = floor_date(timestamp, unit = "30 seconds")
  ) %>%
  summarise(
    across(count, sum)
  ) %>%
  # Then, as above, we find the 30sec window with the largest count
  group_by(
    timestamp = floor_date(timestamp, unit = "minute")
  ) %>%
  slice_max(
    count
  )
#> # A tibble: 1 × 2
#> # Groups:   timestamp [1]
#>   timestamp           count
#>   <dttm>              <int>
#> 1 2012-06-27 10:54:00  1347
Created on 2022-03-12 by the reprex package (v2.0.1)
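For what it's worth, the two steps above could be wrapped into a single reusable helper. This is only an untested sketch; the function name and its arguments are my own invention, not part of actigraph.sleepr:

```r
library("dplyr")
library("lubridate")

# Hypothetical helper (not in actigraph.sleepr): collapse counts recorded
# at a short epoch to one value per minute, taken as the maximum
# nonoverlapping epoch of the requested length within each minute.
max_epoch_per_minute <- function(data, epoch = "10 seconds") {
  data %>%
    # Sum counts within each nonoverlapping epoch (a no-op when the data
    # is already recorded at this epoch length)
    group_by(timestamp = floor_date(timestamp, unit = epoch)) %>%
    summarise(count = sum(count), .groups = "drop") %>%
    # Keep the most active epoch in each minute
    group_by(timestamp = floor_date(timestamp, unit = "minute")) %>%
    slice_max(count, n = 1, with_ties = FALSE) %>%
    ungroup()
}

# max_epoch_per_minute(input, epoch = "10 seconds")
# max_epoch_per_minute(input, epoch = "30 seconds")
```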
Sorry for the slow reply; I wasn't sure how to respond, as I also do not fully understand how they did it. In the non-overlapping variant, I think they worked with count values per 10 seconds, looked for the most active 10 seconds within each 60 seconds, and used that as the final indicator of movement per minute. However, what is unclear to me is whether they stick to the unit of counts per 10 seconds or convert it to counts per minute. Maybe the fact that the conversion is not mentioned means they do not make it.
The best solution may be to try both and see which one provides the better estimate.
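The two unit conventions differ only by a constant factor, so both are easy to produce. A sketch of both, assuming counts recorded per 10 seconds (my own illustration, not from the paper):

```r
library("dplyr")
library("lubridate")

# Variant A: keep the maximum 10-second count in its original unit
# (counts per 10 seconds)
max_raw <- input %>%
  group_by(timestamp = floor_date(timestamp, unit = "minute")) %>%
  slice_max(count, n = 1, with_ties = FALSE) %>%
  ungroup()

# Variant B: rescale that maximum to counts per minute by multiplying
# by the number of 10-second epochs in a minute (6)
max_scaled <- max_raw %>%
  mutate(count = count * 6L)
```

Since the regression weights were fitted against a particular unit, comparing both against the 60-second version's output, as suggested above, seems like a reasonable way to settle it.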