soilDB
fetchHenry() NA-padding for weekly / monthly granularity
TODO:
- [ ] generalize .fill_missing_days() with new function and gran argument: days, weeks, months
- [ ] adapt .formatDates() with new function
- [ ] new argument to fetchHenry() for generic NA padding
- [ ] updated docs and tutorial
Further research: https://stackoverflow.com/questions/22439540/how-to-get-week-numbers-from-dates
First approximation here.
.fillMissingGran <- function(x, gran) {
## TODO this doesn't account for leap-years
# 366 days
# 53 weeks
# sequence of possible values
g.vect <- switch(
gran,
'day' = 1:365,
'week' = 1:52,
'month' = 1:12
)
# column to use
# week / month_numeric are missing
g.col <- switch(
gran,
'day' = 'doy',
'week' = 'week',
'month' = 'month_numeric'
)
# format string
g.fmt <- switch(
gran,
'day' = '%Y %j %H:%M',
'week' = '%Y %W %H:%M',
'month' = '%Y %m %H:%M'
)
# add time ID columns as-needed
# doy is always present
## "week" not as simple as it seems
# https://stackoverflow.com/questions/22439540/how-to-get-week-numbers-from-dates
# week
if(gran == 'week') {
x$week <- as.integer(format(x$date_time, '%W'))
}
# month
if(gran == 'month') {
x$month_numeric <- as.integer(format(x$date_time, '%m'))
}
# ID missing time IDs
missing <- which(is.na(match(g.vect, x[[g.col]])))
# short-circuit
if (length(missing) < 1) {
return(x)
}
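# make fake date-times for missing time IDs
# strptime() needs a day anchor: %W is ignored on input without a weekday (%u),
# and %m without a day of month gives NA, so anchor weeks to Monday and months to day 1
fake.fmt <- switch(gran, 'day' = '%Y %j', 'week' = '%Y %W %u', 'month' = '%Y %m %d')
fake.txt <- paste(x$year[1], missing, switch(gran, 'day' = '', 'week' = '1', 'month' = '01'))
# TODO: this will result in timezone specific to locale;
# especially an issue when granularity is less than daily or for large extents
fake.datetimes <- as.POSIXct(fake.txt, format = fake.fmt)
# generate DF with missing information
fake.data <- data.frame(
sid = x$sid[1],
date_time = fake.datetimes,
year = x$year[1],
# derive DOY from the fake date-times so it is valid for every granularity
doy = as.integer(format(fake.datetimes, '%j')),
month = format(fake.datetimes, "%b")
)
# carry the time ID (doy / week / month_numeric) into the padded rows
fake.data[[g.col]] <- missing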
fill.cols <- which(!colnames(x) %in% colnames(fake.data))
if (length(fill.cols) > 0) {
na.data <- as.data.frame(x)[, fill.cols, drop = FALSE][0,, drop = FALSE][1:nrow(fake.data),, drop = FALSE]
fake.data <- cbind(fake.data, na.data)
}
# make datatypes for time match
x$date_time <- as.POSIXct(x$date_time, format = "%Y-%m-%d %H:%M:%S")
# splice in missing data
y <- rbind(x, fake.data)
# re-order by DOY and return
return(y[order(y$doy), ])
}
# generate example data
w <- fetchHenry(project = 'CA790', gran = 'week', soiltemp.summaries = FALSE, pad.missing.days = TRUE)
x <- w$soiltemp[w$soiltemp$sid == 392 & w$soiltemp$year == '1998', ]
plot(x$date_time, x$sensor_value, type = 'p')
.fillMissingGran(x, gran = 'week')
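The leap-year TODO and the week-numbering caveat in the function above could be handled by deriving the valid time IDs from the calendar of the year in question, rather than hard-coding 1:365, 1:52 and 1:12. The following is only a sketch under that assumption: `.granIDs()` is a hypothetical helper that does not exist in soilDB, and it keeps the `%W` convention (Monday as first day of the week, days before the first Monday in week 0) used in the draft.

```r
# hypothetical helper, not part of soilDB:
# enumerate the time IDs that actually occur in a given year
.granIDs <- function(year, gran = c('day', 'week', 'month')) {
  gran <- match.arg(gran)
  year <- as.integer(year)

  # every calendar day of the year; length is 365 or 366
  d <- seq(
    from = as.Date(sprintf('%d-01-01', year)),
    to = as.Date(sprintf('%d-12-31', year)),
    by = 'day'
  )

  switch(
    gran,
    'day' = seq_along(d),
    # %W can start at 0 and can reach 53, depending on the weekday of Jan 1
    'week' = sort(unique(as.integer(format(d, '%W')))),
    'month' = 1:12
  )
}

# leap vs. non-leap year
length(.granIDs(1998, 'day'))
length(.granIDs(2000, 'day'))

# week IDs depend on the year
range(.granIDs(1998, 'week'))
range(.granIDs(2018, 'week'))
```

With something like this, `g.vect` would vary by year, so week 0 and week 53 get padded only when they exist. Whether `%W`, `%U` or ISO 8601 `%V` is the right convention for the Henry data is still an open question.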
A note to extend methods where possible so that they can work with other data sources e.g. SCAN, CDEC
Looks like we will also need to change the usage of the base::as.POSIXct() format argument in soilDB:::.fill_missing_days(), as it is breaking with R devel.
══ Failed tests ════════════════════════════════════════════════════════════════
── Error (test-fetchHenry.R:122:3): summarizeSoilTemperature() works as expected ──
Error in `.POSIXct(x, tz, ...)`: unused argument (format = "%Y-%m-%d %H:%M:%S")
Backtrace:
▆
1. ├─soilDB:::.formatDates(x, gran = "day", pad.missing.days = TRUE) at test-fetchHenry.R:122:2
2. │ ├─...[]
3. │ └─data.table:::`[.data.table`(...)
4. └─soilDB:::.fill_missing_days(.SD)
5. ├─base::as.POSIXct(x$date_time, format = "%Y-%m-%d %H:%M:%S")
6. └─base::as.POSIXct.default(x$date_time, format = "%Y-%m-%d %H:%M:%S")
── Error (test-fetchHenry.R:165:3): .fill_missing_days() works as expected ─────
Error in `.POSIXct(x, tz, ...)`: unused argument (format = "%Y-%m-%d %H:%M:%S")
Backtrace:
▆
1. └─soilDB:::.fill_missing_days(x) at test-fetchHenry.R:165:2
2. ├─base::as.POSIXct(x$date_time, format = "%Y-%m-%d %H:%M:%S")
3. └─base::as.POSIXct.default(x$date_time, format = "%Y-%m-%d %H:%M:%S")
I'll try to take a look next week sometime, unless you have time before then. Can you tackle the POSIX thing?
> I'll try to take a look next week sometime, unless you have time before then.
Take a look at this issue as a whole? I can probably take a crack at it this week sometime.
> Can you tackle the POSIX thing?
This is sorted w/ https://github.com/ncss-tech/soilDB/commit/6d4c02b553b52f67ffd4b0da9d8ae15c2c9ad0f4. as.Date() still takes a format arg, so I converted character->Date explicitly with as.Date(..., format=) and then to POSIXct, and we are good.
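The two-step conversion described in this reply can be sketched roughly as follows. This is an illustration with made-up input values, not the code from the linked commit. Note that going through Date drops the time of day, which seems acceptable for daily padding but would matter at finer granularity.

```r
# example input: character date-times (values are made up for illustration)
x <- c('1998-01-01 00:00:00', '1998-01-02 12:30:00')

# step 1: character -> Date, using the format argument of as.Date()
d <- as.Date(x, format = '%Y-%m-%d %H:%M:%S')

# step 2: Date -> POSIXct, no format argument needed
# the result is midnight UTC on each date
p <- as.POSIXct(d)

class(p)
format(p, '%Y-%m-%d %H:%M', tz = 'UTC')
```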
>> I'll try to take a look next week sometime, unless you have time before then.
> Take a look at this issue as a whole? I can probably take a crack at it this week sometime.
Go for it if you have some time. I'm not going to have enough time this week.
>> Can you tackle the POSIX thing?
> This is sorted w/ 6d4c02b. as.Date() still takes a format arg, so I converted character->Date explicitly with as.Date(..., format=) and then to POSIXct, and we are good.
Thanks, the as.Date() fix was news to me.