zoon
zoon copied to clipboard
copying old attributes breaks raster processing
I just spotted this in MESSMask
whilst working on the paper.
In process modules that act on rasters the attributes of the raster are captured at the beginning of the module then replaced at the end of the module. Unfortunately some of the processes applied to the raster are stored in the attributes, such as which cells are NA. These changes are therefore undone before returning the module
MRE
# simple workflow
work1 <- workflow(occurrence = UKAnophelesPlumbeus,
covariate = UKBioclim,
process = Background(n = 70),
model = LogisticRegression,
output = NoOutput)
# current MESSMask
LoadModule('MESSMask')
# version that doesn't replace attributes
MESSMask2 <- function (.data) {
zoon:::GetPackage('dismo')
# Dont allow attributes to be dropped
org_attrs <- attributes(.data$ras)
# Ignore warnings because they're just a bunch of NA issues
suppressWarnings(
mess <- dismo::mess(.data$ras, v = .data$df[, attr(.data$df, 'covCols')], full = FALSE)
)
# Extreme values become NA
values(mess)[values(mess) < 0] <- NA
.data$ras <- mask(.data$ras, mess)
# Add original attributes
# attributes(.data$ras) <- c(attributes(.data$ras), org_attrs)
return (.data)
}
# run both
ras <- MESSMask(work1$process.output[[1]])$ras
ras2 <- MESSMask2(work1$process.output[[1]])$ras
# plot both
par(mfrow = c(1, 2))
plot(ras[[1]])
plot(ras2[[1]])
I know we went through a couple of design options for making sure attributes are passed on, but I can't remember what they were.
Could we instead have a list of allowed attributes, where only these are enforced? We could store that in a helper function for module developers to make this clearer. E.g. listing what's allowed:
ZoonAttributes()
[1] "some attribute" "another allowed attribute" "etc."
getting the attributes
temp_attributes <- ZoonAttributes(.data)
and setting them
ZoonAttributes(.data) <- temp_attributes
what do you think @AugustT?
I changed this feature in MESSmask specifically because the presence of the na.omit attribute caused errors further down the line, I didn't realise removing the attribute 'undid' the setting of NAs! Good to know! I can't now remember exactly what the problem was but it was something to do with the na attribute suggesting there was missing data when there wasn't , perhaps after a Chain? Sorry I cant be more help
Seeing as I can't remember the original issue your solution sounds fine. But just watch out for times where the na attibute might cause problems.
OK cool, if there was a specific MESSMask issue, I'll just see if I can fix that for now.
This is fixed for MESSMask now. The solution I went with was to only put back into the attributes those which were missing after processing (in this case the call path):
# replace any attributes lost during processing
end_attrs <- attributes(.data$ras)
which_missing <- which(!names(org_attrs) %in% names(end_attrs))
attributes(.data$ras) <- c(end_attrs, org_attrs[which_missing])