zoon
zoon copied to clipboard
Chain vs list in output
It makes more sense to list the output modules rather then Chain them to do what they currently do. Chains could be used to pass one output to another output
This makes sense - we'd have to enforce that output modules emit the same arguments as they ingest, with the option for adding extras.
Not a top priority though
I don't think you can put constraints on the output of an output module, though agree the current functionality of Chain for output modules is inappropriate.I dont see the need currently to be able to serially apply output modules
Agree it's not a priority. But I think you still must use Chain
for output modules. Or maybe just have to do it if you are using list
on another module. So the issue still stands that this isn't quite logical.
I plan to move to being able to use multiple lists, (ie could list model and output) and then do away with Chain on output modules
Sorry for being slow to this.
@AugustT:
I don't think you can put constraints on the output of an output module, though agree the current functionality of Chain for output modules is inappropriate.
My suggestion isn't that we constrain their output, but that by default they also return their inputs, and any other outputs are also returned.
I.e. PrintMap
takes as input:
function (.model, .ras, plot = TRUE, dir = NULL, size = c(480,480), res = 72)
plots a map as a side-effect and would then return:
return (list(.model, .ras, pred_ras))
(rather than just pred_ras
as it currently does)
I dont see the need currently to be able to serially apply output modules
The advantage of this is that you can have output modules which modify the predictions or rasters, and then pass the modified model/raster to another output modules, like PrintMap. E.g. making the predictions binary (as in the paper example & very common), or applying clamping (setting extreme covariate values to the nearest observed value before prediction) for predictions from a model which doesn't already do that (either by modifying .model
or .ras
). I actually think these are pretty common use cases, and it would be a shame if we couldn't handle them.
Most output modules either store their results in the workflow, or do something as a side effect, so I don't think there are (m)any cases where returning a list like this would cause problems. If there are places that is a problem, we could always format the output of the last module in a chain to remove the .model
& .ras
objects.
See example of a chainable module here
Thanks for laying the reasoning here, this looks sensible.
I want to make this as easy for the user as possible. Simply all we need is to:
- [ ] Require output modules to return a list with the named elements
.model
and.ras
. Changes needed inBuildModle
andCheckModule
as well as tests. - [ ] Modify existing output modules
- [ ] Update I/O definitions
- [ ] Update Build a Module vignette
:+1: