zoon icon indicating copy to clipboard operation
zoon copied to clipboard

Chain vs list in output

Open timcdlucas opened this issue 10 years ago • 8 comments

It makes more sense to list the output modules rather then Chain them to do what they currently do. Chains could be used to pass one output to another output

timcdlucas avatar Sep 23 '14 12:09 timcdlucas

This makes sense - we'd have to enforce that output modules emit the same arguments as they ingest, with the option for adding extras.

Not a top priority though

goldingn avatar Aug 17 '15 09:08 goldingn

I don't think you can put constraints on the output of an output module, though agree the current functionality of Chain for output modules is inappropriate.I dont see the need currently to be able to serially apply output modules

AugustT avatar May 13 '16 13:05 AugustT

Agree it's not a priority. But I think you still must use Chain for output modules. Or maybe just have to do it if you are using list on another module. So the issue still stands that this isn't quite logical.

timcdlucas avatar May 13 '16 13:05 timcdlucas

I plan to move to being able to use multiple lists, (ie could list model and output) and then do away with Chain on output modules

AugustT avatar May 13 '16 14:05 AugustT

Sorry for being slow to this.

@AugustT:

I don't think you can put constraints on the output of an output module, though agree the current functionality of Chain for output modules is inappropriate.

My suggestion isn't that we constrain their output, but that by default they also return their inputs, and any other outputs are also returned.

I.e. PrintMap takes as input:

function (.model, .ras, plot = TRUE, dir = NULL, size = c(480,480), res = 72)

plots a map as a side-effect and would then return:

return (list(.model, .ras, pred_ras))

(rather than just pred_ras as it currently does)

I dont see the need currently to be able to serially apply output modules

The advantage of this is that you can have output modules which modify the predictions or rasters, and then pass the modified model/raster to another output modules, like PrintMap. E.g. making the predictions binary (as in the paper example & very common), or applying clamping (setting extreme covariate values to the nearest observed value before prediction) for predictions from a model which doesn't already do that (either by modifying .model or .ras). I actually think these are pretty common use cases, and it would be a shame if we couldn't handle them.

Most output modules either store their results in the workflow, or do something as a side effect, so I don't think there are (m)any cases where returning a list like this would cause problems. If there are places that is a problem, we could always format the output of the last module in a chain to remove the .model & .ras objects.

goldingn avatar May 21 '16 01:05 goldingn

See example of a chainable module here

goldingn avatar May 21 '16 02:05 goldingn

Thanks for laying the reasoning here, this looks sensible.

I want to make this as easy for the user as possible. Simply all we need is to:

  • [ ] Require output modules to return a list with the named elements .model and .ras. Changes needed in BuildModle and CheckModule as well as tests.
  • [ ] Modify existing output modules
  • [ ] Update I/O definitions
  • [ ] Update Build a Module vignette

AugustT avatar May 23 '16 09:05 AugustT

:+1:

goldingn avatar May 23 '16 10:05 goldingn