
Reproducible workflow when arguments are objects

Open timcdlucas opened this issue 8 years ago • 3 comments

Related to #192

It seems reasonable that people might use objects to define arguments. However, for the workflow object to be reproducible, we would want to save the call with the object's value in place of its name, as if the argument had been written literally.

k = 2

work2 <- workflow(occurrence = UKAnophelesPlumbeus,
                  covariate  = UKAir,
                  process    = BackgroundAndCrossvalid(k = k),
                  model      = LogisticRegression,
                  output     = PerformanceMeasures)


RerunWorkflow(work2)

Caught errors:
Error in 1:k: NA/NaN argument

...

===================

Call: workflow(occurrence = UKAnophelesPlumbeus, covariate = UKAir, process = BackgroundAndCrossvalid(k = k), model = LogisticRegression, output = PerformanceMeasures, forceReproducible = FALSE) 

We would want the call to be saved as k=2.

timcdlucas, Oct 10 '15 10:10

Seems sensible for this use case. We could handle this by checking whether arguments are objects in the calling environment and, if so, writing their values into the saved call with dput.
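A minimal sketch of that substitution in base R, to make the idea concrete. `inline_args` is a hypothetical helper and not part of zoon, and the module call is only quoted here, never evaluated. It looks up each variable named in the captured expression and, if it exists in the calling environment, replaces the name with its value:

```r
# Hypothetical helper (not zoon's API): replace variables in a captured
# expression with their values from the calling environment.
inline_args <- function(expr, env = parent.frame()) {
  vars <- all.vars(expr)  # variable names only; function names are excluded
  # Keep only the names defined directly in `env` (no search up the parents)
  found <- vars[vapply(vars, exists, logical(1), envir = env, inherits = FALSE)]
  values <- mget(found, envir = env)
  # substitute() swaps each matching symbol for its value
  do.call(substitute, list(expr, values))
}

k <- 2
saved <- inline_args(quote(BackgroundAndCrossvalid(k = k)))
print(deparse(saved))
# [1] "BackgroundAndCrossvalid(k = 2)"
```

Note that the value itself is embedded in the call, so the deparsed text grows with the size of the object.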

That approach would be awful for large objects (e.g. rasters passed to PredictNewAreaMap #145) though. An alternative would be to store all the objects used in the workflow object.

goldingn, Oct 10 '15 13:10

Yes, I think the latter is a better general idea. RerunWorkflow would then need to know where to find those objects, possibly by writing them from the workflow object to the global environment before rerunning.

timcdlucas, Oct 10 '15 14:10

Rather than writing to global, we could:

  1. define a new environment obj_env in workflow
  2. copy the named objects from global into obj_env
  3. set that obj_env as the place to look for named objects
  4. return obj_env in the workflow object

Then RerunWorkflow could just fetch objects from obj_env in the workflow object.
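The four steps might look roughly like this in base R. `capture_workflow` and `rerun` are hypothetical stand-ins for workflow and RerunWorkflow, and `1:k` stands in for a module call, so this is a sketch of the mechanism rather than zoon's implementation:

```r
# Hypothetical stand-in for workflow(): store the unevaluated call together
# with copies of the variables it mentions.
capture_workflow <- function(call, env = parent.frame()) {
  obj_env <- new.env(parent = env)                       # step 1
  for (nm in all.vars(call)) {
    if (exists(nm, envir = env, inherits = FALSE)) {
      assign(nm, get(nm, envir = env), envir = obj_env)  # step 2
    }
  }
  list(call = call, obj_env = obj_env)                   # step 4
}

# Hypothetical stand-in for RerunWorkflow(): evaluate the stored call with
# obj_env as the first place to look for named objects (step 3).
rerun <- function(wf) eval(wf$call, envir = wf$obj_env)

k <- 2
wf <- capture_workflow(quote(1:k))
rm(k)             # the original object is gone from the calling environment
print(rerun(wf))  # still works, because k was copied into obj_env
# [1] 1 2
```

Because obj_env keeps the calling environment as its parent, functions and anything that was not copied are still found by the usual lookup, and nothing is written to the global environment.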

goldingn, Oct 10 '15 14:10