Re-engineering and parallelization of computations

Open: pbiecek opened this issue 2 years ago • 1 comment

Candidate features for version 3.0:

  • Absorb the needed functions from iBreakDown and ingredients. This would shorten the dependency list and make the package easier to modify and maintain.
  • Parallelize the calculation of explanations such as model_profile and model_parts (which framework?).

pbiecek, May 05 '22 17:05

I do not know if this is the appropriate place to comment, and I apologize if it is not. I am working on using spark_apply to calculate predict_parts in parallel for an H2O model, and I am trying to understand how explain works.

H2O offers a prediction function that works without a running cluster, but it takes paths to a MOJO file and a genmodel file instead of a model object. I can write a custom predict function around it that produces a prediction vector identical to the one from the DALEXtra custom prediction function for explain_h2o.

However, there is no 'model' object, just paths to the MOJO and genmodel files.

In explain, the predict function picks up the paths and produces predictions.
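A minimal sketch of the setup described above, in R. It assumes a regression MOJO whose prediction column is named "predict", that Java is available on the machine, and that explain treats the model as an opaque object when predict_function is supplied. The names predict_mojo, make_mojo_explainer and the scorer argument are hypothetical and exist only for this illustration; h2o::h2o.mojo_predict_df is the cluster-free scoring function from the h2o R package.

```r
# Score new data with a MOJO, without a running H2O cluster.
# `model` is not a fitted object, only a list holding two file paths.
# `scorer` is a hypothetical extra argument (not part of DALEX or h2o) that
# lets the scoring function be swapped out; DALEX calls this function with
# (model, newdata) only, so the default is used in practice.
predict_mojo <- function(model, newdata, scorer = h2o::h2o.mojo_predict_df) {
  preds <- scorer(
    frame = as.data.frame(newdata),
    mojo_zip_path = model$mojo,
    genmodel_jar_path = model$genmodel
  )
  # Assumption: regression model, so the prediction column is "predict".
  as.numeric(preds$predict)
}

# Build an explainer whose "model" is just the pair of paths.
make_mojo_explainer <- function(mojo_path, genmodel_path, data, y) {
  DALEX::explain(
    model = list(mojo = mojo_path, genmodel = genmodel_path),
    data = data,
    y = y,
    predict_function = predict_mojo,
    label = "h2o MOJO",
    verbose = FALSE
  )
}
```

With real files, something like make_mojo_explainer("path/to/model.zip", "path/to/h2o-genmodel.jar", X, y) would give an explainer that can be passed to predict_parts; both paths here are placeholders. Each worker would still need Java and local copies of both files.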

I guess I am asking whether this is an appropriate way to use explain.

Otherwise, I have been running out of memory when starting H2O on each worker node.

Thank you for the help, and sorry if this is inappropriate.

ssefick, Feb 01 '23 03:02