Re-engineering and parallelization of computations
Candidate features for version 3.0:
- Absorb the needed functions from iBreakDown and ingredients, giving a shorter dependency list and easier package modification and maintenance.
- Parallelize the calculation of explanations such as model_profile and model_parts (which framework should be used?)
I am not sure whether this is the appropriate place to comment; apologies if not. I am working on using spark_apply to calculate predict_parts in parallel for an H2O model, and I am trying to understand explain.
H2O offers a prediction function that works without a running cluster, but it takes paths to a MOJO file and the genmodel JAR. I can write a custom predict function that uses these to produce a prediction vector identical to the one from the DALEXtra custom prediction function used by explain_h2o.
However, there is no 'model' object, just the paths to the MOJO and genmodel files.
In explain, the predict function will pick up the paths and produce predictions.
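For concreteness, here is a minimal sketch of that setup, assuming h2o::h2o.mojo_predict_df is used for cluster-free scoring. All file paths, the data frame name, and the column names are hypothetical, and the example assumes a regression MOJO (for classification, the returned frame would contain class and probability columns instead):

```r
library(DALEX)
library(h2o)

# There is no live H2O model handle; the "model" is just a list of paths.
# Both paths below are hypothetical placeholders.
model <- list(
  mojo_path     = "/models/my_model.zip",
  genmodel_path = "/models/h2o-genmodel.jar"
)

# Custom predict function: scores a plain data.frame against the MOJO
# without a running H2O cluster.
predict_mojo <- function(model, newdata) {
  preds <- h2o::h2o.mojo_predict_df(
    frame             = newdata,
    mojo_zip_path     = model$mojo_path,
    genmodel_jar_path = model$genmodel_path
  )
  # Return a plain numeric vector, as explain() expects (regression case).
  preds$predict
}

# explain() only ever touches `model` through predict_function,
# so a list of paths can stand in for a model object.
# `train_data` and `target` are hypothetical.
explainer <- DALEX::explain(
  model            = model,
  data             = train_data[, setdiff(names(train_data), "target")],
  y                = train_data$target,
  predict_function = predict_mojo,
  label            = "h2o_mojo"
)
```

Because the "model" is just a small list of strings, it can be serialized cheaply to each Spark worker via spark_apply, avoiding starting a full H2O instance per node.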
I guess I am asking whether this is an appropriate way to use explain.
Otherwise, I have been running out of memory when starting H2O on each worker node.
Thank you for the help, and sorry if this question is out of place.