sparkxgb
feature request: Feature Importance of models
Would it be possible to expose feature importance in the package?
This probably requires updating xgboost4j-spark to at least 0.82 in order to use Booster.getScore():
https://github.com/dmlc/xgboost/commit/431c850c03edd97a126707d2f04f251838392fe9
sparkxgb currently uses 0.81:
https://github.com/rstudio/sparkxgb/blob/db7a68a3f1b1ce675e05b2c79da4992c4a2f17bf/R/dependencies.R#L11
(I tried to make a PR for this, but couldn't figure out how to use sparklyr::compile_package_jars() with a dependency jar...)
I would try bumping the version in dependencies.R to the newer one; that's about it. To recompile the jars you need to run https://github.com/rstudio/sparkxgb/blob/master/configure.R, but I don't think that's required since the Scala code is not changed. It would be great if you could send a PR, because that way we can see what breaks in the tests and help out a bit as well.
Thanks!
"I don't think that's required since the Scala code is not changed."
But doesn't inst/java/sparkxgb-*.jar contain everything, including the dependency? If so, changing the dependency should require a rebuild. Sorry, I don't know much about the Java ecosystem or the structure of a JAR...
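One way to settle the bundling question without guessing: a JAR is an ordinary zip archive, so its entries can be listed and checked for the dependency's classes. The sketch below is only an illustration and not part of sparkxgb. The helper name `jar_has_prefix` is hypothetical, and the prefix `ml/dmlc/xgboost4j/` assumes the xgboost4j classes live under the `ml.dmlc.xgboost4j` package.

```python
import sys
import zipfile


def jar_has_prefix(jar_path, prefix):
    """Return True if any entry in the JAR starts with the given path prefix.

    A JAR is a zip archive, so the standard zipfile module can read it.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return any(name.startswith(prefix) for name in jar.namelist())


if __name__ == "__main__":
    # Usage: python check_jar.py path/to/some.jar
    # The prefix is an assumption about where the xgboost4j classes would sit.
    bundled = jar_has_prefix(sys.argv[1], "ml/dmlc/xgboost4j/")
    print("dependency classes bundled" if bundled else "dependency classes not found")
```

If the check finds no such entries in the jar under inst/java, the dependency is presumably resolved separately at connection time, and bumping the version in dependencies.R would not by itself require a rebuild. If it does find them, the jar would need to be recompiled.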