Application to GradientBoostingTree class
Hi Ando,
Thanks for this wonderful package, makes my life a lot easier!
The treeinterpreter does not seem to work for the GradientBoostingRegressor class. The bug seems to be that this class does not expose the n_outputs_ attribute, which your code checks to ensure the model has a univariate output.
This might be a quick fix. Would it be possible to do this?
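For illustration, here is a minimal sketch of the kind of workaround I mean (the helper name and the fallback-to-1 behavior are my own assumptions, not treeinterpreter's actual API):

```python
def get_n_outputs(model):
    """Hypothetical helper: use n_outputs_ when the estimator has it,
    and fall back to a single output for estimators (such as
    GradientBoostingRegressor) that do not expose the attribute."""
    return getattr(model, "n_outputs_", 1)


class FakeForest:
    # Stand-in for an estimator that exposes the attribute.
    n_outputs_ = 1


class FakeGBT:
    # Stand-in for an estimator that lacks it.
    pass
```

With this, `get_n_outputs(FakeForest())` and `get_n_outputs(FakeGBT())` both return 1, so the univariate-output check would pass for GBT models too.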
I used the below code to test it.
Thanks, Roel
----code----
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from treeinterpreter import treeinterpreter as ti

X, y = make_friedman1(n_samples=1200, random_state=0, noise=1.0)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]
gbt = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                max_depth=1, random_state=0,
                                loss='ls').fit(X_train, y_train)
mean_squared_error(y_test, gbt.predict(X_test))
X.shape

# Single instances must be passed as 2D arrays to predict()
instances = X[300:309, :]
print("Instance 0 prediction:", gbt.predict(instances[[0]]))
print("Instance 1 prediction:", gbt.predict(instances[[1]]))

# This is the call that fails:
prediction, bias, contributions = ti.predict(gbt, instances)
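For context on what I expect back from that call: treeinterpreter's decomposition should satisfy prediction = bias + sum of per-feature contributions. A quick sanity check of that identity with stand-in arrays (the numbers below are made up for illustration, not real model output):

```python
import numpy as np

# Stand-in arrays with the same shapes ti.predict returns for a regressor:
# prediction and bias are (n_samples,), contributions is (n_samples, n_features).
prediction = np.array([2.5, 3.0])
bias = np.array([2.0, 2.0])
contributions = np.array([[0.3, 0.2],
                          [0.5, 0.5]])

# Each prediction should equal the bias plus its feature contributions.
assert np.allclose(prediction, bias + contributions.sum(axis=1))
```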
Hi Ando!
Would you still be interested in a scikit GBT version of your package? I needed one, so I adapted your code, and it now runs for random forests as well as GBT, here: https://github.com/marcbllv/treeinterpreter
I also changed one thing in _predict_forest: predictions, biases and contributions are now preallocated (line 110) and accumulated in place, instead of computing all per-tree results first and then averaging. That keeps memory use more reasonable.
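To show the idea, here is a sketch of the memory-saving pattern I mean (this is my own illustration, not the actual code in _predict_forest):

```python
import numpy as np

def average_contributions_streaming(per_tree_results, n_samples, n_features):
    """Average per-tree (prediction, bias, contribution) triples by
    accumulating into preallocated arrays, so only one tree's results
    are held alongside the running totals at any time."""
    predictions = np.zeros(n_samples)
    biases = np.zeros(n_samples)
    contributions = np.zeros((n_samples, n_features))
    n_trees = 0
    for pred, bias, contrib in per_tree_results:
        predictions += pred
        biases += bias
        contributions += contrib
        n_trees += 1
    return (predictions / n_trees,
            biases / n_trees,
            contributions / n_trees)
```

Compared with stacking every tree's contribution matrix and calling a mean at the end, the peak memory here no longer grows with the number of trees.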
And like roeldobbe said, thank you for this really nice package! Cheers, Marc