Application to GradientBoostingTree class
Hi Ando,
Thanks for this wonderful package, makes my life a lot easier!
The treeinterpreter does not seem to work for the GradientBoostingRegressor class. The bug seems to be that this class does not expose the n_outputs_ attribute, which your code checks to ensure the model has a univariate output.
This might be a quick fix. Would it be possible to do this?
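For illustration, here is a minimal sketch of the kind of workaround I mean (the helper name and the fallback-to-1 behavior are my own assumptions, not treeinterpreter's actual API):

```python
def get_n_outputs(model):
    """Hypothetical helper: use n_outputs_ when the estimator has it,
    and fall back to a single output for estimators (such as
    GradientBoostingRegressor) that do not expose the attribute."""
    return getattr(model, "n_outputs_", 1)


class FakeForest:
    # Stand-in for an estimator that exposes the attribute.
    n_outputs_ = 1


class FakeGBT:
    # Stand-in for an estimator that lacks it.
    pass
```

With this, `get_n_outputs(FakeForest())` and `get_n_outputs(FakeGBT())` both return 1, so the univariate-output check would pass for GBT models too.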
I used the below code to test it.
Thanks, Roel
----code----
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from treeinterpreter import treeinterpreter as ti

X, y = make_friedman1(n_samples=1200, random_state=0, noise=1.0)
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]
gbt = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1,
                                max_depth=1, random_state=0,
                                loss='ls').fit(X_train, y_train)
mean_squared_error(y_test, gbt.predict(X_test))
X.shape

# Single instances must be passed as 2D arrays to predict()
instances = X[300:309, :]
print("Instance 0 prediction:", gbt.predict(instances[[0]]))
print("Instance 1 prediction:", gbt.predict(instances[[1]]))

# This is the call that fails:
prediction, bias, contributions = ti.predict(gbt, instances)
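For context on what I expect back from that call: treeinterpreter's decomposition should satisfy prediction = bias + sum of per-feature contributions. A quick sanity check of that identity with stand-in arrays (the numbers below are made up for illustration, not real model output):

```python
import numpy as np

# Stand-in arrays with the same shapes ti.predict returns for a regressor:
# prediction and bias are (n_samples,), contributions is (n_samples, n_features).
prediction = np.array([2.5, 3.0])
bias = np.array([2.0, 2.0])
contributions = np.array([[0.3, 0.2],
                          [0.5, 0.5]])

# Each prediction should equal the bias plus its feature contributions.
assert np.allclose(prediction, bias + contributions.sum(axis=1))
```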
Hi Ando!
Would you still be interested in a scikit GBT version of your package? I needed one, so I adapted your code, and it now runs for random forests as well as GBT, here: https://github.com/marcbllv/treeinterpreter
I also changed one thing in _predict_forest: predictions, biases and contributions are now preallocated (line 110) and accumulated in place, instead of computing all per-tree results first and then averaging. That keeps memory use more reasonable.
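To show the idea, here is a sketch of the memory-saving pattern I mean (this is my own illustration, not the actual code in _predict_forest):

```python
import numpy as np

def average_contributions_streaming(per_tree_results, n_samples, n_features):
    """Average per-tree (prediction, bias, contribution) triples by
    accumulating into preallocated arrays, so only one tree's results
    are held alongside the running totals at any time."""
    predictions = np.zeros(n_samples)
    biases = np.zeros(n_samples)
    contributions = np.zeros((n_samples, n_features))
    n_trees = 0
    for pred, bias, contrib in per_tree_results:
        predictions += pred
        biases += bias
        contributions += contrib
        n_trees += 1
    return (predictions / n_trees,
            biases / n_trees,
            contributions / n_trees)
```

Compared with stacking every tree's contribution matrix and calling a mean at the end, the peak memory here no longer grows with the number of trees.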
And like roeldobbe said, thank you for this really nice package! Cheers, Marc