Support feature importance / variable selection

Open JakeColtman opened this issue 6 years ago • 1 comment

In many real-world use cases, it's important to be able to identify the truly important features.

Implementing some of the approaches of https://repository.upenn.edu/cgi/viewcontent.cgi?article=1555&context=statistics_papers seems like a good start.

A side constraint is that the solution should scale to large datasets, which might pose a problem for the permutation approach. It may be useful to have two different modes: a fully principled one, and a rough-and-ready one for large datasets.

JakeColtman, Oct 13 '18 15:10

Given the claims in the paper, it would be interesting for the solution to be general enough to apply to implementations of models like random forests (RF) in other libraries.
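
To make the generality point concrete, here is a rough sketch of what a model-agnostic permutation test could look like. It assumes one reading of the permutation approach: shuffle the response, recompute importances, and keep a feature only if its observed importance beats a high quantile of its own null scores. None of this is bartpy API. `permutation_null_selection`, `abs_correlation_importance`, and all parameter names are hypothetical, and the absolute-correlation score is only a stand-in for whatever a fitted model (BART inclusion proportions, RF importances from another library) would report.

```python
import random
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]
# Hypothetical interface: anything that maps (X, y) to one score per feature.
ImportanceFn = Callable[[Matrix, Sequence[float]], List[float]]


def permutation_null_selection(
    X: Matrix,
    y: Sequence[float],
    importance_fn: ImportanceFn,
    n_permutations: int = 50,
    quantile: float = 0.95,
    seed: int = 0,
) -> List[bool]:
    """Flag features whose observed importance exceeds their permutation null."""
    rng = random.Random(seed)
    observed = importance_fn(X, y)
    null_scores: List[List[float]] = [[] for _ in observed]

    for _ in range(n_permutations):
        # Shuffling y breaks any real link between features and response.
        y_perm = list(y)
        rng.shuffle(y_perm)
        for j, score in enumerate(importance_fn(X, y_perm)):
            null_scores[j].append(score)

    selected = []
    for j, obs in enumerate(observed):
        ranked = sorted(null_scores[j])
        cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
        selected.append(obs > cutoff)
    return selected


def abs_correlation_importance(X: Matrix, y: Sequence[float]) -> List[float]:
    """Stand-in importance: absolute Pearson correlation of each column with y."""
    n = len(y)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_x = sum(col) / n
        var_x = sum((v - mean_x) ** 2 for v in col)
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(col, y))
        denom = (var_x * var_y) ** 0.5
        scores.append(abs(cov) / denom if denom > 0 else 0.0)
    return scores


# Toy data: only the first of three features drives the response.
_rng = random.Random(1)
X = [[_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * row[0] + _rng.gauss(0, 1) for row in X]
selected = permutation_null_selection(X, y, abs_correlation_importance)
```

Because the model only enters through `importance_fn`, the same loop could wrap a function that fits any model and returns its importances. In this sketch the cost is one call to `importance_fn` per permutation, so a rough-and-ready mode could simply lower `n_permutations` or pass in a row subsample of `X` and `y`.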

JakeColtman, Oct 13 '18 15:10