
Decision Function on each node

BeatrizLafuenteAlcazar opened this issue on Jan 31, 2024 • 4 comments

Hello, I'm training an Oblique Decision Tree with 5 features as input and a max_depth of 4. When I print model.tree_.threshold, I get something like this:

    array([-0.14181135,  0.09716574,  0.12667369, -0.96893096, -2.        ,
           -2.        ,  0.78663591, -2.        , -2.        , -1.14851594,
           -2.        ,  0.32079886, -2.        , -2.        , -0.69028786,
           -0.52701822,  0.01895713, -2.        , -2.        ,  0.02781246,
           -2.        , -2.        ,  0.35172133, -0.07838123, -2.        ,
           -2.        ,  0.59768865, -2.        , -2.        ])
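
A minimal sketch of a setup along these lines, with purely illustrative synthetic data and an assumed sktree.tree import path (the -2 entries are the sentinel value stored for leaf nodes, which have no split):

    # Minimal, illustrative sketch: 5 input features, max_depth of 4.
    import numpy as np
    from sktree.tree import ObliqueDecisionTreeClassifier  # assumed import path

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))        # 5 input features
    y = (X[:, 0] + X[:, 2] > 0).astype(int)  # simple synthetic target

    model = ObliqueDecisionTreeClassifier(max_depth=4, random_state=0)
    model.fit(X, y)

    # One entry per node; leaf nodes show the sentinel -2 instead of a threshold.
    print(model.tree_.threshold)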

I would like to know what linear combinations of features are being used at each node, and what weight is assigned to each feature. Is there a way I can find this out?

Thank you in advance

BeatrizLafuenteAlcazar, Jan 31, 2024

I think you are looking for the projection matrix and want to expose a Python API to access it?

There is already tree.tree_.get_projection_matrix(), but there is no public Python API for it. If you are interested in contributing a PR and a relevant unit test, you can add a method to ObliqueDecisionTreeClassifier/Regressor, PatchObliqueDecisionTreeClassifier/Regressor, and ExtraObliqueDecisionTreeClassifier/Regressor, and I can help review your PR.

possibly something like:

    from sklearn.utils.validation import check_is_fitted

    @property
    def projection_matrix_(self):
        # should only work if the estimator is fitted, otherwise error out
        check_is_fitted(self)
        # return the per-node projection matrix as an array
        return self.tree_.get_projection_matrix()

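In the meantime, a hedged sketch of how the existing accessor could be used and what the proposed property would wrap (model is assumed to be a fitted oblique tree estimator; the shape is inferred from the example later in this thread):

    # Hedged usage sketch: read the per-node projection matrix through the
    # existing Cython-level accessor mentioned above.
    proj = model.tree_.get_projection_matrix()
    print(proj.shape)  # expected to be (n_nodes, n_features)

    # The proposed property would simply wrap this call:
    # proj = model.projection_matrix_
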
adam2392, Jan 31, 2024

According to the mathematical formulation of oblique trees found on the project website, "let the data at node m be represented by Q_m with n_m samples. For each candidate split (a_i, t_m) consisting of a (possibly sparse) vector a_i and threshold t_m, partition the data into Q_m,left and Q_m,right subsets."
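
Written out, the partition presumably takes a form like the sketch below, following the standard scikit-learn decision-tree formulation with the oblique projection in place of a single feature (the direction of the inequality is whatever convention the implementation uses):

$$
Q_m^{\mathrm{left}}(a_i, t_m) = \{(x, y) \in Q_m \mid a_i^{\top} x \le t_m\}, \qquad
Q_m^{\mathrm{right}}(a_i, t_m) = Q_m \setminus Q_m^{\mathrm{left}}(a_i, t_m)
$$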

What I need is the vector a_i. The projection matrix that I get for my model is something like:
    array([[ 0., -1.,  1.,  0.,  0.],
           [ 0.,  0.,  1., -1.,  0.],
           [ 0.,  0.,  0.,  1.,  0.],
           [ 0.,  0.,  0., -1., -1.],
           [ 0.,  0.,  0.,  0.,  0.],
           [ 0.,  0.,  0.,  0.,  0.],
           ...
           [ 0.,  0.,  0.,  0.,  0.]])

I was expecting the vector a_i to have real-valued entries, not just values from {-1, 1}, corresponding to the weights assigned to each feature at each node.

So the way I am interpreting the matrix I now get is that, for instance, at the first node, feature_1 is subtracted from feature_2, and if that value is larger than the threshold for that node, the sample moves to the left child, and so on. In other words, the feature coefficients are always 1 or -1, and the features that are not used at a given node have a coefficient of 0. Is this the right way to interpret such a matrix?
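
For illustration, a small hedged sketch of this interpretation (X and model as in the earlier sketch; the routing direction follows whatever convention the library uses):

    import numpy as np

    # Each row of the projection matrix holds the coefficients a_i used at one node.
    proj = np.asarray(model.tree_.get_projection_matrix())  # (n_nodes, n_features)
    thresholds = model.tree_.threshold                       # one threshold per node

    x = X[0]          # a single sample with 5 features
    node = 0          # the first (root) node
    a_i = proj[node]  # e.g. [ 0., -1.,  1.,  0.,  0.]  -> feature_2 - feature_1
    value = a_i @ x   # the linear combination tested at this node

    # The sample is routed to the left or right child by comparing `value`
    # with thresholds[node] (see the discussion above for the direction).
    print(a_i, value, thresholds[node])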

BeatrizLafuenteAlcazar, Feb 1, 2024

Yep seems right

adam2392, Feb 1, 2024

Ok, that clears it up, thank you so much!

BeatrizLafuenteAlcazar, Feb 1, 2024