dtreeviz
Color keyword argument - Value error
I am working on a binary classification problem using LightGBM. The model was trained on 42 features. The training dataset has shape (78000, 42): 78000 observations across 42 features. The test dataset has shape (25220, 42).
Using dtreeviz on my trained model:
viz = dtreeviz.model(gm, tree_index = 0, X_train = X_train, y_train=Y_train, feature_names = features, target_name="A", class_names = ["A", "B"])
When I execute viz.view(), I get the following error:
ValueError: The 'color' keyword argument must have one color per dataset, but 1 datasets and 0 colors were provided
Any thoughts on how to go about this?
Is it similar to https://github.com/parrt/dtreeviz/issues/280?
I am facing the same error.
Here is the image which contains some details:
@tlapusan Yes, the error description is the same as the one posted by @baligoyem
Have you ever faced the AttributeError with the description 'Rectangle' object has no attribute 'patches'?
I am asking because I have hit both errors at random, and I believe they are related.
Did you try the latest version of dtreeviz?
Yes, I did, but it did not resolve the issue.
Using colour-0.1.5 and dtreeviz-2.2.1 .. No luck at all
Could you provide a Google Colab or any other shareable notebook so I can reproduce your issue?
@tlapusan Sorry for responding this late. Unfortunately, I can't share the notebook, as the data and features used are confidential :(
+1
+1
+1
+1
In my case (dtreeviz=2.2.2), it seems to be a precision problem in the get_thresholds method. If you have small float thresholds, samples are assigned to the wrong paths. In some cases, some nodes may end up with no samples.
class ShadowLightGBMTree(ShadowDecTree):
    ...
    def get_thresholds(self) -> np.ndarray:
        if self.thresholds is not None:
            return self.thresholds

        node_thresholds = [-1] * self.nnodes()
        for i in range(self.nnodes()):
            if self.children_left[i] != -1 and self.children_right[i] != -1:
                if self.is_categorical_split(i):
                    node_thresholds[i] = list(map(int, self.tree_nodes[i]["threshold"].split("||")))
                else:
                    ### thresholds are ROUNDED!
                    node_thresholds[i] = round(self.tree_nodes[i]["threshold"], 2)

        self.thresholds = np.array(node_thresholds, dtype=object)
        return self.thresholds
A node with no samples -> no color mapped for it -> this error.
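To make the rounding effect concrete, here is a minimal sketch in plain Python. It does not call dtreeviz or LightGBM; the threshold and sample values are made up for illustration, and `goes_left` is a hypothetical helper that mimics a numerical split sending a sample left when its value is less than or equal to the threshold.

```python
def goes_left(value, threshold):
    # Hypothetical helper: a numerical split sends the sample to the
    # left child when value <= threshold.
    return value <= threshold


# Made-up example: a small threshold, as the model might have learned it.
true_threshold = 0.004
# The same threshold after round(..., 2), as in get_thresholds above.
rounded_threshold = round(true_threshold, 2)

# Made-up feature values for the samples reaching this node.
samples = [0.001, 0.002, 0.003, 0.005]

left_true = [v for v in samples if goes_left(v, true_threshold)]
left_rounded = [v for v in samples if goes_left(v, rounded_threshold)]

print("rounded threshold:", rounded_threshold)
print("left child with true threshold:", left_true)
print("left child with rounded threshold:", left_rounded)
```

With the true threshold, three samples go left; with the rounded threshold, none do, so the left child is empty. That is the kind of node that would end up with nothing to plot.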
+1 on 2.2.2, any workarounds?
I dealt with this issue. You should ensure that the data you use to train the LGBM model is the same as the data for visualization.
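One way to check this before calling dtreeviz is to compare the feature names the model was trained on with the columns passed for visualization. The sketch below is only an illustration: `find_feature_mismatch` is a hypothetical helper, and the feature names are invented. With LightGBM's scikit-learn wrapper, the trained names should be available as `gm.feature_name_` (or `gm.feature_name()` on a Booster), but verify that against your installed version.

```python
def find_feature_mismatch(trained_names, viz_names):
    """Hypothetical helper: report differences between two feature-name lists."""
    trained, viz = list(trained_names), list(viz_names)
    return {
        # Features the model was trained on that the visualization data lacks.
        "missing_in_viz": [n for n in trained if n not in viz],
        # Columns in the visualization data the model never saw.
        "unexpected_in_viz": [n for n in viz if n not in trained],
        # Same set of names, but in a different column order.
        "order_differs": sorted(trained) == sorted(viz) and trained != viz,
    }


# Invented names for illustration; in practice, take them from the trained
# model and from the DataFrame passed as X_train to dtreeviz.model(...).
trained_names = ["age", "income", "score"]
viz_names = ["age", "score", "income"]

report = find_feature_mismatch(trained_names, viz_names)
print(report)
```

This only compares names and order. It does not confirm that the rows or preprocessing are identical, so it is a first check rather than a guarantee.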