Terence Parr

Results 554 comments of Terence Parr

There are a couple of changes such as in interpretation.py that might affect other targets. Have you run through and checked the other kinds of trees and run tests?

Thanks, @tlapusan !

Label-encoding categorical variables for decision trees or random forests is the easiest and most correct solution. That means converting every category level to a unique integer so it sounds like...

Can you use the LabelEncoder()? Or do it manually via [https://github.com/parrt/stratx/blob/master/notebooks/support.py](https://github.com/parrt/stratx/blob/master/notebooks/support.py) ``` def df_string_to_cat(df:pd.DataFrame) -> dict: catencoders = {} for colname in df.columns: if is_string_dtype(df[colname]) or is_object_dtype(df[colname]): df[colname] = df[colname].astype('category').cat.as_ordered()...

ooh! you can't use ``` df[["feature_1", "feature_2"]], ``` gotta use encoded feature 1

I do it all the time. have you looked at the examples?

@tlapusan did we break something? Can you take a look and help @pplonski ?

Hang on. you're not talking about the target. ok, let me look.

@tlapusan ha! We *don't* have an example where the tree nodes are cat vars! We should think about this. Nonetheless, you gotta pass in encoded vars to the classifier. We...

Hi @chenhajaj Cats are allowed but it shows their unique cat code at moment. So, we just need a way to indicate cat label but what if there are 10,000...