Choosing which probability column to drop in StackingClassifier and StackingCVClassifier

Open rasbt opened this issue 6 years ago • 0 comments

Currently, we have a "drop_last_proba" parameter, which, if set to True, drops the last "probability" column from the meta-level feature set because it is redundant: p(y_c) = 1 - (p(y_1) + p(y_2) + ... + p(y_{c-1})). This can be useful for meta-classifiers that are sensitive to perfectly collinear features.
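To illustrate the redundancy, here is a minimal NumPy sketch. The probability values are made up for illustration and do not come from any real classifier; the point is only that the last column can be rebuilt from the others, and that keeping it makes the feature matrix rank deficient once an intercept column is present.

```python
import numpy as np

# Made-up class probabilities from one base classifier
# (4 samples, 3 classes); each row sums to 1.
proba = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.60, 0.30],
    [0.30, 0.30, 0.40],
    [0.50, 0.25, 0.25],
])

# The last column is fully determined by the remaining ones:
# p(y_c) = 1 - (p(y_1) + ... + p(y_{c-1}))
reconstructed_last = 1.0 - proba[:, :-1].sum(axis=1)
last_is_redundant = np.allclose(reconstructed_last, proba[:, -1])

# With an intercept column (as in a linear meta-classifier), the full
# probability block is perfectly collinear with the intercept.
with_intercept = np.hstack([np.ones((proba.shape[0], 1)), proba])
rank_full = np.linalg.matrix_rank(with_intercept)

# Dropping the last probability column restores full column rank.
dropped = with_intercept[:, :-1]
rank_dropped = np.linalg.matrix_rank(dropped)

print(last_is_redundant, with_intercept.shape[1], rank_full,
      dropped.shape[1], rank_dropped)
```

Dropping the first column instead of the last would remove the same collinearity, since any single class column can be recovered from the others.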

As mentioned by @bmreiniger in #527, it might be useful to be able to choose whether the first or the last column is dropped. E.g., this parameter could be renamed to:

  • drop_proba_column

with allowed arguments {None, 'last', 'first'}
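A rough sketch of how the proposed semantics could look, written as a standalone function. The helper below is hypothetical and is not part of mlxtend; its name, signature, and error handling are assumptions made for illustration. It operates on the probability matrix of a single base classifier, before the per-classifier blocks are concatenated into meta-features.

```python
import numpy as np


def drop_proba_column(proba, drop=None):
    """Hypothetical helper (not mlxtend API): drop one probability column.

    proba : array of shape (n_samples, n_classes)
    drop  : None, 'last', or 'first'
    """
    if drop is None:
        return proba            # keep all columns
    if drop == 'last':
        return proba[:, :-1]    # mirrors the current drop_last_proba=True
    if drop == 'first':
        return proba[:, 1:]
    raise ValueError("drop must be one of None, 'last', 'first'; got %r" % (drop,))


# Made-up probabilities from two base classifiers (2 samples, 3 classes).
p1 = np.array([[0.7, 0.2, 0.1],
               [0.1, 0.6, 0.3]])
p2 = np.array([[0.5, 0.3, 0.2],
               [0.2, 0.2, 0.6]])

# Drop the first column of each block, then stack the blocks side by side.
meta_features = np.hstack([drop_proba_column(p, drop='first') for p in (p1, p2)])
print(meta_features.shape)
```

Applying the drop per classifier, rather than once on the concatenated matrix, matters here: each block sums to 1 on its own, so each block contributes one redundant column.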

rasbt, Sep 20 '19 15:09