mlxtend sklearn preprocessing techniques

Open israel-cj opened this issue 2 years ago • 1 comment

Dear,

I hope you are doing great! Thank you for the work done in mlxtend. I have the following problem: let's say I have a list of pipelines called 'get_pipelines', where each pipeline contains preprocessing steps such as ColumnTransformer, SimpleImputer, etc. Each pipeline works on its own when I fit/predict on my dataset. However, when I stack them, the preprocessing steps do not seem to be applied: I get errors saying my data should be converted from categorical to numerical, etc., even though each pipeline already does that. Is there a way to make the StackingClassifier use this preprocessing? My code looks like this (THANK YOU):

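from mlxtend.classifier import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Create a list of base models; get_pipelines is the list of
# preprocessing + estimator pipelines described above
base_models = [make_pipeline(model) for model in get_pipelines]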

# Create the meta-model
meta_model = LogisticRegression()

# Create the stacked ensemble
stacked_ensemble = StackingClassifier(
    classifiers=base_models,
    meta_classifier=meta_model,
    use_probas=True,
    average_probas=False
)

# Train the stacked ensemble on the training data
stacked_ensemble.fit(X, y)

Sep 25 '23 07:09 israel-cj

Hi there,

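You could try fitting each pipeline yourself first and then passing fit_base_estimators=False, so that the StackingClassifier uses the already-fitted pipelines as they are instead of refitting them, i.e.,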

stacked_ensemble = StackingClassifier(
    classifiers=base_models,
    meta_classifier=meta_model,
    use_probas=True,
    average_probas=False,
    fit_base_estimators=False
)

Sep 25 '23 13:09 rasbt