evalml
AutoMLSearchException: All pipelines in the current AutoML batch produced a score of np.nan on the primary objective
I only changed problem_type from "binary" to "multiclass".
* Beginning pipeline search *
Optimizing for Log Loss Multiclass. Lower score is better.
Using SequentialEngine to train and score pipelines. Searching up to 3 batches for a total of None pipelines. Allowed model families:
Evaluating Baseline Pipeline: Mode Baseline Multiclass Classification Pipeline
Mode Baseline Multiclass Classification Pipeline fold 0: Encountered an error.
Mode Baseline Multiclass Classification Pipeline fold 0: All scores will be replaced with nan.
Fold 0: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 0: Parameters:
{'Label Encoder': {'positive_label': None}, 'Baseline Classifier': {'strategy': 'mode'}}
Fold 0: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Mode Baseline Multiclass Classification Pipeline fold 1: Encountered an error.
Mode Baseline Multiclass Classification Pipeline fold 1: All scores will be replaced with nan.
Fold 1: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 1: Parameters:
{'Label Encoder': {'positive_label': None}, 'Baseline Classifier': {'strategy': 'mode'}}
Fold 1: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Mode Baseline Multiclass Classification Pipeline fold 2: Encountered an error.
Mode Baseline Multiclass Classification Pipeline fold 2: All scores will be replaced with nan.
Fold 2: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 2: Parameters:
{'Label Encoder': {'positive_label': None}, 'Baseline Classifier': {'strategy': 'mode'}}
Fold 2: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Mode Baseline Multiclass Classification Pipeline:
Starting cross validation
Finished cross validation - mean Log Loss Multiclass: nan
* Evaluating Batch Number 1 *
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler fold 0: Encountered an error.
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler fold 0: All scores will be replaced with nan.
Fold 0: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 0: Parameters:
{'Label Encoder': {'positive_label': None}, 'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'boolean_impute_strategy': 'most_frequent', 'categorical_fill_value': None, 'numeric_fill_value': None, 'boolean_fill_value': None}, 'One Hot Encoder': {'top_n': 10, 'features_to_encode': None, 'categories': None, 'drop': 'if_binary', 'handle_unknown': 'ignore', 'handle_missing': 'error'}, 'Logistic Regression Classifier': {'penalty': 'l2', 'C': 1.0, 'n_jobs': -1, 'multi_class': 'auto', 'solver': 'lbfgs'}}
Fold 0: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler fold 1: Encountered an error.
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler fold 1: All scores will be replaced with nan.
Fold 1: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 1: Parameters:
{'Label Encoder': {'positive_label': None}, 'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'boolean_impute_strategy': 'most_frequent', 'categorical_fill_value': None, 'numeric_fill_value': None, 'boolean_fill_value': None}, 'One Hot Encoder': {'top_n': 10, 'features_to_encode': None, 'categories': None, 'drop': 'if_binary', 'handle_unknown': 'ignore', 'handle_missing': 'error'}, 'Logistic Regression Classifier': {'penalty': 'l2', 'C': 1.0, 'n_jobs': -1, 'multi_class': 'auto', 'solver': 'lbfgs'}}
Fold 1: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler fold 2: Encountered an error.
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler fold 2: All scores will be replaced with nan.
Fold 2: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 2: Parameters:
{'Label Encoder': {'positive_label': None}, 'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'boolean_impute_strategy': 'most_frequent', 'categorical_fill_value': None, 'numeric_fill_value': None, 'boolean_fill_value': None}, 'One Hot Encoder': {'top_n': 10, 'features_to_encode': None, 'categories': None, 'drop': 'if_binary', 'handle_unknown': 'ignore', 'handle_missing': 'error'}, 'Logistic Regression Classifier': {'penalty': 'l2', 'C': 1.0, 'n_jobs': -1, 'multi_class': 'auto', 'solver': 'lbfgs'}}
Fold 2: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Logistic Regression Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder + Standard Scaler:
Starting cross validation
Finished cross validation - mean Log Loss Multiclass: nan
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder fold 0: Encountered an error.
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder fold 0: All scores will be replaced with nan.
Fold 0: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 0: Parameters:
{'Label Encoder': {'positive_label': None}, 'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'boolean_impute_strategy': 'most_frequent', 'categorical_fill_value': None, 'numeric_fill_value': None, 'boolean_fill_value': None}, 'One Hot Encoder': {'top_n': 10, 'features_to_encode': None, 'categories': None, 'drop': 'if_binary', 'handle_unknown': 'ignore', 'handle_missing': 'error'}, 'Random Forest Classifier': {'n_estimators': 100, 'max_depth': 6, 'n_jobs': -1}}
Fold 0: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder fold 1: Encountered an error.
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder fold 1: All scores will be replaced with nan.
Fold 1: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 1: Parameters:
{'Label Encoder': {'positive_label': None}, 'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'boolean_impute_strategy': 'most_frequent', 'categorical_fill_value': None, 'numeric_fill_value': None, 'boolean_fill_value': None}, 'One Hot Encoder': {'top_n': 10, 'features_to_encode': None, 'categories': None, 'drop': 'if_binary', 'handle_unknown': 'ignore', 'handle_missing': 'error'}, 'Random Forest Classifier': {'n_estimators': 100, 'max_depth': 6, 'n_jobs': -1}}
Fold 1: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder fold 2: Encountered an error.
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder fold 2: All scores will be replaced with nan.
Fold 2: Exception during automl search: Multiclass pipelines require y to have 3 or more unique classes!
Fold 2: Parameters:
{'Label Encoder': {'positive_label': None}, 'Imputer': {'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'boolean_impute_strategy': 'most_frequent', 'categorical_fill_value': None, 'numeric_fill_value': None, 'boolean_fill_value': None}, 'One Hot Encoder': {'top_n': 10, 'features_to_encode': None, 'categories': None, 'drop': 'if_binary', 'handle_unknown': 'ignore', 'handle_missing': 'error'}, 'Random Forest Classifier': {'n_estimators': 100, 'max_depth': 6, 'n_jobs': -1}}
Fold 2: Traceback:
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 238, in _train_and_score fitted_pipeline, hashes = train_pipeline(
File "D:\conda\envs\gradio\lib\site-packages\evalml\automl\engine\engine_base.py", line 176, in train_pipeline cv_pipeline.fit(X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\utils\base_meta.py", line 19, in _set_fit return_value = method(self, X, y)
File "D:\conda\envs\gradio\lib\site-packages\evalml\pipelines\classification_pipeline.py", line 66, in fit raise ValueError(
Random Forest Classifier w/ Label Encoder + Replace Nullable Types Transformer + Imputer + One Hot Encoder:
Starting cross validation
Finished cross validation - mean Log Loss Multiclass: nan
AutoMLSearchException Traceback (most recent call last)
Cell In [20], line 1
----> 1 automl.search(interactive_plot=False)

File D:\conda\envs\gradio\lib\site-packages\evalml\automl\automl_search.py:1159, in AutoMLSearch.search(self, interactive_plot)
   1152 if (
   1153     len(current_batch_pipeline_scores)
   1154     and current_batch_pipeline_scores.isna().all()
   1155 ):
   1156     error_msgs = set(
   1157         [str(pl_fold["Exception"]) for pl_fold in self.errors.values()],
   1158     )
-> 1159     raise AutoMLSearchException(
   1160         f"All pipelines in the current AutoML batch produced a score of np.nan on the primary objective {self.objective}. Exception(s) raised: {error_msgs}. Check the 'errors' attribute of the AutoMLSearch object for a full breakdown of errors and tracebacks.",
   1161     )
   1162 if len(pipeline_times) > 0:
   1163     pipeline_times["Total time of batch"] = time_elapsed(start_batch_time)
AutoMLSearchException: All pipelines in the current AutoML batch produced a score of np.nan on the primary objective <evalml.objectives.standard_metrics.LogLossMulticlass object at 0x00000274FC966D00>. Exception(s) raised: {'Multiclass pipelines require y to have 3 or more unique classes!'}. Check the 'errors' attribute of the AutoMLSearch object for a full breakdown of errors and tracebacks.
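The exception message points at the 'errors' attribute for a full breakdown. A minimal sketch of collapsing that breakdown into its distinct messages follows. It assumes only what the automl_search.py excerpt in the traceback shows: that the values of the errors mapping are dicts with an "Exception" entry. The function name, the dictionary keys, and the stand-in data are hypothetical.

```python
def summarize_errors(errors):
    """Return the distinct exception messages found in an errors mapping.

    Assumes every value is a dict with an "Exception" entry, which is how
    the automl_search.py excerpt in the traceback reads the mapping.
    """
    return sorted({str(entry["Exception"]) for entry in errors.values()})


# Stand-in data shaped like that excerpt; the keys are made up for illustration.
message = "Multiclass pipelines require y to have 3 or more unique classes!"
example_errors = {
    "baseline fold 0": {"Exception": ValueError(message)},
    "baseline fold 1": {"Exception": ValueError(message)},
    "random forest fold 0": {"Exception": ValueError(message)},
}

for line in summarize_errors(example_errors):
    print(line)
```

In a real session the argument would be automl.errors rather than the stand-in dictionary. Here every fold fails with the same message, so the summary has one entry.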
# Your code here
y_train.dtypes
CategoricalDtype(categories=['<=5%', '>5%'], ordered=False)
automl = AutoMLSearch(
    X_train=X_train,
    y_train=y_train,
    problem_type="multiclass",
    verbose=True,
)
automl.search(interactive_plot=False)
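The log says multiclass pipelines need 3 or more unique classes, while the target printed above has two categories ('<=5%' and '>5%'). A minimal sketch of a guard that counts the distinct labels before choosing problem_type follows. The helper names are hypothetical and are not part of evalml. The rule "2 labels is binary, 3 or more is multiclass" is taken from the error message, not from the library's documentation.

```python
import math


def count_classes(y):
    """Count distinct labels in an iterable of targets, skipping None and NaN."""
    labels = set()
    for value in y:
        if value is None:
            continue
        if isinstance(value, float) and math.isnan(value):
            continue
        labels.add(value)
    return len(labels)


def pick_problem_type(y):
    """Return "binary" for 2 distinct labels and "multiclass" for 3 or more."""
    n_classes = count_classes(y)
    if n_classes < 2:
        raise ValueError(f"Need at least 2 classes, found {n_classes}")
    return "binary" if n_classes == 2 else "multiclass"


# Toy target with the two categories shown in the issue.
print(pick_problem_type(["<=5%", ">5%", "<=5%", ">5%"]))
```

Passing y_train to pick_problem_type and using the result as problem_type would keep this target on the binary path.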
I have encountered a similar problem. Have you solved it?
@glacierck Do you have some data for us to repro this?
I ran into a similar issue with the attached dataset (covid_flu.csv), and whether it occurs depends on the random_state used for the train-test split. On my system the search only works with random_state=42. On Google Colab it did not work at all.
from evalml import AutoMLSearch
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
data_path = 'data/covid_flu.csv'
target_column = 'Diagnosis'
data = pd.read_csv(data_path).dropna(subset=target_column)
problem_type = 'binary'
y = data[target_column]
X = data.drop(columns=target_column)
# Changing the random state to 42 will not give an error on my system
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1, stratify=y)
# Impute numeric columns
num_imputer = SimpleImputer().set_output(transform='pandas')
num_cols = X_train.select_dtypes(include='number').columns
if len(num_cols) > 0:
    X_train[num_cols] = num_imputer.fit_transform(X_train[num_cols])
    X_test[num_cols] = num_imputer.transform(X_test[num_cols])
# Impute categorical columns
cat_cols = X_train.select_dtypes(exclude='number').columns
if len(cat_cols) > 0:
    imp_mostfrequent = SimpleImputer(strategy='most_frequent')
    print(cat_cols)
    X_train[cat_cols] = imp_mostfrequent.fit_transform(X_train[cat_cols])
    X_test[cat_cols] = imp_mostfrequent.transform(X_test[cat_cols])
automl = AutoMLSearch(
    X_train=X_train,
    y_train=y_train,
    X_holdout=X_test,
    y_holdout=y_test,
    problem_type=problem_type,
    objective='auto',
    additional_objectives='f1',
    allowed_model_families=["extra_trees", "linear_model", "random_forest", "lightgbm"],
    max_batches=3,
    automl_algorithm="default",
    ensembling=True,
    verbose=False,
)
)
automl.search(interactive_plot=False)
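Since the failure depends on the split seed, one thing worth ruling out is a split that leaves some class out of the training or holdout labels. The sketch below is only a diagnostic and does not claim to be the cause of the error in this thread. The helper names and the toy labels are hypothetical.

```python
from collections import Counter


def class_counts_by_split(**splits):
    """Map each named split to a Counter of its labels."""
    return {name: Counter(labels) for name, labels in splits.items()}


def missing_classes(counts):
    """For each split, list the labels seen in other splits but absent from it."""
    all_labels = set().union(*counts.values())
    return {name: sorted(all_labels - set(counter)) for name, counter in counts.items()}


# Toy labels, not taken from covid_flu.csv.
counts = class_counts_by_split(
    train=["a", "a", "b", "a"],
    holdout=["a", "a"],
)
print(counts)
print(missing_classes(counts))
```

With the script above, the call would be class_counts_by_split(train=y_train, holdout=y_test), repeated for a seed that works and a seed that fails, to see whether the label distributions differ.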
Thanks for your help!