[Bug]: Invalid fitness after objective evaluation. Skipping the graph: (/n_scaling;)/n_rf_{'n_jobs':32}

Open DRMPN opened this issue 11 months ago • 1 comment

Expected Behavior

The method calculates the ROC AUC score for a target column of type bool in a tabular data classification problem.

Current Behavior

The get_metrics() method fails even though the model was fitted successfully. [image]

The problem appears to be in the preprocessing of the tabular target: surprisingly, the target ends up as an empty array. [image]

This is likely just the tip of the iceberg, since the `num_classes` method is already called with an empty array: [image]
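To make the symptom concrete, here is a minimal sketch of why an empty target breaks class counting. The helper `count_classes` is hypothetical and is not FEDOT's `num_classes`; it only assumes that classes are counted as the unique values of the target, and it fails early with a readable message instead of passing an empty array downstream.

```python
import numpy as np


def count_classes(target):
    """Count distinct labels in a target array (hypothetical helper, not FEDOT code)."""
    arr = np.asarray(target)
    if arr.size == 0:
        # An empty target has zero classes, so no classification metric can be computed.
        raise ValueError("target is empty: check how the target column was preprocessed")
    return len(np.unique(arr))


# A bool target is an ordinary binary target with two classes.
n_classes = count_classes([True, False, True, True])
print(n_classes)
```

A guard of this kind would not fix the underlying problem, but it might surface it closer to the place where the target is lost.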

Possible Solution

  • Debug the preprocessing code to find the place where the bool target is not transformed as expected and ends up as an empty array.
  • Make appropriate changes to the code and check that the proposed solution works correctly.
  • Prepare unit tests.

Steps to Reproduce

  1. Download the data from https://www.kaggle.com/competitions/spaceship-titanic
  2. Create and run a Jupyter Notebook with the following snippet (file paths are omitted for simplicity):
import pandas as pd
from fedot.api.main import Fedot

# Load the Kaggle train and test splits
train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")

# Classification with ROC AUC as the target metric
model = Fedot(problem="classification", metric="roc_auc", preset="best_quality")
best_pipeline = model.fit(features=train, target="Transported")
prediction = model.predict(features=test)
model.plot_prediction()
model.get_metrics()  # fails here

Context [OPTIONAL]

The target column is loaded as a bool type. [image]
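One way to check whether the bool dtype matters at all is to rerun the experiment with the target cast to integers. The sketch below is an untested diagnostic, not a confirmed fix: the helper `cast_bool_target` is hypothetical, and the small frame is made-up data standing in for train.csv.

```python
import pandas as pd


def cast_bool_target(df, target_col):
    """Return a copy of df with a bool target column cast to 0/1 integers."""
    out = df.copy()
    if pd.api.types.is_bool_dtype(out[target_col]):
        out[target_col] = out[target_col].astype(int)
    return out


# Toy stand-in for train.csv (made-up rows, same target column name).
train = pd.DataFrame({
    "Age": [39.0, 24.0, 58.0],
    "Transported": [False, True, False],
})

train_int = cast_bool_target(train, "Transported")
print(train["Transported"].dtype, "->", train_int["Transported"].dtype)
```

If the failure persists with the integer target, the dtype can probably be ruled out and the cause lies elsewhere.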

Documentation References:

  • https://fedot.readthedocs.io/en/latest/introduction/fedot_features/main_features.html
  • https://fedot.readthedocs.io/en/latest/advanced/data_preprocessing.html

DRMPN, Mar 13 '24 10:03

Related problem: https://github.com/aimclub/FEDOT/pull/1274

nicl-nno, Mar 13 '24 10:03