[BUG] TabularPredictor fails with fit_extra for regression
- [x] I have checked that this bug exists on the latest stable version of AutoGluon
- [x] and/or I have checked that this bug exists on the latest mainline of AutoGluon via source installation
**Describe the bug**
TabularPredictor with `fit_extra` for regression fails with `TypeError: '>' not supported between instances of 'float' and 'NoneType'`. I've included the complete script to reproduce the issue and the complete log. The failing call is `predictor.fit_extra(custom_hyperparameters, time_limit=120)`.

To mitigate this, I tried adding the code below to abstract_trainer.py and it worked, but I'm not sure what the right fix is. The idea: if any of the models were not initialized (due to time_limit or other reasons), we cannot compare them; only when best_score and cur_score are both initialized and valid do we compare them, otherwise we skip the current model. This error message seems to have occurred previously for the high_quality preset and there was a fix for it 1 or 2 years back, but those changes are completely different.
```diff
diff --git a/core/src/autogluon/core/trainer/abstract_trainer.py b/core/src/autogluon/core/trainer/abstract_trainer.py
index f8d55745..38576557 100644
--- a/core/src/autogluon/core/trainer/abstract_trainer.py
+++ b/core/src/autogluon/core/trainer/abstract_trainer.py
@@ -1070,6 +1070,10 @@ class AbstractTrainer:
             else:
                 best_score = self.get_model_attribute(self.model_best, 'val_score')
                 cur_score = self.get_model_attribute(weighted_ensemble_model_name, 'val_score')
+                if best_score is None:
+                    continue
+                if cur_score is None:
+                    continue
                 if cur_score > best_score:
                     # new best model
                     self.model_best = weighted_ensemble_model_name
```
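For context, the failure reduces to comparing a float against `None`. My reading (an assumption, based on the `best_score None` / `cur_score -28.80...` lines in the log below) is that after the high_quality refit, the best model is a `_FULL` model with no recorded val_score:

```python
# Standalone illustration of the crash, using the values printed in the log below.
best_score = None                 # val_score of self.model_best (assumption: a _FULL
                                  # refit model, which skips validation scoring)
cur_score = -28.800071683030545   # val_score of the new WeightedEnsemble

cur_score > best_score  # TypeError: '>' not supported between 'float' and 'NoneType'
```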
**Expected behavior**
fit_extra shouldn't crash.
**To Reproduce**

```python
from random import uniform

from autogluon.tabular import TabularDataset, TabularPredictor
from autogluon.tabular.configs.hyperparameter_configs import get_hyperparameter_config

train_data = TabularDataset('https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv')  # can be a local CSV file as well; returns a pandas DataFrame
label = 'class'

# Remove the classification label (DataFrame.drop returns a copy, so assign it back)
train_data = train_data.drop(columns=['class'])
# Add regression labels: floats in [0, 100] with 2 decimal places
train_data['class'] = [round(uniform(0, 100), 2) for _ in range(len(train_data))]

save_path = 'agModels11'  # specifies the folder to store trained models
custom_hyperparameters = get_hyperparameter_config('default')

# Works fine
predictor = TabularPredictor(label=label, path=save_path).fit(
    train_data, presets='high_quality', time_limit=110, hyperparameters=custom_hyperparameters)

# Load the saved model for further training
predictor = TabularPredictor.load(save_path)

# Fails with:
# ... (long stack trace)
#   File "/home/buggluon/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 525, in stack_new_level_aux
#     return self.generate_weighted_ensemble(X=X_stack_preds, y=y,
#   File "/home/buggluon/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 1073, in generate_weighted_ensemble
#     if cur_score > best_score:
# TypeError: '>' not supported between instances of 'float' and 'NoneType'
predictor.fit_extra(custom_hyperparameters, time_limit=120)
```
**Installed Versions**
Output of `autogluon.core.utils.show_versions()`:

```
INSTALLED VERSIONS
------------------
date : 2022-07-28
time : 13:41:38.419504
python : 3.8.10.final.0
OS : Linux
OS-release : 5.4.0-88-generic
Version : #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021
machine : x86_64
processor : x86_64
num_cores : 4
cpu_ram_mb : 7632
cuda version : None
num_gpus : 0
gpu_ram_mb : []
avail_disk_size_mb : 24959
autogluon.common : 0.5.2b20220726
autogluon.core : 0.5.2b20220726
autogluon.features : 0.5.2b20220726
autogluon.multimodal : 0.5.2b20220726
autogluon.tabular : 0.5.2b20220726
autogluon.text : 0.5.2b20220726
autogluon.timeseries : None
autogluon.vision : 0.5.2b20220726
autogluon_contrib_nlp : None
boto3 : 1.24.37
catboost : 1.0.6
dask : 2021.11.2
distributed : 2021.11.2
fairscale : 0.4.7
fastai : 2.7.7
gluoncv : 0.11.0
hyperopt : 0.2.7
lightgbm : 3.3.2
matplotlib : 3.1.2
networkx : 2.4
nlpaug : 1.1.10
nltk : 3.7
nptyping : 1.4.4
numpy : 1.21.6
omegaconf : 2.1.2
pandas : 1.4.3
PIL : 9.0.1
protobuf : None
psutil : 5.8.0
pytorch-metric-learning: None
pytorch_lightning : 1.6.5
ray : 1.13.0
requests : 2.22.0
scipy : 1.7.3
sentencepiece : None
skimage : 0.19.3
sklearn : 1.0.2
smart_open : 5.2.1
timm : 0.5.4
torch : 1.12.0+cu113
torchmetrics : 0.7.3
torchtext : 0.13.0
torchvision : 0.13.0+cu113
tqdm : 4.64.0
transformers : 4.20.1
xgboost : 1.4.2
```
**Additional context**
Complete log message:

```
Presets specified: ['high_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 110s
AutoGluon will save models to "agModels11/"
AutoGluon Version: 0.5.2b20220726
Python Version: 3.8.10
Operating System: Linux
Train Data Rows: 39073
Train Data Columns: 14
Label Column: class
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == float and many unique label-values observed).
Label info (max, min, mean, stddev): (100.0, 0.0, 49.95621, 28.80265)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 3104.94 MB
Train Data (Original) Memory Usage: 22.92 MB (0.7% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('object', []) : 8 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 7 | ['workclass', 'education', 'marital-status', 'occupation', 'relationship', ...]
('int', []) : 6 | ['age', 'fnlwgt', 'education-num', 'capital-gain', 'capital-loss', ...]
('int', ['bool']) : 1 | ['sex']
0.8s = Fit runtime
14 features in original data used to generate 14 features in processed data.
Train Data (Processed) Memory Usage: 2.19 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 0.86s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 72.73s of the 109.13s of remaining time.
-31.5505 = Validation score (-root_mean_squared_error)
0.27s = Training runtime
1.55s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 70.71s of the 107.1s of remaining time.
-32.6513 = Validation score (-root_mean_squared_error)
0.6s = Training runtime
1.69s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 68.17s of the 104.56s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-28.8001 = Validation score (-root_mean_squared_error)
28.94s = Training runtime
0.43s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 14.26s of the 50.65s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-28.8028 = Validation score (-root_mean_squared_error)
37.15s = Training runtime
0.53s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 109.12s of the -2.92s of remaining time.
-28.8001 = Validation score (-root_mean_squared_error)
4.74s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Completed 1/20 k-fold bagging repeats ...
No base models to train on, skipping auxiliary stack level 3...
AutoGluon training complete, total runtime = 118.14s ... Best model: "WeightedEnsemble_L2"
Fitting model: KNeighborsUnif_BAG_L1_FULL | Skipping fit via cloning parent ...
0.27s = Training runtime
1.55s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1_FULL | Skipping fit via cloning parent ...
0.6s = Training runtime
1.69s = Validation runtime
Fitting 1 L1 models ...
Fitting model: LightGBMXT_BAG_L1_FULL ...
5.18s = Training runtime
Fitting 1 L1 models ...
Fitting model: LightGBM_BAG_L1_FULL ...
1.6s = Training runtime
Fitting model: WeightedEnsemble_L2_FULL | Skipping fit via cloning parent ...
4.74s = Training runtime
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("agModels11/")
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_2_BAG_L1 ... Training model for up to 120.0s of the 119.99s of remaining time.
-31.5505 = Validation score (-root_mean_squared_error)
0.74s = Training runtime
1.82s = Validation runtime
Fitting model: KNeighborsDist_2_BAG_L1 ... Training model for up to 117.17s of the 117.17s of remaining time.
-32.6513 = Validation score (-root_mean_squared_error)
0.69s = Training runtime
1.64s = Validation runtime
Fitting model: LightGBMXT_2_BAG_L1 ... Training model for up to 114.61s of the 114.6s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-28.8001 = Validation score (-root_mean_squared_error)
49.26s = Training runtime
0.75s = Validation runtime
Fitting model: LightGBM_2_BAG_L1 ... Training model for up to 43.33s of the 43.32s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-28.8028 = Validation score (-root_mean_squared_error)
47.47s = Training runtime
0.6s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_2_L2 ... Training model for up to 120.0s of the -27.98s of remaining time.
-28.8001 = Validation score (-root_mean_squared_error)
2.28s = Training runtime
0.0s = Validation runtime
best_score None
cur_score -28.800071683030545
Traceback (most recent call last):
File "/home/buggluon/autogluon/examples/tabular/nojupiter_predict.py", line 31, in <module>
predictor.fit_extra(custom_hyperparameters,time_limit=120)
File "/home/buggluon/autogluon/tabular/src/autogluon/tabular/predictor/predictor.py", line 1092, in fit_extra
fit_models = self._trainer.train_multi_levels(
File "/home/buggluon/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 295, in train_multi_levels
base_model_names, aux_models = self.stack_new_level(
File "/home/buggluon/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 409, in stack_new_level
aux_models = self.stack_new_level_aux(X=X, y=y, base_model_names=core_models, level=level+1,
File "/home/buggluon/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 525, in stack_new_level_aux
return self.generate_weighted_ensemble(X=X_stack_preds, y=y,
File "/home/buggluon/autogluon/core/src/autogluon/core/trainer/abstract_trainer.py", line 1075, in generate_weighted_ensemble
if cur_score > best_score:
TypeError: '>' not supported between instances of 'float' and 'NoneType'
```
Thanks for reporting! This is indeed a bug. You can bypass it by not using high_quality / refit_full when calling fit and fit_extra: use best_quality instead, and when you are done calling fit_extra, call predictor.refit_full() on the final result to get the same outcome as if the bug didn't exist.

Will plan to fix this in the next release.
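A minimal sketch of that workaround, reusing the names from the reproduction script above (the time limits are just the original values):

```python
# Workaround sketch: avoid high_quality (which refits during fit);
# refit only once, after all fit_extra calls are done.
predictor = TabularPredictor(label=label, path=save_path).fit(
    train_data,
    presets='best_quality',                 # no refit_full during fit
    time_limit=110,
    hyperparameters=custom_hyperparameters,
)
predictor.fit_extra(custom_hyperparameters, time_limit=120)
predictor.refit_full()                      # refit at the end for faster inference
```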
I found that predictor.refit_full() decreases accuracy. I do a fit with time_limit, then fit_extra with time_limit, and then a refit_full. Is this expected, or am I doing something wrong?
Can you save the state of AutoGluon after the time limit and resume training? Say, for example, I want to train for 48 hours but can't run my training for 48 hours continuously. Is there a way, after 24 hours of training on Friday, to pause the training, save it, and then resume it on Monday?

If my default hyperparameters and time_limit are the same for both fit and fit_extra, then there is duplication of models, which wastes training time.
After fit:

model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
KNeighborsDist_BAG_L1 | 1.000000 | -0.091291 | 0.398728 | 0.367611 | 0.056612 | 0.398728 | 0.367611 | 0.056612 | 1 | True | 2
RandomForestMSE_BAG_L1 | 0.975580 | 0.824212 | 3.178744 | 2.205708 | 44.445011 | 3.178744 | 2.205708 | 44.445011 | 1 | True | 5
ExtraTreesMSE_BAG_L1 | 0.973658 | 0.826616 | 3.296700 | 2.009215 | 18.024629 | 3.296700 | 2.009215 | 18.024629 | 1 | True | 7
Results after fit_extra; I have duplicate models:
model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
KNeighborsDist_BAG_L1 | 1.000000 | -0.091291 | 0.445786 | 0.367611 | 0.056612 | 0.445786 | 0.367611 | 0.056612 | 1 | True | 2
KNeighborsDist_2_BAG_L1 | 1.000000 | -0.091291 | 0.448205 | 0.372300 | 0.058094 | 0.448205 | 0.372300 | 0.058094 | 1 | True | 24
RandomForestMSE_2_BAG_L1 | 0.975580 | 0.824212 | 3.970681 | 1.952887 | 42.525009 | 3.970681 | 1.952887 | 42.525009 | 1 | True | 27
RandomForestMSE_BAG_L1 | 0.975580 | 0.824212 | 5.436502 | 2.205708 | 44.445011 | 5.436502 | 2.205708 | 44.445011 | 1 | True | 5
ExtraTreesMSE_BAG_L1 | 0.973658 | 0.826616 | 3.205769 | 2.009215 | 18.024629 | 3.205769 | 2.009215 | 18.024629 | 1 | True | 7
ExtraTreesMSE_2_BAG_L1 | 0.973658 | 0.826616 | 3.509352 | 2.066019 | 16.863691 | 3.509352 | 2.066019 | 16.863691 | 1 | True | 29
> I found that predictor.refit_full() decreases accuracy. I do a fit with time_limit, then fit_extra with time_limit, and then a refit_full. Is this expected, or am I doing something wrong?
Yes, refit_full reduces accuracy but speeds up inference.

https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-quickstart.html#presets

Best = best accuracy. High = same as Best, but refit for lower accuracy and faster inference.
> Can you save the state of AutoGluon after the time limit and resume training? Say, for example, I want to train for 48 hours but can't run my training for 48 hours continuously. Is there a way, after 24 hours of training on Friday, to pause the training, save it, and then resume it on Monday? If my default hyperparameters and time_limit are the same for both fit and fit_extra, then there is duplication of models, which wastes training time.
You cannot resume training via the same config you initially used. You will need to investigate which models were trained and adjust your hyperparameters in the fit_extra call. This is not easy to automate for a variety of reasons, especially with a time limit involved: AutoGluon dynamically changes its strategy at multiple stages of training based on the time limit, and it is hard to recover that state. It may be added eventually, but isn't there yet.
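A rough sketch of that manual approach (the model-type keys here are illustrative only; which ones to include depends on what predictor.leaderboard() shows is missing):

```python
# Inspect which models were already trained by the first fit call.
trained_models = predictor.get_model_names()
print(trained_models)  # e.g. ['KNeighborsDist_BAG_L1', 'RandomForestMSE_BAG_L1', ...]

# Hand fit_extra a hyperparameters dict that only contains model types not
# already covered, so near-duplicate models are not trained again.
extra_hyperparameters = {
    'GBM': {},  # illustrative: pick types absent from the leaderboard
    'CAT': {},
}
predictor.fit_extra(extra_hyperparameters, time_limit=120)
```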
@navinp0304 This bug has been fixed in mainline, and will be available in the upcoming v0.6 release. Thanks again for reporting!
> You cannot resume training via the same config you initially used. You will need to investigate which models were trained and adjust your hyperparameters in the fit_extra call. This is not easy to automate for a variety of reasons, especially with a time limit involved: AutoGluon dynamically changes its strategy at multiple stages of training based on the time limit, and it is hard to recover that state. It may be added eventually, but isn't there yet.
Hello @Innixma, are there any guidelines or instructions to help set the hyperparameters correctly in fit_extra? For example, I excluded NN_TORCH during fit, and the program crashed after it finished training the L1 models. How should I continue my L2 training? Thank you very much.