XGBoost Crash - anything to be done?

Open DGaffney opened this issue 10 months ago • 0 comments

I ran a test on a dataset today, and at about 34% completion (353 of 1050 pipelines, roughly 5.5 hours in) the process hard-crashed with "corrupted size vs. prev_size" and a core dump. Any tips on how to avoid this sort of thing? Worst case, should I just increase verbosity so I can see the best models as they print out?

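import numpy as np
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

# X and y (features and target) are loaded elsewhere
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)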

xgb_config = {
    'xgboost.XGBRegressor': {
        'n_estimators': range(200, 2500, 200),  # 200 to 2400 trees in steps of 200 (range excludes the stop value)
        'learning_rate': np.linspace(0.005, 0.2, 10),  # Learning rate from 0.005 to 0.2
        'max_depth': range(3, 12, 1),  # Depth from 3 to 11 (range excludes 12)
        'min_child_weight': range(1, 20, 2),  # Minimum sum of weights in a leaf
        'subsample': np.linspace(0.5, 1.0, 6),  # Subsampling between 50% and 100%
        'colsample_bytree': np.linspace(0.5, 1.0, 6),  # Column sampling
        'gamma': np.linspace(0, 0.5, 6),  # Gamma for pruning
        'reg_alpha': np.logspace(-4, 1, 6),  # L1 regularization
        'reg_lambda': np.logspace(-4, 1, 6),  # L2 regularization
        'device': ['cuda'],
        'tree_method': ['hist'],
        'n_jobs': [10]  # 10 threads per XGBoost model (a fixed count, not necessarily all CPU cores)
    }
}

# Initialize TPOT with expanded XGBoost hyperparameter search
tpot = TPOTRegressor(
    generations=20,  # Number of generations (increase for better tuning)
    population_size=50,  # Population size (larger values improve exploration)
    verbosity=2,
    config_dict=xgb_config,  # Use only XGBoost with hyperparameter tuning
    n_jobs=2
)

tpot.fit(X_train[:5000], y_train[:5000])
>>> tpot.fit(X_train[:5000], y_train[:5000])
is_classifier
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1230: FutureWarning: passing a class to None is deprecated and will be removed in 1.8. Use an instance of the class instead.
  warnings.warn(

is_regressor
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:1270: FutureWarning: passing a class to None is deprecated and will be removed in 1.8. Use an instance of the class instead.
  warnings.warn(
Optimization Progress:   0% | 0/1050 [00:00<?, ?pipeline/s]
Optimization Progress:   6% | 61/1050 [58:18<17:23:28, 63.30s/pipeline]
Optimization Progress:   7% | 69/1050 [1:04:36<15:35:33, 57.22s/pipeline]
Generation 1 - Current best internal CV score: -0.2588038376471536
Generation 2 - Current best internal CV score: -0.2588038376471536
Generation 3 - Current best internal CV score: -0.2583334021699287
Generation 4 - Current best internal CV score: -0.2583334021699287
Generation 5 - Current best internal CV score: -0.25807251507673135
Generation 6 - Current best internal CV score: -0.25589519236631997
Optimization Progress:  34% | 353/1050 [5:37:53<11:20:12, 58.55s/pipeline]
Exception ignored on calling ctypes callback function: <bound method DataIter._next_wrapper of <xgboost.data.SingleBatchInternalIter object at 0x76285c795a50>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/xgboost/core.py", line 582, in _next_wrapper
    def _next_wrapper(self, this: None) -> int:  # pylint: disable=unused-argument
stopit.utils.TimeoutException: 
corrupted size vs. prev_size
Aborted (core dumped)
root@d80360eab95d:/workspace/api# 
/usr/local/lib/python3.10/dist-packages/joblib/externals/loky/backend/resource_tracker.py:314: UserWarning: resource_tracker: There appear to be 8 leaked semlock objects to clean up at shutdown
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/joblib/externals/loky/backend/resource_tracker.py:314: UserWarning: resource_tracker: There appear to be 44 leaked folder objects to clean up at shutdown
  warnings.warn(

root@d80360eab95d:/workspace/api# 

DGaffney • Feb 16 '25 00:02