xgboost-survival-embeddings
lifelines.exceptions.ConvergenceError
When running an `XGBSEStackedWeibull` model, I get a `lifelines.exceptions.ConvergenceError` with the following message:
lifelines.exceptions.ConvergenceError: Fitting did not converge. Try the following:
0. Are there any lifelines warnings outputted during the `fit`?
1. Inspect your DataFrame: does everything look as expected?
2. Try scaling your duration vector down, i.e. `df[duration_col] = df[duration_col]/100`
3. Is there high-collinearity in the dataset? Try using the variance inflation factor (VIF) to find redundant variables.
4. Try using an alternate minimizer: ``fitter._scipy_fit_method = "SLSQP"``.
5. Trying adding a small penalizer (or changing it, if already present). Example: `WeibullAFTFitter(penalizer=0.01).fit(...)`.
6. Are there any extreme outliers? Try modeling them or dropping them to see if it helps convergence.
Given the pipeline nature of `XGBSEStackedWeibull`, are there recommended steps for getting past the convergence error? That is, do the lifelines recommendations still hold, or are there other methods I should try?
You can follow the lifelines recommendations, except for item 3 ("Is there high-collinearity in the dataset? Try using the variance inflation factor (VIF) to find redundant variables."). It shouldn't make a difference, since the only feature used to fit the lifelines WeibullAFT model is the hazard predicted by XGBoost.
I've tried a few of the fixes listed above. Interestingly, when I use fix 2 (scaling the duration vector down), I get vastly accelerated survival curves. Is there a second step I need to run to un-scale the predicted survival curves? Or are the predicted curves scale-invariant?
Should the time-bins used by XGBSE also be scaled?
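For what it's worth, here is a minimal sketch of the scaling question for a plain Weibull model. It is not XGBSE's internal code, and the parameter values (`lam`, `rho`, `scale`) and time grid are made up for illustration. It relies on the Weibull survival function S(t) = exp(-(t / lam) ** rho): dividing every duration by a constant divides the fitted scale parameter `lam` by the same constant and leaves the shape `rho` unchanged, so the curve is only "the same" when it is evaluated on a time grid in the same units. If the model is fit on scaled durations but queried at time bins in the original units, the curve drops much faster, which may be what the accelerated curves above are showing. A penalizer could disturb this exact equivalence, so treat it as an approximation in that case.

```python
import math


def weibull_survival(t, lam, rho):
    """Weibull survival function S(t) = exp(-(t / lam) ** rho)."""
    return math.exp(-((t / lam) ** rho))


# Hypothetical parameters on the original time scale (e.g. days).
lam, rho = 400.0, 1.5
scale = 100.0  # durations were divided by this factor before fitting

# Rescaling the durations rescales lam by the same factor; rho is unchanged.
lam_scaled = lam / scale

original_times = [30.0, 90.0, 180.0, 365.0]
scaled_times = [t / scale for t in original_times]

# Same curve, as long as the time grid uses the same units as the fit.
s_original = [weibull_survival(t, lam, rho) for t in original_times]
s_scaled = [weibull_survival(t, lam_scaled, rho) for t in scaled_times]

# Mixed units: scaled model queried at unscaled times decays far too fast.
s_mismatched = [weibull_survival(t, lam_scaled, rho) for t in original_times]

for t, a, b, c in zip(original_times, s_original, s_scaled, s_mismatched):
    print(f"t={t:6.1f}  original={a:.4f}  scaled={b:.4f}  mismatched={c:.2e}")
```

Under these assumptions, the practical answer would be to pass time bins divided by the same factor as the durations, then multiply the time axis of the predicted curve by that factor to read it in the original units.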