Histogram-based Gradient Boosting survival models
It would be great to have histogram-based gradient boosting models in addition to the regular ones, as they are much more scalable. They are already supported by scikit-learn:
- https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.HistGradientBoostingRegressor.html#sklearn.ensemble.HistGradientBoostingRegressor
They also support missing values natively, which AFAICT only RandomSurvivalForest can do in sksurv.
Adding such models seems possible by subclassing BaseHistGradientBoosting
https://github.com/scikit-learn/scikit-learn/blob/70fdc843a4b8182d97a3508c1a426acc5e87e980/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py#L140
and passing a new implementation of BaseLoss as the `loss` argument
https://github.com/scikit-learn/scikit-learn/blob/70fdc843a4b8182d97a3508c1a426acc5e87e980/sklearn/_loss/loss.py#L67
BaseLoss ultimately calls a subclass of CyLossFunction
https://github.com/scikit-learn/scikit-learn/blob/70fdc843a4b8182d97a3508c1a426acc5e87e980/sklearn/_loss/_loss.pxd#L24
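For reference, a survival loss plugged in this way would mainly have to supply a per-sample gradient and hessian with respect to the raw predictions. Below is a rough NumPy sketch of those quantities for the negative Cox partial log-likelihood (Breslow handling of ties, diagonal hessian only). Choosing the Cox loss is my assumption here, and `cox_gradient_hessian` is a hypothetical helper, not part of scikit-learn or sksurv; it is O(n²) and only meant to show the math, not how it would be wired into BaseLoss or CyLossFunction.

```python
import numpy as np


def cox_gradient_hessian(raw_prediction, time, event):
    """Gradient and diagonal hessian of the negative Cox partial
    log-likelihood with respect to the raw predictions (hypothetical helper)."""
    f = np.asarray(raw_prediction, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    # Shift by the maximum for numerical stability; only ratios are used below.
    exp_f = np.exp(f - f.max())

    # at_risk[i, j] is 1 when sample j is still at risk at time[i].
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    risk_sum = at_risk @ exp_f  # sum of exp(f_j) over the risk set of sample i

    # Only observed events contribute terms to the partial likelihood.
    inv = np.where(event, 1.0 / risk_sum, 0.0)
    inv_sq = np.where(event, 1.0 / risk_sum**2, 0.0)

    # For sample k, accumulate over all events i with time[i] <= time[k].
    a = at_risk.T @ inv
    b = at_risk.T @ inv_sq

    gradient = exp_f * a - event
    hessian = exp_f * a - exp_f**2 * b
    return gradient, hessian


# Tiny usage example with made-up data.
time = np.array([5.0, 3.0, 8.0, 1.0])
event = np.array([True, False, True, True])
gradient, hessian = cox_gradient_hessian(np.zeros(4), time, event)
```

One caveat: unlike the pointwise losses currently in `sklearn/_loss`, each sample's gradient here depends on the other samples in its risk set, so an actual implementation may need more than a drop-in CyLossFunction subclass.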
The original PR that added histogram-based gradient boosting to scikit-learn is https://github.com/scikit-learn/scikit-learn/pull/12807