Christian Lorentzen

329 comments by Christian Lorentzen

I would prefer not to implement Huber loss for HGBT, mainly because it is not clear what a model minimizing the Huber loss actually estimates/predicts: something between the mean and the median. Maybe,...
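For context, the standard Huber loss on a residual $r$ with threshold $\delta$ is quadratic for small residuals and linear for large ones, which is why its minimizer sits between the mean (pure squared loss) and the median (pure absolute loss):

```math
L_\delta(r) =
\begin{cases}
  \tfrac{1}{2} r^2 & \text{if } |r| \le \delta, \\
  \delta \left( |r| - \tfrac{\delta}{2} \right) & \text{otherwise.}
\end{cases}
```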

Can you detect the outliers? Have you tried `HistGradientBoostingRegressor(loss="quantile", quantile=0.5)` to estimate the median instead of the mean? For this kind of discussion, training time is only of secondary interest...
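A minimal sketch of that suggestion (scikit-learn 1.1+), with a synthetic heavy-tailed dataset standing in for the real one; the data and its outlier pattern are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

# Synthetic data with heavy-tailed noise, standing in for a dataset with outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 5))
y = X[:, 0] + 0.1 * rng.standard_t(df=2, size=1_000)

# quantile=0.5 targets the conditional median, which is far less
# sensitive to outliers than the default squared error.
median_model = HistGradientBoostingRegressor(loss="quantile", quantile=0.5)
median_model.fit(X, y)

# For comparison: the default loss targets the conditional mean.
mean_model = HistGradientBoostingRegressor(loss="squared_error")
mean_model.fit(X, y)
```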

Could you provide some information about your dataset(s)? How many samples (rows)? How many features (columns)? When you say L1 and L2 evaluation metrics are often better with a model...

> reasons against it

It adds an additional step of estimating the Huber loss parameter `delta` at each iteration (see the sketch below). What is the advantage of HGBT over GBT?

- Use of...
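To make that extra step concrete, here is a rough sketch assuming `delta` is re-estimated each boosting iteration as a quantile of the absolute residuals, as `GradientBoostingRegressor(loss="huber")` does via its `alpha` parameter; the helper function itself is hypothetical, not scikit-learn API:

```python
import numpy as np

def huber_negative_gradient(y_true, raw_prediction, alpha=0.9):
    """Hypothetical per-iteration Huber gradient with re-estimated delta."""
    residual = y_true - raw_prediction
    # Extra estimation step: delta = alpha-quantile of |residuals|.
    delta = np.quantile(np.abs(residual), alpha)
    # Inside [-delta, delta] the loss is quadratic (gradient = residual);
    # outside it is linear, so the gradient is clipped at +/- delta.
    return np.clip(residual, -delta, delta)
```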

>> When you say L1 and L2 evaluation metrics are often better with a model minimizing Huber loss, is it on a separate hold-out/test set or on the training set?...

I guess we could add Huber loss. As was already said, it would make HGBT more consistent with GBT. @scikit-learn/core-devs What do you think about inclusion? Still, I would personally...

> How big is the downside of not having hessians?

In HGBT, we already have the quantile/pinball loss, which is not smooth. Not having hessians means one needs an explicit line...
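For reference, the pinball loss at quantile level $\tau$ is piecewise linear, so its second derivative is zero almost everywhere and provides no informative hessian:

```math
\rho_\tau(u) =
\begin{cases}
  \tau \, u & \text{if } u \ge 0, \\
  (\tau - 1) \, u & \text{if } u < 0,
\end{cases}
\qquad u = y - \hat{y}.
```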

I did no dedicated benchmarking of timings. Usually, estimating quantiles takes longer, noticeably so for users. I am really critical of smoothed versions of losses. A loss function in (statistical) estimation...

@DeaMariaLeon Please note that `LogisticRegression` makes use of `random_state` only when `solver == 'sag', 'saga' or 'liblinear'` (to shuffle the data).
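A small illustration of that behavior on toy data (the dataset and parameter values are only for demonstration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

# Here random_state is used: "saga" visits samples in shuffled order.
clf_saga = LogisticRegression(solver="saga", random_state=42, max_iter=5000).fit(X, y)

# Here random_state is ignored: "lbfgs" is deterministic and does no shuffling.
clf_lbfgs = LogisticRegression(solver="lbfgs", random_state=42).fit(X, y)
```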