Christian Lorentzen

329 comments by Christian Lorentzen

I would prefer not to implement Huber loss for HGBT, mainly because it is not clear what a model minimizing the Huber loss actually estimates/predicts: something between the mean and the median. Maybe,...
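For context, the standard Huber loss on a residual $r$ with threshold $\delta$ is quadratic for small residuals and linear for large ones, which is why its minimizer sits between the mean (pure squared loss) and the median (pure absolute loss):

```math
L_\delta(r) =
\begin{cases}
  \tfrac{1}{2} r^2 & \text{if } |r| \le \delta, \\
  \delta \left( |r| - \tfrac{\delta}{2} \right) & \text{otherwise.}
\end{cases}
```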

Can you detect the outliers? Have you tried `HistGradientBoostingRegressor(loss="quantile", quantile=0.5)` to estimate the median instead of the mean? For this kind of discussion, training time is only of secondary interest...
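A minimal sketch of that suggestion (scikit-learn 1.1+), with a synthetic heavy-tailed dataset standing in for the real one; the data and its outlier pattern are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

# Synthetic data with heavy-tailed noise, standing in for a dataset with outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 5))
y = X[:, 0] + 0.1 * rng.standard_t(df=2, size=1_000)

# quantile=0.5 targets the conditional median, which is far less
# sensitive to outliers than the default squared error.
median_model = HistGradientBoostingRegressor(loss="quantile", quantile=0.5)
median_model.fit(X, y)

# For comparison: the default loss targets the conditional mean.
mean_model = HistGradientBoostingRegressor(loss="squared_error")
mean_model.fit(X, y)
```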

Could you provide some information about your dataset(s)? How many samples (rows)? How many features (columns)? When you say L1 and L2 evaluation metrics are often better with a model...

> reasons against it

It adds an additional step of estimating the Huber loss parameter `delta` at each iteration (see the sketch below). What is the advantage of HGBT over GBT?

- Use of...
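To make that extra step concrete, here is a rough sketch assuming `delta` is re-estimated each boosting iteration as a quantile of the absolute residuals, as `GradientBoostingRegressor(loss="huber")` does via its `alpha` parameter; the helper function itself is hypothetical, not scikit-learn API:

```python
import numpy as np

def huber_negative_gradient(y_true, raw_prediction, alpha=0.9):
    """Hypothetical per-iteration Huber gradient with re-estimated delta."""
    residual = y_true - raw_prediction
    # Extra estimation step: delta = alpha-quantile of |residuals|.
    delta = np.quantile(np.abs(residual), alpha)
    # Inside [-delta, delta] the loss is quadratic (gradient = residual);
    # outside it is linear, so the gradient is clipped at +/- delta.
    return np.clip(residual, -delta, delta)
```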

>> When you say L1 and L2 evaluation metrics are often better with a model minimizing Huber loss, is it on a separate hold-out/test set or on the training set?...

I guess we could add Huber loss. As was already said, it would make HGBT more consistent with GBT. @scikit-learn/core-devs What do you think about inclusion? Still, I would personally...

> How big is the downside of not having hessians?

In HGBT, we already have the quantile/pinball loss, which is not smooth. Not having hessians means one needs an explicit line...
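For reference, the pinball loss at quantile level $\tau$ is piecewise linear, so its second derivative is zero almost everywhere and provides no informative hessian:

```math
\rho_\tau(u) =
\begin{cases}
  \tau \, u & \text{if } u \ge 0, \\
  (\tau - 1) \, u & \text{if } u < 0,
\end{cases}
\qquad u = y - \hat{y}.
```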

I did no dedicated benchmarking of timings. Usually, estimating quantiles takes longer, noticeably so for users. I am really critical of smoothed versions of losses. A loss function in (statistical) estimation...

@DeaMariaLeon Please note that `LogisticRegression` makes use of `random_state` only when `solver == 'sag', 'saga' or 'liblinear'` (to shuffle the data).
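A small illustration of that behavior on toy data (the dataset and parameter values are only for demonstration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

# Here random_state is used: "saga" visits samples in shuffled order.
clf_saga = LogisticRegression(solver="saga", random_state=42, max_iter=5000).fit(X, y)

# Here random_state is ignored: "lbfgs" is deterministic and does no shuffling.
clf_lbfgs = LogisticRegression(solver="lbfgs", random_state=42).fit(X, y)
```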