Nicolas Hug

Results: 102 issues by Nicolas Hug

Instead of summing over all the bins of an arbitrary feature histogram to compute `context.sum_gradients` and `context.sum_hessians`, we can directly pass those values to `find_node_split_subtraction`, since the parent's `split_info`...

perf
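The subtraction idea above can be sketched as follows. The histogram dtype and function names here are illustrative stand-ins, not pygbm's actual API: the point is that a node's gradient/hessian totals can be obtained from its parent's totals by subtraction, instead of re-summing over all bins.

```python
import numpy as np

# Hypothetical per-bin histogram layout for one feature (illustrative,
# not pygbm's exact dtype).
HISTOGRAM_DTYPE = np.dtype([
    ('sum_gradients', np.float64),
    ('sum_hessians', np.float64),
    ('count', np.uint32),
])

def node_sums_by_scan(histogram):
    # Baseline: recompute the node totals by summing over all bins.
    return (histogram['sum_gradients'].sum(),
            histogram['sum_hessians'].sum())

def node_sums_from_parent(parent_sums, sibling_sums):
    # Subtraction shortcut: one child's totals are the parent's totals
    # minus the other child's, so no per-bin scan is needed.
    return (parent_sums[0] - sibling_sums[0],
            parent_sums[1] - sibling_sums[1])
```

Since the parent's totals are already known when its node was split, they can simply be carried along rather than recomputed.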

I think we can drop the `constant_hessian_value` in the `SplittingContext` and always assume the constant hessian value is `1`. We just have to rescale the gradients accordingly, to have...
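A minimal sketch of why assuming a unit hessian is safe, using the standard Newton leaf-value and split-gain formulas. With a constant hessian `c`, dividing the gradients and the L2 regularization by `c` leaves leaf values unchanged and scales every split gain by the same factor, so the chosen splits are identical. The names and the regularization rescaling are my assumptions for illustration, not pygbm's actual code:

```python
def leaf_value(sum_gradients, sum_hessians, l2_regularization):
    # Newton leaf value: -G / (H + lambda)
    return -sum_gradients / (sum_hessians + l2_regularization)

def split_gain(g_left, h_left, g_right, h_right, l2_regularization):
    # Standard split gain (up to constant factors):
    # G_L^2/(H_L+lambda) + G_R^2/(H_R+lambda) - G^2/(H+lambda)
    def term(g, h):
        return g * g / (h + l2_regularization)
    return (term(g_left, h_left) + term(g_right, h_right)
            - term(g_left + g_right, h_left + h_right))
```

With hessian `c` over `n` samples, `H = c * n`; replacing the gradients `g` by `g / c`, the hessians by `1`, and `lambda` by `lambda / c` gives the same leaf values and gains scaled by `1 / c`, so split ordering is preserved.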

A new grower (and a new `SplittingContext`) is created at each iteration, which may cause a memory usage peak on large datasets (#79). Instead of instantiating a new grower, we...

perf

https://github.com/numba/numba/issues/3554 was fixed, so we can remove our temporary fix from #51 once the next version is released. Places to fix (at the moment):

```
~/dev/pygbm » ag "array\[:0\] will" pygbm/splitting.py
250:...
```

Slightly related to #76. This is the second bullet point from https://github.com/ogrisel/pygbm/issues/69#issue-391170726. When early stopping (or just score monitoring) is done on the training data with the loss, we should...

perf
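A loss-based convergence check of the kind discussed here can be sketched as follows. The patience-based criterion and all names are assumptions for illustration, not pygbm's actual implementation:

```python
def should_stop(loss_history, patience=5, tol=1e-7):
    # Stop when none of the last `patience` losses improves on the best
    # loss observed before that window by more than `tol`.
    if len(loss_history) <= patience:
        return False
    best_before = min(loss_history[:-patience])
    return all(loss > best_before - tol
               for loss in loss_history[-patience:])
```

Monitoring the training loss this way is cheap when the boosting loop already maintains the raw predictions on the training set, since no extra prediction pass is needed.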

As mentioned in #75, it'd be nice to allow score monitoring (both scoring and loss values on train / validation data) regardless of early stopping.

enhancement

Opened https://github.com/numba/numba/issues/3588 to ask if there's an alternative.

perf

Results are comparable to LightGBM when `n_samples` `n_bins`. In particular, on this very easy dataset (`target = X[:, 0] > 0`), LightGBM finds a perfect threshold of `1e-35`, while that...

Following #214, there might be some numerical instability in the Pearson similarity computation, probably caused by the `sqrt` function receiving negative values, NaN, or infinity. I couldn't reproduce the issue...

help wanted
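One plausible guard for the `sqrt` issue described above is to clamp the variance terms at zero before taking the square root, and to short-circuit a zero denominator. The sum-based formulation below is a common incremental way to compute Pearson similarity, not necessarily the library's exact code:

```python
import math

def pearson_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    # Pearson similarity from accumulated sums (illustrative names).
    # var_x = n*sum_xx - sum_x**2 can come out slightly negative due to
    # floating-point cancellation, which would feed sqrt() a negative
    # value: clamp it at 0 and guard the zero-denominator case.
    var_x = max(0.0, n * sum_xx - sum_x * sum_x)
    var_y = max(0.0, n * sum_yy - sum_y * sum_y)
    denom = math.sqrt(var_x) * math.sqrt(var_y)
    if denom == 0.0:
        return 0.0
    return (n * sum_xy - sum_x * sum_y) / denom
```

The clamp only changes results in the cancellation-prone cases where `sqrt` would otherwise receive a tiny negative value (or produce NaN), which is exactly the suspected failure mode.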