Laurae

Results: 36 comments of Laurae

Newer LightGBM GPU results, not using all threads, restricted to 1 NUMA node and physical cores, and excluding histogram building time (negligible: 0.04x seconds for 0.1m, 0.1xx seconds for 1m, 1.180s...
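A minimal sketch of how such a run can be set up, assuming a `numactl` binding to one NUMA node and placeholder data (thread count, data shape, and round count are illustrative, not the exact benchmark settings); `Dataset.construct()` is timed on its own so the histogram/bin building can be excluded:

```python
# Launch pinned to NUMA node 0 and its memory, e.g.:
#   numactl --cpunodebind=0 --membind=0 python bench_lgb_gpu.py
import time
import numpy as np
import lightgbm as lgb

X = np.random.rand(1_000_000, 100).astype(np.float32)  # placeholder data
y = np.random.randint(0, 2, 1_000_000)

params = {
    "objective": "binary",
    "device_type": "gpu",   # OpenCL GPU build of LightGBM
    "num_threads": 18,      # physical cores of one socket (illustrative)
    "num_leaves": 255,
    "verbose": -1,
}

t0 = time.time()
dtrain = lgb.Dataset(X, label=y)
dtrain.construct()          # feature binning / histogram building, timed separately
t1 = time.time()
booster = lgb.train(params, dtrain, num_boost_round=100)
t2 = time.time()
print(f"binning: {t1 - t0:.3f}s, boosting: {t2 - t1:.3f}s")
```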

@trivialfis We are not passing any watchlist; only the gradient / hessian of the objective is computed.
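For context, a hedged sketch of what running without a watchlist looks like with the xgboost learning API (dataset and parameters are placeholders): with an empty `evals` list no evaluation metric is computed per round, so only the gradient and hessian of the objective are calculated during boosting.

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100_000, 100)   # placeholder data
y = np.random.randint(0, 2, 100_000)
dtrain = xgb.DMatrix(X, label=y)

params = {"objective": "binary:logistic", "tree_method": "hist", "max_depth": 10}

# No watchlist: evals=[] (also the default), so per-round metric evaluation is
# skipped and only the objective's gradient / hessian are computed.
booster = xgb.train(params, dtrain, num_boost_round=100, evals=[])
```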

CUDA LightGBM: some results of mine here: https://gist.github.com/Laurae2/7195cebe65887907a06e9118a3ec7f96 (**VERY experimental**). Using commit Microsoft/LightGBM@df37bce (25 Sept 2020). GPU usage increases as a lower number of GPUs is used (e.g. 80% for 1...
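As a rough sketch of the configuration being tested, assuming the parameter names of the experimental CUDA docs around that commit (treat them as assumptions rather than a stable API):

```python
# Hedged sketch: the experimental CUDA tree learner is selected via
# device_type="cuda" (distinct from the OpenCL-based device_type="gpu");
# num_gpu sets how many GPUs are used -- per the comment above, per-GPU
# utilization rises as fewer GPUs are used.
params = {
    "objective": "binary",
    "device_type": "cuda",
    "num_gpu": 1,
}
```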

@szilard `max_leaves = 1024` builds very deep trees. With 32 leaves it should already build deeper trees than `max_depth = 10` in most cases. `max_depth` was available before LightGBM GPU...
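A quick arithmetic sketch of the relationship: a perfectly balanced binary tree of depth `d` has `2^d` leaves, so 1024 leaves corresponds to a balanced depth of 10, while leaf-wise growth is free to go much deeper for the same leaf count.

```python
import math

# num_leaves bounds the leaf count, not the depth: a perfectly balanced tree of
# depth d has 2**d leaves, so depth >= ceil(log2(num_leaves)) is only the minimum;
# leaf-wise (best-first) growth can go far deeper for the same leaf budget.
for num_leaves in (32, 1024):
    min_depth = math.ceil(math.log2(num_leaves))
    print(f"{num_leaves} leaves -> at least depth {min_depth}, often much deeper leaf-wise")
```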

@szilard Try this: https://github.com/Microsoft/LightGBM/blob/master/docs/Key-Events.md

Hardware/Software: https://github.com/szilard/GBM-perf/issues/12

hist xgboost, 1 model:

| Size | Time (1T) | Time (9T) | Time (18T) | Time (36T) | Time (70T) |
| ---: | ---: | ---:...
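A minimal sketch of how such a thread-scaling run can be reproduced (data shape, depth, and round count are placeholders, not the benchmark's exact settings):

```python
import time
import numpy as np
import xgboost as xgb

X = np.random.rand(1_000_000, 100)   # placeholder data
y = np.random.randint(0, 2, 1_000_000)
dtrain = xgb.DMatrix(X, label=y)

for nthread in (1, 9, 18, 36, 70):
    params = {
        "objective": "binary:logistic",
        "tree_method": "hist",
        "max_depth": 10,
        "nthread": nthread,
    }
    t0 = time.time()
    xgb.train(params, dtrain, num_boost_round=100)
    print(f"{nthread:>2} threads: {time.time() - t0:.1f}s")
```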

Dumb "tuning" that abuses the knowledge that only 1 feature is relevant allows us to choose a more appropriate maximum depth of 2 instead of an absurd 10, which leads us out...
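A small hedged sketch of the point (xgboost-style parameters, values illustrative): if we know only one feature matters, a depth of 2 is enough to split on it and refine once, whereas depth 10 mostly adds noise splits.

```python
# "Dumb" tuning that exploits knowing only one feature is relevant:
params_default = {"objective": "binary:logistic", "tree_method": "hist", "max_depth": 10}
params_tuned = {**params_default, "max_depth": 2}  # shallow trees suffice here
```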

Assuming the sampling strategy is the same (and using probability models), it is possible to combine trained models into a caretList. The created indexes must be stored into a...

@ogrisel You can apply for a free Intel VTune license for profiling your code if you do research. It will be much better than the numba profiler.

@stuartarchibald You can use the numba profiler here: https://github.com/numba/data_profiler (in reality it just adds the signatures). It incurs an overhead penalty. It is still better to use Intel VTune for real profiling, though (way...