Guolin Ke
@MaxHalford Thanks very much. Actually, LightGBM is able to convert a tree model to C++ code. However, a tree in LightGBM is not a balanced tree, so it is not trivial...
Thank you @julioasotodv. Actually, the CUDA implementation in existing tools already integrates most of the "histogram tricks" from LightGBM, so it would be more like reinventing the wheel if we just...
@MaxHalford, the current prediction logic is already very similar to that approach; see https://github.com/microsoft/LightGBM/blob/9b2637354a84e0da813f8e2b22608a87eee01e4c/include/LightGBM/tree.h#L572-L584
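To illustrate the kind of prediction loop being discussed, here is a minimal Python sketch of array-based traversal of an unbalanced tree. It is not a copy of the linked `tree.h` code. The function name, the array names, and the example tree are hypothetical, and it assumes numerical splits only, at least one internal node, and leaves encoded as the bitwise complement of the leaf index (`~leaf`).

```python
from typing import Sequence


def predict_leaf(
    features: Sequence[float],
    split_feature: Sequence[int],
    threshold: Sequence[float],
    left_child: Sequence[int],
    right_child: Sequence[int],
) -> int:
    """Walk from the root until a leaf is reached and return the leaf index.

    Internal nodes are non-negative indices; a leaf is stored as ~leaf_index,
    which is always negative, so the loop stops as soon as node < 0.
    """
    node = 0
    while node >= 0:
        if features[split_feature[node]] <= threshold[node]:
            node = left_child[node]
        else:
            node = right_child[node]
    return ~node


# Hypothetical unbalanced tree (illustration only):
#   node 0: x[0] <= 0.5 ? leaf 0 : node 1
#   node 1: x[1] <= 2.0 ? leaf 1 : leaf 2
split_feature = [0, 1]
threshold = [0.5, 2.0]
left_child = [~0, ~1]
right_child = [1, ~2]
leaf_value = [-1.0, 0.25, 3.0]


def predict(features: Sequence[float]) -> float:
    """Return the value stored at the leaf the sample falls into."""
    leaf = predict_leaf(features, split_feature, threshold, left_child, right_child)
    return leaf_value[leaf]
```

Because the loop follows child indices rather than a fixed depth, samples can exit at different depths, which is what makes an unbalanced tree harder to turn into branch-free code.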
@MaxHalford I took a quick look through the PR. I think it is related to the Python-side histogram implementation, and our C++ implementation is already optimized for both speed and memory. And we have...
@ogrisel Is the sequential mode `num_thread=1`? Did you test LightGBM 3.0.0? We have several speed optimizations in the 3.0.0 release. In particular, LightGBM also implements a row-wise histogram algorithm...
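For reference, a single-threaded benchmark configuration along these lines could be written as the parameter dictionary below. This is a sketch, not the settings anyone in the thread actually used: `num_threads` is the documented parameter name (`num_thread` is an alias), and `force_row_wise` is the LightGBM parameter that forces row-wise histogram building on CPU. Pinning it here is an assumption made for reproducible timing; by default LightGBM chooses between row-wise and column-wise on its own.

```python
# Hypothetical benchmark settings; pass this dict as `params` to lightgbm.train.
params = {
    "num_threads": 1,        # sequential mode: use a single thread
    "force_row_wise": True,  # assumption: pin the row-wise histogram algorithm
}
```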
awesome! sounds good to me
Sorry, I don't remember that 🤣. I think it was not intentional. We can align with XGBoost.
Can you provide the settings? If you use per-sample gradient clipping, the batch size is limited to 1.
@zhangxiaaobo NCCL support is built into PyTorch, not Uni-Core. Uni-Core itself is a wrapper for PyTorch. Neither c10d nor no_c10d is related to NCCL; they are different kinds...
You need to install Uni-Core first: https://github.com/dptech-corp/Uni-Core