Avoid copy on Refit

Open cbourjau opened this issue 1 year ago • 3 comments

The C-API exposes LGBM_BoosterRefit, which receives the predicted leaf indices as a flat buffer laid out in the shape of nrow x ncol. The Boosting::RefitTree function, on the other hand, expects these indices as a nested vector object (std::vector<std::vector<int>>). Creating this nested object requires many additional allocations (one per inner vector) and amounts to a full copy of the initial buffer, doubling the memory requirements.

This PR changes the API of Boosting::RefitTree to take a pointer to a flat buffer of int32s, just like the C-API, thus avoiding the copy. This assumes that this part of the API is not regarded as stable. The C-API remains unchanged.
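To make the difference concrete, here is a minimal sketch of the two access patterns. It is not the LightGBM implementation: `ToNested` and `LeafAt` are hypothetical helper names, and the row-major layout (element `(row, col)` stored at `row * ncol + col`) is an assumption about what "nrow x ncol" means here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the conversion a nested-vector API forces on the caller.
// It allocates one outer vector plus nrow inner vectors and copies every element.
// Assumes a row-major flat buffer of nrow * ncol entries.
std::vector<std::vector<int>> ToNested(const std::int32_t* leaf_preds,
                                       std::int32_t nrow, std::int32_t ncol) {
  assert(leaf_preds != nullptr && nrow >= 0 && ncol >= 0);
  std::vector<std::vector<int>> nested(nrow, std::vector<int>(ncol));
  for (std::int32_t i = 0; i < nrow; ++i) {
    for (std::int32_t j = 0; j < ncol; ++j) {
      nested[i][j] = leaf_preds[static_cast<std::size_t>(i) * ncol + j];
    }
  }
  return nested;
}

// Hypothetical accessor: reads the same element directly from the flat buffer,
// with no allocation and no copy. Indices are widened to size_t before the
// multiplication so that large nrow * ncol products do not overflow int32.
inline std::int32_t LeafAt(const std::int32_t* leaf_preds, std::int32_t ncol,
                           std::int32_t row, std::int32_t col) {
  return leaf_preds[static_cast<std::size_t>(row) * static_cast<std::size_t>(ncol) +
                    static_cast<std::size_t>(col)];
}
```

A function that takes the pointer plus `nrow` and `ncol` can use index arithmetic like `LeafAt` wherever it previously wrote `nested[row][col]`, so the nested copy is never built.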

cbourjau avatar Jun 12 '24 15:06 cbourjau

[trigger ci]

Every CI run on this PR will require a maintainer manually approving it, because you've never contributed here before. Sorry for the inconvenience, but GitHub introduced this as a security measure a few years ago and we've decided to leave it enabled. We do occasionally receive malicious pull requests trying to use our CI resources 🙃

jameslamb avatar Jun 13 '24 21:06 jameslamb

Thanks for the feedback! Some CI does appear to run, though, and some of it exhibited IO failures the first time around, which I was trying to address 🤷.

cbourjau avatar Jun 14 '24 08:06 cbourjau

Thanks @cbourjau! AppVeyor CI will be fixed by https://github.com/microsoft/LightGBM/pull/6490, which only leaves the linting job 😁

borchero avatar Jun 18 '24 00:06 borchero