xgboost
xgboost copied to clipboard
Optimization/apply split/column matrix
This is a part of https://github.com/dmlc/xgboost/pull/7208 . I tried to separate changes into the independent pull-requests for simplification of the reviewing. In the current one only the changes in ColumnMatrix are implemented.
We are trying to unblock the CI, will keep you posted.
Dear @trivialfis , thanks for the review and comments.
This block of changes is the part of the optimization proposed in https://github.com/dmlc/xgboost/pull/7208. @ShvetsKS is busy in other projects, so I will help to adapt the code to the acceptable view.
One of the main problem with the original PR https://github.com/dmlc/xgboost/pull/7208 is high amount of changing code in one PR. For this reason I separated one significant block of logic in this PR for simplification of reviewing and more easy maintaining.
After this PR I plan to refactor the optimized partition builder from the original PR https://github.com/dmlc/xgboost/pull/7208. So the presented PR should be considered only as a part of a much more complicated optimization.
Apologies for the confusion and delay. I appreciate the effort in making smaller PRs. Would you like to have a chat over zoom or any other platforms? I'm a little bit confused by the changes.
Dear @trivialfis , it is a good idea. We use Telegram for conversations. I can add you to the chat if it is suitable for you, if not we can use zoom.
@razdoburdin Thank you, please add @RAMitchell as well.
@razdoburdin Thank you, please add @RAMitchell as well.
I sent the invitation to the email from your github profiles
In my opinion this PR should not be merged. It looks like unrelated pieces of code copy pasted from the original PR.
It's definitely possible to implement the desired improvements by gradual refactor, where each part improves existing code and makes sense. Other contributors have been able to do this for very complicated changes, even with most PRs under 200 lines.