xgboost icon indicating copy to clipboard operation
xgboost copied to clipboard

Optimization/apply split/column matrix

Open razdoburdin opened this issue 2 years ago • 7 comments

This is a part of https://github.com/dmlc/xgboost/pull/7208 . I tried to separate changes into the independent pull-requests for simplification of the reviewing. In the current one only the changes in ColumnMatrix are implemented.

razdoburdin avatar Jun 14 '22 15:06 razdoburdin

We are trying to unblock the CI, will keep you posted.

trivialfis avatar Jun 17 '22 10:06 trivialfis

Dear @trivialfis , thanks for the review and comments.

This block of changes is the part of the optimization proposed in https://github.com/dmlc/xgboost/pull/7208. @ShvetsKS is busy in other projects, so I will help to adapt the code to the acceptable view.

One of the main problem with the original PR https://github.com/dmlc/xgboost/pull/7208 is high amount of changing code in one PR. For this reason I separated one significant block of logic in this PR for simplification of reviewing and more easy maintaining.

After this PR I plan to refactor the optimized partition builder from the original PR https://github.com/dmlc/xgboost/pull/7208. So the presented PR should be considered only as a part of a much more complicated optimization.

razdoburdin avatar Jun 23 '22 07:06 razdoburdin

Apologies for the confusion and delay. I appreciate the effort in making smaller PRs. Would you like to have a chat over zoom or any other platforms? I'm a little bit confused by the changes.

trivialfis avatar Jul 02 '22 18:07 trivialfis

Dear @trivialfis , it is a good idea. We use Telegram for conversations. I can add you to the chat if it is suitable for you, if not we can use zoom.

razdoburdin avatar Jul 04 '22 14:07 razdoburdin

@razdoburdin Thank you, please add @RAMitchell as well.

trivialfis avatar Jul 05 '22 09:07 trivialfis

@razdoburdin Thank you, please add @RAMitchell as well.

I sent the invitation to the email from your github profiles

razdoburdin avatar Jul 05 '22 09:07 razdoburdin

In my opinion this PR should not be merged. It looks like unrelated pieces of code copy pasted from the original PR.

It's definitely possible to implement the desired improvements by gradual refactor, where each part improves existing code and makes sense. Other contributors have been able to do this for very complicated changes, even with most PRs under 200 lines.

RAMitchell avatar Jul 22 '22 12:07 RAMitchell