xgboost_ray

Add zero-copy DMatrix creation with Arrow

Open Yard1 opened this issue 3 years ago • 2 comments

We currently convert the input data to pandas before initialising the DMatrix. We should consider passing Arrow data instead to avoid unnecessary copies. XGBoost has Arrow support: https://github.com/dmlc/xgboost/pull/7512

Yard1 avatar Jul 12 '22 19:07 Yard1

Thanks for adding this! It looks like the changes were split over two PRs. Just FYI, here is the second: https://github.com/dmlc/xgboost/pull/7283

natmod avatar Jul 12 '22 19:07 natmod

And what about supporting Polars DataFrames for creating a DMatrix in Python?

tonyabracadabra avatar Nov 11 '22 10:11 tonyabracadabra