Apache Arrow support for I/O

Open Trollgeir opened this issue 4 years ago • 5 comments

catboost version: 1.03
Operating System: Ubuntu Linux

I'm using the new and promising library Polars to transform my data, and because of memory constraints my goal is to keep the memory footprint as small as possible. Would it be possible to support the Apache Arrow data format as an input for CatBoost? Arrow's underlying dictionary types are a perfect fit for categorical features.
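
To illustrate why dictionary types map so naturally onto categorical features, here is a minimal plain-Python sketch of dictionary encoding: each distinct value is stored once, and every row holds only a small integer code. The names `DictionaryColumn`, `dictionary_encode`, and `decode` are hypothetical and exist only for this illustration; this is not the Arrow, Polars, or CatBoost API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DictionaryColumn:
    """Plain-Python stand-in for a dictionary-encoded column (hypothetical, not an Arrow class)."""
    dictionary: List[str]  # each distinct value, stored once
    indices: List[int]     # one integer code per row, pointing into `dictionary`


def dictionary_encode(values: List[str]) -> DictionaryColumn:
    """Assign integer codes in order of first appearance."""
    code_of: Dict[str, int] = {}
    indices: List[int] = []
    for value in values:
        if value not in code_of:
            code_of[value] = len(code_of)
        indices.append(code_of[value])
    # dicts preserve insertion order, so the keys line up with the codes
    return DictionaryColumn(dictionary=list(code_of), indices=indices)


def decode(column: DictionaryColumn) -> List[str]:
    """Recover the original values by looking up each code."""
    return [column.dictionary[i] for i in column.indices]


colors = ["red", "green", "red", "blue", "green", "red"]
encoded = dictionary_encode(colors)
print(encoded.dictionary)  # ['red', 'green', 'blue']
print(encoded.indices)     # [0, 1, 0, 2, 1, 0]
```

The integer codes are essentially what a gradient boosting library needs for a categorical column, so a library that read this layout directly could avoid materializing a full copy of the strings.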

Trollgeir avatar Jan 19 '22 13:01 Trollgeir

+1 for this

XGBoost added support for Arrow ingest a few months ago here: https://github.com/dmlc/xgboost/pull/7512

braaannigan avatar Dec 07 '22 15:12 braaannigan

This would be fantastic! LightGBM has had this functionality for quite a long time now.

jcierocki avatar Jan 04 '25 10:01 jcierocki

+1

arrowcircle avatar Jan 05 '25 22:01 arrowcircle

+1

It would be great for Polars support.

anthonygiorgio97 avatar Feb 04 '25 20:02 anthonygiorgio97

+1

aryehklein-rise avatar Sep 29 '25 13:09 aryehklein-rise