scikit-learn-intelex

Feature request: Support different data sources

Open moheikal79 opened this issue 3 years ago • 2 comments

Hi, currently daal4py accepts only CSV files, NumPy arrays, and pandas DataFrames. When the data is too large to fit into memory and is stored in a database, other tools come into play, such as Polars DataFrames, which use ConnectorX to access data stored in databases, and Ibis, which uses databases such as ClickHouse as a backend. It would be great if daal4py could accept these additional data sources so that it can handle larger datasets.

Here is the error I got when I tried to read a ClickHouse table into daal4py using Ibis: Got type 'ClickhouseTable' when expecting string, array, or list of 1d-arrays.

Here is the error I got when I tried to read a Polars DataFrame into daal4py: Got type 'DataFrame' when expecting string, array, or list of 1d-arrays.
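
Until such support exists, one possible workaround is to materialize the unsupported object into a type daal4py already accepts before calling it. The sketch below is only an illustration: `to_daal4py_input` is a hypothetical helper, not part of daal4py, and it assumes the object exposes `to_numpy()` (as Polars and pandas DataFrames do) or `to_pandas()` (assumed here for lazy table expressions such as Ibis tables). Note that it pulls the whole result into memory, so it does not address the larger-than-memory case.

```python
def to_daal4py_input(data):
    """Convert `data` into a type daal4py already accepts.

    Hypothetical helper for illustration; not part of daal4py.
    """
    # Strings (for example, a path to a CSV file) pass through unchanged.
    if isinstance(data, str):
        return data
    # Polars and pandas DataFrames both expose to_numpy().
    if hasattr(data, "to_numpy"):
        return data.to_numpy()
    # Assumption: lazy table expressions (for example, Ibis tables) expose
    # to_pandas(), which runs the query and loads the full result into memory.
    if hasattr(data, "to_pandas"):
        return data.to_pandas().to_numpy()
    # NumPy arrays and lists of 1-D arrays are returned as-is.
    return data
```

The returned value can then be passed to a daal4py algorithm's `compute()` call in place of the original object.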

Regards, Mohamed

moheikal79, Sep 22 '22 04:09

We are considering adding support for different dataframe types through the dataframe interchange protocol: https://data-apis.org/dataframe-protocol/latest/purpose_and_scope.html

napetrov, Apr 28 '23 15:04