Implement PyArrow Dataset TableProvider

Open kdbrooks opened this issue 3 years ago • 4 comments

This PR implements a PyArrow Dataset TableProvider, which allows PyArrow Datasets to be used as tables in DataFusion.

Fixes #10

kdbrooks avatar Jul 19 '22 20:07 kdbrooks

Also FYI we are moving this repo, so they might ask to re-open this PR at the new location soon. https://github.com/apache/arrow-datafusion-python/pull/5

wjones127 avatar Jul 21 '22 20:07 wjones127

cc @andygrove

wjones127 avatar Jul 21 '22 20:07 wjones127

Thanks @kylebrooks-8451, this is looking very cool. As @wjones127 mentioned, we are just in the process of moving development to https://github.com/apache/arrow-datafusion-python, so would you mind opening the PR there?

andygrove avatar Jul 22 '22 09:07 andygrove

Thanks @andygrove! This PR has been moved to apache/arrow-datafusion-python#9. We can close this PR if you wish. I have not been able to verify that this works with DataFusion 10.0.0 because the build fails with this error:

error: failed to select a version for the requirement `parquet = "^18.0.0"`
candidate versions found which didn't match: 15.0.0, 14.0.0, 13.0.0, ...

kdbrooks avatar Jul 22 '22 12:07 kdbrooks