connector-x
Provide an interface to stream out Arrow RecordBatches
It would be very helpful. Any chance of seeing this in the future?
We have an initial version of the Arrow batch iterator for the Rust and C++ libraries. It needs more work on testing and on exposing it to the Python library.
I think it will be an awesome addition to be able to get a RecordBatchReader directly from read_sql which only materializes the record batches (sends queries to DB) when the user requests read_next_batch.
@wangxiaoying do you think the testing is far enough along to expose the Arrow record batch iterator to the Python side? Is there any way I can help with this?
Yes, I think we can definitely enable the record batch reader. Please feel free to submit a PR!
@wangxiaoying I checked this out today and here are my findings:
- `arrow_rb` can be easily added on the Python side as a return_type and can return a generator of `RecordBatch`es. I did this and it works as expected.
- After doing this there was no performance benefit, because the dispatcher is eager.

In order to make the record batch path truly lazy, I'm thinking:
- ~The dispatcher can have an alternate implementation of `run` where the operations don't happen eagerly but are backed by an iterator.~ [already available]
- This iterator is also exposed on the Python side and is passed to the `RecordBatchReader`.
- When this `RecordBatchReader` is consumed, the operations happen at that time, by calling `next()` on the iterator.
~This seems quite complicated to me considering my limited understanding of this code base. I'll still try to give this a shot, but if you have any suggestion please let me know.~
Actually, scratch the above, I managed to get this working exactly as expected. I will raise a PR today for your review. :D
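
For readers following along, the difference between the eager dispatcher and the iterator-backed path discussed in this thread can be sketched in plain Python. This is only an illustration of the idea, not connector-x code: `make_fetcher`, `run_eager`, `run_lazy`, and `LazyBatchReader` are hypothetical names, and a list of integers stands in for an Arrow `RecordBatch`. The log records when each (pretend) query is issued.

```python
from typing import Callable, Iterable, Iterator, List

Batch = List[int]  # stand-in for an Arrow RecordBatch (assumption for this sketch)


def make_fetcher(log: List[int]) -> Callable[[int], Batch]:
    """Return a fake 'query partition i' function that records each call in log."""
    def fetch_partition(i: int) -> Batch:
        log.append(i)  # pretend a query is sent to the database here
        return [i * 10, i * 10 + 1]
    return fetch_partition


def run_eager(fetch: Callable[[int], Batch], n: int) -> List[Batch]:
    """Eager path: every partition is fetched before anything is returned."""
    return [fetch(i) for i in range(n)]


def run_lazy(fetch: Callable[[int], Batch], n: int) -> Iterator[Batch]:
    """Lazy path: a partition is fetched only when the consumer asks for it."""
    for i in range(n):
        yield fetch(i)


class LazyBatchReader:
    """Hypothetical reader that pulls one batch per read_next_batch() call."""

    def __init__(self, batches: Iterable[Batch]) -> None:
        self._batches = iter(batches)

    def read_next_batch(self) -> Batch:
        # Work happens here, by calling next() on the underlying iterator.
        return next(self._batches)

    def __iter__(self) -> "LazyBatchReader":
        return self

    def __next__(self) -> Batch:
        return self.read_next_batch()


# Eager: all three queries run up front, even if only one batch is needed.
eager_log: List[int] = []
eager_batches = run_eager(make_fetcher(eager_log), 3)

# Lazy: nothing runs until the reader is consumed.
lazy_log: List[int] = []
reader = LazyBatchReader(run_lazy(make_fetcher(lazy_log), 3))
issued_before_read = list(lazy_log)  # snapshot taken before any read
first = reader.read_next_batch()     # triggers exactly one fetch
```

Wrapping a generator this way means that returning a generator from an eager `run` gains nothing, which matches the "no performance benefit" finding above: the laziness has to start at the dispatcher.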