
arrow notes

Open mdsumner opened this issue 2 years ago • 0 comments

DD

I'm pretty sure you can set the batch size with a GDAL property before you get the stream. If you have a pyarrow.RecordBatchReader, which I'm assuming is what GetArrowStreamAsPyArrow gives you, you can consume it one batch at a time or call read_all() (https://arrow.apache.org/docs/python/generated/pyarrow.RecordBatchReader.html#pyarrow.RecordBatchReader.read_all) to read it into a Table (probably what you want unless you're engineering some streaming yourself).
In R you might have a nanoarrow_array_stream, on which you call convert_array_stream() to get a data.frame. I think you can also call as_arrow_table() on the array stream (or as_arrow_table(as_record_batch_reader(array_stream)) if that doesn't work).

mdsumner, Oct 03 '23 06:10