Feed a stream of RecordBatches to a table (Python)
Feature Request
Description of Problem:
This is related to https://github.com/finos/perspective/issues/1157 but not quite the same.
I have a real-time Arrow IPC stream and would like to update a perspective table batch by batch as data comes in. Currently it seems I need to pass each batch through the following function, which prepends the schema:
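import pyarrow as pa

def to_arrow_bytes(schema, batch):
    # Wrap a single RecordBatch in a complete IPC stream (schema message, then the batch).
    stream = pa.BufferOutputStream()
    writer = pa.RecordBatchStreamWriter(stream, schema)
    writer.write_batch(batch)
    writer.close()  # writes the end-of-stream marker
    return stream.getvalue().to_pybytes()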
That seems quite inefficient, especially when I'm already receiving the batch as a byte buffer on a socket. Is there any way to pass the received RecordBatch bytes directly? I guess this is not possible because the arrow schema may change between updates?