
Feed a stream of RecordBatches to a table (Python)

Open dennybritz opened this issue 3 years ago • 0 comments

Feature Request

Description of Problem:

This is related to https://github.com/finos/perspective/issues/1157 but not quite the same.

I have a real-time Arrow IPC stream and would like to update a Perspective table batch by batch as data comes in. Currently it seems I need to pass each batch through the following function, which prepends the schema:

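import pyarrow as pa

def to_arrow_bytes(schema, batch):
    # Re-serialize one batch as a standalone IPC stream: schema message first, then the batch.
    stream = pa.BufferOutputStream()
    writer = pa.RecordBatchStreamWriter(stream, schema)
    writer.write_batch(batch)
    return stream.getvalue().to_pybytes()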

That seems quite inefficient, especially since I'm already receiving each batch as a byte buffer on a socket. Is there any way to pass the received RecordBatch bytes directly? I guess this is not possible because the Arrow schema may change between updates?

dennybritz, May 15 '22 10:05