What purpose does ArrowRecordBatch solve?

Open shivamka1 opened this issue 3 years ago • 1 comments

I was going through the Flight Java example and was wondering whether we could store a VectorSchemaRoot directly in the Dataset instead of a list of ArrowRecordBatch objects?

class Dataset implements AutoCloseable {
    private final List<ArrowRecordBatch> batches;
    private final Schema schema;
    private final long rows;
    public Dataset(List<ArrowRecordBatch> batches, Schema schema, long rows) {
        this.batches = batches;
        this.schema = schema;
        this.rows = rows;
    }
    public List<ArrowRecordBatch> getBatches() {
        return batches;
    }
    public Schema getSchema() {
        return schema;
    }
    public long getRows() {
        return rows;
    }
    @Override
    public void close() throws Exception {
        AutoCloseables.close(batches);
    }
}

shivamka1, Oct 04 '22 07:10

An ArrowRecordBatch represents a RecordBatch IPC message, which is the form in which the Dataset is transferred. VectorSchemaRoot isn't implemented in a way that would let that conversion step be skipped.
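For illustration, here is a minimal sketch of the conversion in both directions, using `VectorUnloader` and `VectorLoader` from the Arrow Java vector module. The class name `RoundTripSketch`, the single `id` column, and the sample values are assumptions made up for this example and are not part of the Flight example. It also assumes an Arrow memory implementation (for instance `arrow-memory-netty`) is on the classpath; depending on the Arrow and JDK versions, extra JVM flags may be needed.

```java
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorLoader;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.VectorUnloader;
import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.Schema;

import java.util.Collections;

// Hypothetical example class; not part of the Flight Java example.
public class RoundTripSketch {

    // Fills a root, unloads it into a record batch, loads that batch into a
    // second root, and returns the row count seen by the second root.
    static int roundTrip() {
        Schema schema = new Schema(Collections.singletonList(
                Field.nullable("id", new ArrowType.Int(32, true))));

        try (BufferAllocator allocator = new RootAllocator();
             VectorSchemaRoot source = VectorSchemaRoot.create(schema, allocator);
             VectorSchemaRoot target = VectorSchemaRoot.create(schema, allocator)) {

            // Populate the source root with three sample values.
            IntVector ids = (IntVector) source.getVector("id");
            for (int i = 0; i < 3; i++) {
                ids.setSafe(i, i * 10);
            }
            source.setRowCount(3);

            // VectorSchemaRoot -> ArrowRecordBatch (the IPC message form).
            try (ArrowRecordBatch batch = new VectorUnloader(source).getRecordBatch()) {
                // ArrowRecordBatch -> VectorSchemaRoot.
                new VectorLoader(target).load(batch);
            }
            return target.getRowCount();
        }
    }

    public static void main(String[] args) {
        System.out.println("rows after round trip: " + roundTrip());
    }
}
```

A batch produced this way holds its own references to the underlying buffers, which is why both the batch and the roots are closed here, much as `Dataset.close()` closes its stored batches.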

lwhite1, Oct 06 '22 14:10

@iamsmkr do you still have questions here?

lidavidm, Apr 20 '23 02:04