Storage API returns records byte array containing schema bytes
What happened?
When you run the default Storage API example against the emulator, it fetches the schema and the messages correctly, but Arrow decoding always produces an empty table.
What did you expect to happen?
Output table data
How can we reproduce it (as minimally and precisely as possible)?
BigQuery storage API example: https://cloud.google.com/bigquery/docs/reference/storage/libraries#use
It requires a custom gRPC client option that points at the emulator URL:
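// Imports: google.golang.org/grpc, google.golang.org/grpc/credentials/insecure,
// google.golang.org/api/option, bqStorage "cloud.google.com/go/bigquery/storage/apiv1"
grpcclient, err := grpc.NewClient("0.0.0.0:9060", grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
	log.Fatal(err)
}
bqReadClient, err := bqStorage.NewBigQueryReadClient(
	ctx,
	option.WithGRPCConn(grpcclient), // route all calls to the emulator
	option.WithoutAuthentication(),
)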
Anything else we need to know?
I noticed that the official example (in its processArrow function) creates the decoding buffer from the serialized schema bytes and appends the record batch to it:
undecoded := rows.GetArrowRecordBatch().GetSerializedRecordBatch()
if len(undecoded) > 0 {
	buf = bytes.NewBuffer(schema)
	buf.Write(undecoded)
	r, err = ipc.NewReader(buf, ipc.WithAllocator(mem), ipc.WithSchema(aschema))
	// ... other code
}
But your test does not use the schema bytes, only the record batch:
undecoded := rows.GetArrowRecordBatch().GetSerializedRecordBatch()
if len(undecoded) > 0 {
	buf = bytes.NewBuffer(undecoded)
	r, err = ipc.NewReader(buf, ipc.WithAllocator(mem), ipc.WithSchema(aschema))
	// ... other code
}
After hours of debugging I found that your record batches already contain the schema bytes. When I tried the second approach against a real BigQuery source, I got:
error processing arrow: arrow/ipc: invalid message type (got=RecordBatch, want=Schema)
So this is more of a question: why do you send the schema bytes in each batch? It is not a serious problem, but it forces client code to branch on whether it is talking to the emulator or to real BigQuery.
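As a possible client-side workaround, the decision could be made per response rather than per environment: peek at the first IPC message in the serialized record batch and prepend the schema bytes only when that message is not already a schema. The sketch below is not taken from the emulator or the official example. The names firstMessageType, streamBytes and fakeMessage are hypothetical, and the byte layout it relies on (optional 0xFFFFFFFF continuation marker, int32 metadata length, a flatbuffer Message whose header_type is field id 1, with Schema = 1 and RecordBatch = 3) is my reading of the Arrow IPC format and should be checked against the spec before use.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Assumed values of the MessageHeader union in Arrow's Message.fbs.
const (
	headerSchema      byte = 1
	headerRecordBatch byte = 3
)

// firstMessageType returns the header type of the first encapsulated IPC
// message in b, or ok=false when b is too short or looks malformed.
func firstMessageType(b []byte) (typ byte, ok bool) {
	le := binary.LittleEndian
	// Skip the optional continuation marker.
	if len(b) >= 4 && le.Uint32(b) == 0xFFFFFFFF {
		b = b[4:]
	}
	if len(b) < 4 {
		return 0, false
	}
	n := int(int32(le.Uint32(b))) // metadata length
	b = b[4:]
	if n < 8 || n > len(b) {
		return 0, false
	}
	meta := b[:n]
	root := int(le.Uint32(meta)) // position of the Message table
	if root < 0 || root+4 > len(meta) {
		return 0, false
	}
	vt := root - int(int32(le.Uint32(meta[root:]))) // position of its vtable
	if vt < 0 || vt+8 > len(meta) {
		return 0, false
	}
	if le.Uint16(meta[vt:]) < 8 { // vtable too small to hold a header_type slot
		return 0, false
	}
	off := int(le.Uint16(meta[vt+6:])) // slot of field id 1 (header_type)
	if off == 0 || root+off >= len(meta) {
		return 0, false
	}
	return meta[root+off], true
}

// streamBytes returns bytes suitable for an IPC stream reader: batch as-is
// when it already starts with a schema message, otherwise schema + batch.
func streamBytes(schema, batch []byte) []byte {
	if t, ok := firstMessageType(batch); ok && t == headerSchema {
		return batch
	}
	out := make([]byte, 0, len(schema)+len(batch))
	out = append(out, schema...)
	return append(out, batch...)
}

// fakeMessage builds a tiny hand-written message with the given header type.
// It is a demo fixture only, not a complete Arrow message.
func fakeMessage(headerType byte) []byte {
	meta := []byte{
		12, 0, 0, 0, // root: table starts at byte 12
		8, 0, 8, 0, // vtable: vtable size 8, table size 8
		6, 0, 4, 0, // vtable: version at +6, header_type at +4
		8, 0, 0, 0, // table: its vtable sits 8 bytes earlier
		headerType, 0, 4, 0, // header_type, padding, version
	}
	msg := []byte{0xFF, 0xFF, 0xFF, 0xFF, byte(len(meta)), 0, 0, 0}
	return append(msg, meta...)
}

func main() {
	schema := fakeMessage(headerSchema)
	batch := fakeMessage(headerRecordBatch)

	// BigQuery-style response: the batch alone, so the schema is prepended.
	bare := streamBytes(schema, batch)

	// Emulator-style response: schema already embedded, so it is left alone.
	embedded := append(append([]byte{}, schema...), batch...)
	kept := streamBytes(schema, embedded)

	fmt.Println(len(bare) == len(schema)+len(batch), len(kept) == len(embedded))
}
```

In the official example, the result of streamBytes(schema, undecoded) would replace the two-step buffer construction and be passed to bytes.NewBuffer before calling ipc.NewReader.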
I've implemented a fix for the SerializedRecordBatch message in the Recidiviz fork of the emulator https://github.com/Recidiviz/bigquery-emulator/releases/tag/v0.6.6-recidiviz.0