Raise error when multiple IPC streams are received

Open pmm-motif opened this issue 1 year ago • 0 comments

It's perhaps an obscure case, but I ran into issues using register_buffer() in the Node.js DuckDB library when multiple Arrow Table streams were passed as an argument. This use case somewhat works with DuckDB WASM (it's possible to append additional Arrow Tables), so I initially thought it was some sort of bug.

However, as I investigated this, I realized that I was probably using it incorrectly in the first place (though this isn't clearly documented): when multiple IPC buffers are provided, they must all be part of the same Arrow IPC stream.
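To make the constraint concrete, here is a rough sketch (not DuckDB's or Arrow's code) of a byte-level pre-check. It relies on the Arrow IPC stream format's end-of-stream marker, which is the 4-byte continuation indicator 0xFFFFFFFF followed by a zero metadata length. The function name `buffers_with_early_eos` is hypothetical, and the check is only a heuristic: a chunk of body data could happen to end with the same eight bytes.

```python
from typing import List, Sequence

# Arrow IPC end-of-stream marker: continuation indicator (0xFFFFFFFF)
# followed by a 4-byte metadata length of zero.
EOS_MARKER = b"\xff\xff\xff\xff\x00\x00\x00\x00"


def buffers_with_early_eos(buffers: Sequence[bytes]) -> List[int]:
    """Return the indices of non-final buffers that end with the EOS marker.

    Hypothetical helper, heuristic only: if every buffer belongs to one
    stream, only the last buffer is expected to end with the marker.
    """
    return [
        index
        for index, buffer in enumerate(buffers[:-1])
        if bytes(buffer).endswith(EOS_MARKER)
    ]
```

A non-empty result suggests the buffers hold several complete streams rather than chunks of a single one.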

The underlying Arrow IPC reader seems to stop silently when it reaches EOS (end of stream). What I'm proposing in this PR is to detect when more messages are available after EOS and throw an exception rather than silently ignoring them.
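The proposed behaviour can be sketched with a simplified model, assuming messages have already been decoded into a kind such as "schema", "record_batch", or "eos". The `Message` class, `TrailingIpcDataError`, and `read_single_stream` are all hypothetical names for illustration; this is not the actual change in this PR or the Arrow reader API.

```python
from dataclasses import dataclass
from typing import Iterable, List


class TrailingIpcDataError(ValueError):
    """Raised when messages follow the end-of-stream marker."""


@dataclass(frozen=True)
class Message:
    # Simplified stand-in for a decoded IPC message (assumption):
    # kind is "schema", "record_batch", or "eos".
    kind: str


def read_single_stream(messages: Iterable[Message]) -> List[Message]:
    """Collect messages up to EOS and fail loudly if anything follows it."""
    consumed: List[Message] = []
    iterator = iter(messages)
    for message in iterator:
        if message.kind == "eos":
            # Peek once past EOS instead of returning silently.
            trailing = next(iterator, None)
            if trailing is not None:
                raise TrailingIpcDataError(
                    f"unexpected {trailing.kind!r} message after end of stream"
                )
            return consumed
        consumed.append(message)
    # The input ended without an explicit EOS marker.
    return consumed
```

With this model, passing two back-to-back streams raises an error at the first message of the second stream, while a single stream is read as before.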

It's plausible that I'm missing something, or that this check should go into Apache Arrow instead. I'm curious to hear your thoughts on this.

Thank you!

pmm-motif, Nov 25 '23 19:11