iceberg-rust
iceberg-rust copied to clipboard
refactor: TableScan file plan generation now implemented purely in streams rather than channels
The current implementation of scan planning leaves a lot to be desired. The channel-based approach is hard-to-follow, error-prone, and does not properly support backpressure.
Not only that, but in real-life usage I've been experiencing intermittent deadlock with it, and haven't been able to track down the cause.
There's a source of deadlock present that I have spotted in the existing implementation too. fetch_manifest_and_stream_manifest_entries attempts to push to bounded channels, and so will block if the channel it is pushing to is full. If the head-of-line data file context is awaiting on the DeleteFileIndex but the delete files that it depends upon are in a manifest that is yet to enter the delete file manifest context channel, then the pipeline is in deadlock.
@Xuanwo's refactor of the Arrow reader showed that a stream-based approach can be more elegant, address the lack of backpressure, and be more reliable. This refactor brings those same advantages to the plan phase.
@liurenjie1024 / @Xuanwo It would be great to get a review of this if you could find the time. This fully streaming approach is much cleaner and easier to follow, with better error handling, and perhaps more importantly addresses a source of deadlock.
We just hit this deadlock in production - @sdd thank you for outlining the issue so clearly! Saved me a bunch of debugging.