greptimedb
greptimedb copied to clipboard
fix: `test_dist_table_scan` block
I hereby agree to the terms of the GreptimeDB CLA
What's changed and what's your intention?
test_dist_table_scan
could be blocking for multiple reasons:
- the tasks that were producing the RecordBatch result items ignored the sending error
-
test_dist_table_scan
usesCoalescePartitionsExec
to merge partitions data -
CoalescePartitionsExec
will spawn multiple tasks that each of which gathers one partition's RecordBatch result to a final output stream, through a mpsc channel - the task ignores the sending error, thus even if the receiver has decided to quit, the sender(the task) don't care (fixed in this PR: https://github.com/apache/arrow-datafusion/pull/3276/files)
- the task thus keeps polling the stream, in a
while let Some
loop
-
- the stream being polled always returns
Some
if the stream is failed to init- never return None means we have an infinite loop in the above task
- casuing tokio runtime busy, other tasks like heartbeats are not likely to proceed (hence the many error backtrace logs)
The fix is simple, upgrade to newer version of DataFusion(done), and make the stream able to emit None.
Checklist
- [ ] I have written the necessary rustdoc comments.
- [x] I have added the necessary unit tests and integration tests.
Refer to a related PR or issue link (optional)
Codecov Report
Merging #761 (61e1f2e) into develop (66bca11) will decrease coverage by
0.07%
. The diff coverage is97.05%
.
@@ Coverage Diff @@
## develop #761 +/- ##
===========================================
- Coverage 85.36% 85.28% -0.08%
===========================================
Files 414 414
Lines 54167 54388 +221
===========================================
+ Hits 46240 46387 +147
- Misses 7927 8001 +74
Flag | Coverage Δ | |
---|---|---|
rust | 85.28% <97.05%> (-0.08%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
Impacted Files | Coverage Δ | |
---|---|---|
src/common/recordbatch/src/adapter.rs | 92.25% <97.02%> (+15.86%) |
:arrow_up: |
src/common/recordbatch/src/error.rs | 64.28% <100.00%> (ø) |
|
src/object-store/src/test_util.rs | 0.00% <0.00%> (-100.00%) |
:arrow_down: |
tests-integration/src/test_util.rs | 87.98% <0.00%> (-11.54%) |
:arrow_down: |
src/datanode/src/instance.rs | 32.22% <0.00%> (-11.12%) |
:arrow_down: |
src/datanode/src/sql/insert.rs | 80.43% <0.00%> (-9.79%) |
:arrow_down: |
src/servers/src/grpc.rs | 91.17% <0.00%> (-2.95%) |
:arrow_down: |
src/query/src/datafusion/error.rs | 92.00% <0.00%> (-1.25%) |
:arrow_down: |
src/meta-srv/src/service/heartbeat.rs | 86.31% <0.00%> (-1.06%) |
:arrow_down: |
src/frontend/src/instance/distributed.rs | 79.92% <0.00%> (-0.90%) |
:arrow_down: |
... and 17 more |
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
@waynexia all comments resolved, PTAL