Flaky TestSchedulerProcessor_processQueriesOnSingleStream

Open pracucci opened this issue 2 years ago • 4 comments

I saw a flaky run of TestSchedulerProcessor_processQueriesOnSingleStream in CI:

--- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream (1.65s)
    --- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error (0.51s)
        scheduler_processor_test.go:253: 
            	Error Trace:	/__w/mimir/mimir/pkg/querier/worker/scheduler_processor_test.go:253
            	Error:      	Not equal: 
            	            	expected: 2
            	            	actual  : 1
            	Test:       	TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error
            	Messages:   	Expected number of calls (2) does not match the actual number of calls (1).
FAIL
FAIL	github.com/grafana/mimir/pkg/querier/worker	4.198s

Apr 12 '24 08:04 pracucci

Another example in https://github.com/grafana/mimir/actions/runs/9909349624/job/27377221017?pr=8685:

--- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream (1.54s)
    --- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error (0.52s)
        scheduler_processor_test.go:255: 
            	Error Trace:	/__w/mimir/mimir/pkg/querier/worker/scheduler_processor_test.go:255
            	Error:      	Not equal: 
            	            	expected: 2
            	            	actual  : 1
            	Test:       	TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error
            	Messages:   	Expected number of calls (2) does not match the actual number of calls (1).

Jul 12 '24 14:07 zenador

Another instance: https://github.com/grafana/mimir/actions/runs/10955420076/job/30419274402?pr=9344

--- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream (1.64s)
    --- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error (0.51s)
        scheduler_processor_test.go:255: 
            	Error Trace:	/__w/mimir/mimir/pkg/querier/worker/scheduler_processor_test.go:255
            	Error:      	Not equal: 
            	            	expected: 2
            	            	actual  : 1
            	Test:       	TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error
            	Messages:   	Expected number of calls (2) does not match the actual number of calls (1).
FAIL

Sep 20 '24 08:09 dimitarvdimitrov

Another instance: https://github.com/grafana/mimir/actions/runs/11040649844/job/30669127966?pr=9413

--- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream (1.56s)
    --- FAIL: TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error (0.51s)
        scheduler_processor_test.go:255: 
            	Error Trace:	/__w/mimir/mimir/pkg/querier/worker/scheduler_processor_test.go:255
            	Error:      	Not equal: 
            	            	expected: 2
            	            	actual  : 1
            	Test:       	TestSchedulerProcessor_processQueriesOnSingleStream/should_not_cancel_query_execution_if_scheduler_client_returns_a_non-cancellation_error
            	Messages:   	Expected number of calls (2) does not match the actual number of calls (1).
FAIL
FAIL	github.com/grafana/mimir/pkg/querier/worker	5.347s

Sep 25 '24 20:09 56quarters

Hey there! This test is still flaky.

This test failed 10 times across 5 different branches in the last 7 days.

Who might know about this?

  • @charleskorn - made a relevant commit on 2025-10-28: 45ae002c6641208d44d0f1ee075abc94c2b2e8a8 "Don't log error notifying scheduler about finished query in queriers if the query is cancelled (#13186)"
  • @charleskorn - made a relevant commit on 2025-09-18: 787b3f0bba2dd2b9f86bbea7bd765640b3cf9fe6 "Remote execution: add support for strong consistency, chunk info logging and other features that rely on HTTP headers (#12745)"
  • @charleskorn - made a relevant commit on 2025-08-14: 51cc6612304e689743b6b536e6f1512dee9e10a9 "Remote execution of query plans in queriers (#12302)"

If any of you have a few minutes, could you take a look? You might have context on what could be causing the flakiness.

Recent failures

💡 Check the issue description above for investigation tips and next steps!

Thanks for helping keep our tests reliable!

Dec 15 '25 02:12 github-actions[bot]