testWaitDequeue() intermittently hangs

Open aleks-f opened this issue 9 months ago • 7 comments

Describe the bug: NotificationQueueTest::testWaitDequeue() intermittently hangs. We worked around it with CI retries, but it does happen.

To Reproduce: happens intermittently in CI.

Please add relevant environment information: it looks like it happens mostly on macOS.

aleks-f, Mar 12 '25 09:03

@aleks-f, do you have a link to a CI log where this happened?

matejk, Mar 25 '25 09:03

A modified unit test that runs testWaitDequeue() 100,000 times did not trigger the hang on macOS.

There must be some other factor to be considered.
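For anyone who wants to repeat this kind of stress run, here is a minimal sketch of a loop that also detects a hang instead of blocking forever. `firstHangingIteration` is a hypothetical helper, not part of POCO or CppUnit, and the time limit is an arbitrary assumption; the real modified test may look quite different.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <thread>

// Hypothetical helper (not part of POCO or CppUnit): runs `body` up to
// `iterations` times and returns the index of the first iteration that does
// not finish within `limit`, or -1 if every iteration completes in time.
inline long firstHangingIteration(std::function<void()> body, long iterations,
                                  std::chrono::milliseconds limit)
{
    for (long i = 0; i < iterations; ++i)
    {
        std::packaged_task<void()> task(body);
        std::future<void> done = task.get_future();
        // Detached so that a genuinely hung iteration cannot block the caller.
        std::thread(std::move(task)).detach();
        if (done.wait_for(limit) != std::future_status::ready)
            return i;   // this iteration looks hung
        done.get();     // rethrow any exception thrown by body
    }
    return -1;
}
```

The body passed in would be whatever the test does in one pass. A hung iteration leaves its detached thread behind, so this is only suitable for a throwaway reproduction binary.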

matejk, Mar 25 '25 11:03

> @aleks-f, do you have a link to a CI log where this happened?

No, it is not easy to find because it gets swallowed by retries. But I am certain it happens, because that was one of the reasons we introduced retries. Check with @cunj123 whether it is possible to alert on a retry or something like that.

aleks-f, Mar 25 '25 11:03

@matejk it's possible to see when retries happen by looking at the warnings in the CI runs, for example the third and fourth warnings in this run. You can click the warnings to see where the retries happened.

cunj123, Mar 25 '25 12:03

@cunj123, @aleks-f:

The error happened on Windows with a static build, and it is not the same as the one reported in this issue:

1: class CppUnit::TestCaller<class TaskManagerTest>.testError
    "to.progress() == 0.5"
    in "D:\a\poco\poco\Foundation\testsuite\src\TaskManagerTest.cpp", line 358

I wanted to reproduce the original problem by repeating the test hundreds of times in a row, but it did not fail.

matejk, Apr 02 '25 15:04

> I wanted to reproduce the original problem by repeating the test hundreds of times in a row, but it did not fail.

I want to say it is a floating-point rounding error, but I don't think the result can differ between runs on the same machine. In any case, comparing floats directly is usually a bad idea.
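As an illustration of the alternative to a direct `==`, here is a minimal sketch of a tolerance-based comparison. `nearlyEqual` and its default tolerances are assumptions made for this example, not something POCO or its test suite provides.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not part of POCO or CppUnit): true when a and b differ
// by at most absTol, or by at most relTol relative to the larger magnitude.
inline bool nearlyEqual(double a, double b,
                        double relTol = 1e-9, double absTol = 1e-12)
{
    const double diff = std::fabs(a - b);
    if (diff <= absTol)
        return true;    // covers values at or very close to zero
    return diff <= relTol * std::max(std::fabs(a), std::fabs(b));
}
```

A check like the one quoted in the log could then compare against 0.5 with a tolerance instead of exact equality. This only guards against rounding; it would not help if the value is read at the wrong moment.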

aleks-f, Apr 02 '25 15:04

I checked lots of recent Compile and Testrun actions. testWaitDequeue never failed.

The test that failed a few times was TaskManagerTest.testError. I checked the test code, and its condition can fail if the timing is not exactly right.

matejk, Apr 02 '25 15:04