aspnetcore icon indicating copy to clipboard operation
aspnetcore copied to clipboard

Quarantine RunAsync_WaitsForCompletion_Single_RegistrationParameters

Open captainsafia opened this issue 2 years ago • 10 comments

Failing Test(s)

  • Microsoft.Extensions.Diagnostics.HealthChecks.HealthCheckPublisherHostedServiceTest.RunAsync_WaitsForCompletion_Single_RegistrationParameters

Error Message

Assert.Contains() Failure
Not found: CheckDelay1Period9
In value:  String[] ["CheckDefault", "CheckDelay1Period18", "CheckDelay2Period18"]

Stacktrace

   at Microsoft.Extensions.Diagnostics.HealthChecks.HealthCheckPublisherHostedServiceTest.RunAsync_WaitsForCompletion_Single_RegistrationParameters() in /_/src/HealthChecks/HealthChecks/test/HealthCheckPublisherHostedServiceTest.cs:line 388
--- End of stack trace from previous location ---

Logs


Build

  • https://dev.azure.com/dnceng-public/public/_build/results?buildId=356938&view=ms.vss-test-web.build-test-results-tab&runId=7511354&resultId=122241&paneView=debug

captainsafia avatar Jul 31 '23 04:07 captainsafia

Hi, I have spent some time investigating this and in my opinion the issue is a race condition with the TestPublisher here. My observation is the flaky test is writing mulitple reports from multiple timer at around the same time. In my opinion the fix would be easy, using a thread safe collection. Do you think this worth pursuing? ( I can implement this modification. )

feherzsolt avatar Aug 08 '25 21:08 feherzsolt

In my opinion the same applies to https://github.com/dotnet/aspnetcore/issues/56245

feherzsolt avatar Aug 08 '25 21:08 feherzsolt

@feherzsolt That seems like a reasonable thing to try. We can see if using a concurrent collection improves things here.

captainsafia avatar Aug 11 '25 14:08 captainsafia

Hi i have tried to find a way to validate these modifications. In my opinion the best way would be to check test failure rate for these test. I was looking to find this data, got lost a bit. Could you help what would be the best way to validate please?

feherzsolt avatar Aug 17 '25 11:08 feherzsolt

@feherzsolt We typically apply a test-fixed label to the issue then monitor the test for 30 days before unquarantining to verify that the fix worked.

captainsafia avatar Aug 18 '25 16:08 captainsafia

Thank you, let's do that. I have created the PR https://github.com/dotnet/aspnetcore/pull/63299

feherzsolt avatar Aug 18 '25 20:08 feherzsolt

Thank you!

feherzsolt avatar Dec 10 '25 23:12 feherzsolt

That test was not failing for at least 30 days, why did we need https://github.com/dotnet/aspnetcore/pull/63299?

ilonatommy avatar Dec 11 '25 13:12 ilonatommy

Let's keep it open for another 30 days after the new change.

ilonatommy avatar Dec 11 '25 13:12 ilonatommy

That test was not failing for at least 30 days, why did we need #63299?

Yes it was. https://dev.azure.com/dnceng-public/public/_test/analytics?definitionId=86&contextType=build Could also repro locally if running the test with [Repeat(1000)].

BrennanConroy avatar Dec 11 '25 17:12 BrennanConroy