Add retries to DisableCrlValidation test to improve reliability

Open ahsonkhan opened this issue 10 months ago • 6 comments

Fix https://github.com/Azure/azure-sdk-for-cpp/issues/5533

It's easiest to review the actual changes in the PR with whitespace changes hidden: https://github.com/Azure/azure-sdk-for-cpp/pull/5537/files?diff=split&w=1

ahsonkhan avatar Apr 16 '24 04:04 ahsonkhan

Why is this only relevant in this test? I would expect we'd need to handle this problem in the product as well.

How do we know the transport exception is due to the network and not how the request is formulated?

RickWinter avatar Apr 16 '24 16:04 RickWinter

> Why is this only relevant in this test?

The linked issue has more info. This is the only test that fails intermittently, because the CRL validation check times out for our test site. This test enables the EnableCertificateRevocationListCheck setting in TransportOptions. Other tests either leave this setting off (the default) or use "real" sites.

> I would expect we'd need to handle this problem in the product as well.

The product already deals with transient network failures via the RetryPolicy. This test, however, calls the transport layer directly. See the pipeline being used in the tests: https://github.com/Azure/azure-sdk-for-cpp/blob/2c83fc6b743f5f91b3f845a1b8c0076048aeccd1/sdk/core/azure-core/test/ut/transport_policy_options.cpp#L273-L276

> How do we know the transport exception is due to the network and not how the request is formulated?

Because it is intermittent, with a 95%+ success rate, and we know the error: the CRL validation check is hitting a transient timeout. If there were an issue with the request, it would be more deterministic and we'd see different error behavior.
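
For readers who haven't opened the diff, here is a rough sketch of the general idea of retrying a test step that calls the transport directly. It is not the code in this PR: `FakeTransportError` and `RunWithRetries` are hypothetical names made up for illustration, and the attempt count and delay are arbitrary assumptions.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for a transport-level failure (for example, the CRL
// check timing out). This is NOT the SDK's exception type.
struct FakeTransportError : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Hypothetical helper: runs `operation` up to `maxAttempts` times.
// Returns the 1-based number of the attempt that succeeded.
// If every attempt throws FakeTransportError, the last one is rethrown.
// Any other exception type propagates immediately, without a retry.
inline int RunWithRetries(
    std::function<void()> const& operation,
    int maxAttempts,
    std::chrono::milliseconds delay = std::chrono::milliseconds(0))
{
  assert(maxAttempts >= 1);
  for (int attempt = 1;; ++attempt)
  {
    try
    {
      operation();
      return attempt;
    }
    catch (FakeTransportError const&)
    {
      if (attempt >= maxAttempts)
      {
        throw; // Out of attempts: surface the failure to the test.
      }
      std::this_thread::sleep_for(delay); // Brief pause before retrying.
    }
  }
}
```

In a real test, the lambda passed to `RunWithRetries` would be the step that sends the request through the transport. Only the transient error type is caught, so a deterministic problem with the request itself would still fail the test on every attempt.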

ahsonkhan avatar Apr 16 '24 17:04 ahsonkhan

/azp run cpp - core - ci

ahsonkhan avatar Apr 16 '24 17:04 ahsonkhan

Azure Pipelines successfully started running 1 pipeline(s).

azure-pipelines[bot] avatar Apr 16 '24 17:04 azure-pipelines[bot]

@danieljurek CI pipeline runs keep getting canceled after 45 minutes. Can you please help with this? Otherwise green runs aren't showing up as passing.

ahsonkhan avatar Apr 16 '24 18:04 ahsonkhan

The win2022 image at that version has been deleted. We'll have to roll forward to the latest image, which has the problems detailed here: https://github.com/Azure/azure-sdk-for-cpp/issues/5483 ... I'm working on removing unneeded packages based on the mitigation described here: https://github.com/actions/runner-images/issues/9701

danieljurek avatar Apr 16 '24 21:04 danieljurek