Question/Discussion: Retry operations (general API availability)
SUMMARY
Hi, I am using Ansible for a wide variety of operations in Azure. In recent support cases, Microsoft said that they do not guarantee availability of the API and that I should automatically retry failed operations. In general I could do that with Ansible logic, but my question is: at which layer does retry logic make the most sense? In my tasks, at the Ansible module level, or at the azure-sdk-for-python level?
A concrete problem I currently have is with querying backup resources: the Azure Front Door -> Recovery Vault backend connection seems to fail once every couple thousand requests.
It produces actual HTML error responses like this:
Error in fetching recovery point ("<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML
1.0 Transitional//EN'
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd'><html
xmlns='http://www.w3.org/1999/xhtml'><head><meta content='text/html;
charset=utf-8' http-equiv='content-type'/><style type='text/css'>body
{font-family:Arial; margin-left:40px; }img { border:0 none; }#content {
margin-left: auto; margin-right: auto }#message h2 { font-size: 20px;
font-weight: normal; color: #000000; margin: 34px 0px 0px 0px }#message p {
font-size: 13px; color: #000000; margin: 7px 0px 0px0px}#errorref { font-size:
11px; color: #737373; margin-top: 41px
}</style><title>AzureResourceManager</title></head><body><div
id='content'><div id='message'><h2>Our services aren't available right
now</h2><p>We're working to restore all services as soon as possible. Please
check back soon.</p></div><div id='errorref'><span>Ref A:
xxx Ref B: xx Ref C:
2024-09-17T17:37:32Z</span></div></div></body></html>", 503)
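To make the question concrete, retry logic at the Python level could look roughly like the sketch below. It is only an illustration using the standard library, not the actual API of the Ansible modules or of azure-sdk-for-python: `TransientApiError`, `RETRYABLE_STATUS` and `call_with_retry` are hypothetical names, and the set of retryable status codes is my assumption. It retries an operation with exponential backoff and jitter when the failure carries a status such as the 503 above, and re-raises anything else immediately.

```python
import random
import time


class TransientApiError(Exception):
    """Hypothetical error type that carries the HTTP status of a failed call."""

    def __init__(self, message, status_code):
        super().__init__(message)
        self.status_code = status_code


# Assumption: statuses worth retrying; 503 is the one seen in the error above.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def call_with_retry(operation, attempts=5, base_delay=1.0, max_delay=30.0,
                    sleep=time.sleep):
    """Call operation() and retry on retryable statuses with backoff.

    The delay doubles after each failed attempt (capped at max_delay) and
    gets random jitter added so parallel runs do not retry in lockstep.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientApiError as exc:
            # Give up on non-retryable statuses or when attempts are used up.
            if exc.status_code not in RETRYABLE_STATUS or attempt == attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
```

A caller would wrap the failing request, for example `call_with_retry(lambda: fetch_recovery_point(...))`, where `fetch_recovery_point` is again a placeholder. The same pattern could live in any of the three layers, which is why I am asking where it belongs.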
ISSUE TYPE
- Question/Discussion