AzOps
AzOps - Discovery Performance Issues
AzOps - Pull pipeline of AzOps Accelerator run in Azure DevOps fails to grab information about all of the subscriptions and times out.
The SPN has been given privileges over the root management group, which contains about 250 subscriptions. The build times out after 4 hours, which is the limit I've set in the pipeline itself. I'm not sure whether it is actually doing anything for that long or just hangs midway, because the log file is too large to browse effectively.
Here's a screenshot:

Has anyone experienced anything like that? Is this tool designed to handle so many subscriptions? Or maybe is it some problem with DevOps pool? Any help will be much appreciated.
Hey @reckitt-maciejglowacki ,
We have customers with over 1000 subscriptions running this, so it should definitely work.
Can you share how the below settings are configured in the settings.json file?

I'm using defaults from https://github.com/Azure/AzOps-Accelerator/blob/main/settings.json
The only thing that I have changed is timeoutInMinutes in the pipeline itself.
Thank you. For troubleshooting purposes, could you please try and change the Core.SkipResourceGroup setting to true and report back the results?
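For reference, a minimal sketch of that change in settings.json. Only the key mentioned above is shown; it assumes every other key in the file stays at the accelerator defaults:

```json
{
  "Core.SkipResourceGroup": true
}
```

With this set to true, the pull skips resource group discovery, so anything scoped to a resource group will not be pulled into the repository.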
That certainly helped :) The pipeline now runs for about 2 hours, but it still fails due to #439
Are there any disadvantages to skipping rg discovery?
You would only want RG discovery if you intend to do RG level deployments (like VMs or other resources) with AzOps, which I assume is not the intent here?
It's not but we do want to be able to differentiate policy and role assignments between different resource groups.
Are you going to manage that from a central platform perspective via AzOps or let the individual LZ teams do it?
We're doing it centrally I'm afraid
Understood. Can you try to change back the setting to discover RGs and change the pipeline timeout in ADO to 6 hrs and see if it completes successfully?
Okay. I'll do that today and let you know the results.
Same :(
Thank you for confirming this. We will take a look at this and see what we can do. The advice would be to disable resource group discovery for now.
Hi @daltondhcp, just wanted to check: have you managed to look into this issue? Thanks
@daltondhcp bump
Hey @reckitt-maciejglowacki - we are currently working on this, unfortunately no short term fix. I will make sure to keep the progress updated in this issue.
Got it. Thanks for the info.
Hi, any update on this?
Hey @reckitt-maciejglowacki,
Unfortunately we are still investigating this but as a workaround for now you could use Self-hosted agents that have an unlimited run time as per: https://docs.microsoft.com/en-us/azure/devops/pipelines/process/phases?view=azure-devops&tabs=yaml#timeouts
Guidance on creating self-hosted agents can be found here: https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser#install
Hope that helps move you forward in the near term 👍
Hi. Just wanted to let you know that this update definitely hasn't fixed anything. Quite the opposite.
I'm getting various random errors when trying to execute this in an ADO pipeline. Even when it does run uninterrupted (which seems completely random), it times out after an hour.



Hi @reckitt-maciejglowacki, thanks for updating this issue and sharing.
I agree with your experience regarding a bunch of different errors ultimately causing pipeline executions to fail.
We started seeing this as well once we released 2.0.0 into the wild and determined that the majority of the errors are due to the expanded use of parallel processing. When this is combined with an execution machine that has a "high" throttle limit and a "low" number of cores, the errors start to show up a lot.
Our response to this was to implement logic in the module to detect these misalignments and override the throttle limit when detected. In addition to that we created a wiki for performance considerations.
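If the automatic override is not enough on a small agent, the throttle limit can also be lowered by hand in settings.json. This is a sketch only: it assumes the key is named Core.ThrottleLimit, and the value 5 is purely illustrative, so please check both against the settings reference and the performance considerations wiki before relying on it:

```json
{
  "Core.ThrottleLimit": 5
}
```

A lower value means fewer operations run in parallel, which trades speed for stability on machines with few cores.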
Release 2.0.2 and later of the AzOps module include improvements intended to resolve this behavior.
Could you confirm if you still have these issues on the latest release? (If yes, then let's re-open the issue.)
Thank you @Jefajers Latest update does seem to work. I haven't tried it on a resource level yet but it runs well for subscriptions and resource groups.
Turns out my enthusiasm was premature..

Can you share the details of the errors? Same as before or something else?
The same I think:

Those seem to appear above a certain number of objects, but I haven't narrowed it down yet.
We're using AZOPS_MODULE_VERSION 2.1.2 and pretty much default settings.json from AzOps-Accelerator project with "Core.SkipResourceGroup": false