azure-dev
Reuse configuration + clean up resources upon provision failure
When a developer creates an environment and runs azd provision/azd up, they can run into blocking problems that are out of their control (e.g. if the region they selected during environment creation is busy/at capacity).
In this case, we effectively require the user to start over (delete the environment or create a new one, run azd down) and try again. This is slow and pretty cumbersome.
Instead, it'd be ideal if we could reuse pieces of information that the developer is likely to want to keep the same and make it easy for them to retry their desired action.
For example, this could look like:
- Developer runs azd up
- Developer configures their environment by passing in the environment name, region and subscription
- Developer sees error that the region is busy and that the provision has failed as a result of a capacity related issue
- Developer is prompted to select another region (we keep the environment name and subscription selected)
- After picking a suitable region, provision and deploy finish as intended
Under the hood, we'd:
- Adjust the region in the azure.yaml
- Either deprovision anything that can't be modified to reflect the new region, or change the region in place (optimizing for the quickest operation)
- Maybe other things???
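To make the proposed flow concrete, here is a minimal Python sketch of the retry loop, not how azd is implemented. Everything in it is hypothetical: `Environment`, `CapacityError`, and the `provision` and `cleanup` callables are stand-ins, and `candidate_regions` stands in for the interactive "pick another region" prompt. The point it illustrates is that only the region changes between attempts, while the environment name and subscription are kept.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


class CapacityError(Exception):
    """Hypothetical error: the selected region is busy or at capacity."""


@dataclass(frozen=True)
class Environment:
    # Hypothetical model of the values the developer entered up front.
    name: str
    subscription: str
    region: str


def provision_with_retry(
    env: Environment,
    provision: Callable[[Environment], None],
    cleanup: Callable[[Environment], None],
    candidate_regions: Iterable[str],
) -> Environment:
    """Provision; on a capacity error, clean up and retry in the next region.

    The environment name and subscription are reused on every attempt.
    """
    regions = iter(candidate_regions)
    while True:
        try:
            provision(env)
            return env
        except CapacityError:
            next_region = next(regions, None)
            if next_region is None:
                # No more regions to try: surface the original failure.
                raise
            # Remove whatever the failed attempt left behind in the old region.
            cleanup(env)
            env = replace(env, region=next_region)
```

Whether `cleanup` has to deprovision everything or can leave some resources in place is exactly the open question in the list above; the sketch just marks where that decision would sit.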
There also might be other cases where this type of retry experience would be helpful.
cc: @puicchan
We do have an azd env set command that can be used to switch an environment to a different location (azd env set AZURE_LOCATION someNewRegion). This, together with azd down, should get you pretty close to what you want, although one can argue the env set command is not very discoverable or well documented.
azd down could also really benefit from introducing a --no-wait option.
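For reference, the manual workaround described above could be spelled out as a short command sequence. The Python sketch below only builds and prints the commands; it does not run them. The ordering (tear down first, then change the location, then provision again) is an assumption, since the thread does not say which order is required, and `westus2` is a placeholder region.

```python
import shlex


def relocation_commands(new_region: str) -> list[list[str]]:
    """Build the azd commands for moving an environment to another region.

    Assumed order: remove existing resources, update AZURE_LOCATION,
    then provision again in the new region.
    """
    return [
        ["azd", "down"],
        ["azd", "env", "set", "AZURE_LOCATION", new_region],
        ["azd", "provision"],
    ]


# Print the commands so they can be reviewed before running them by hand.
for argv in relocation_commands("westus2"):
    print(shlex.join(argv))
```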
I think figuring out what to do here is going to require a bunch of deep thinking; there are a lot of things we need to take into account. In general, every resource could end up in a different region. For example, today we take a location parameter to our templates and use that single parameter for everything, but there's no reason we couldn't take multiple parameters and use one value for some resources and another value for the rest, or hard-code the location of some resources, etc.
I don't think it's possible to move already provisioned resources (or if you can it will be service specific) non-destructively, which will make moving from one location to another behind the scenes difficult.
A related problem is that we need to decide what the behavior of the next azd provision should be when you change the value of the location parameter. In a system like Pulumi or Terraform, we'd actually see the region change as a difference and see a delete and a create scheduled to get the resource into the new location. I'm not sure what ARM would do; it might give an error saying the resource already exists in a different region (as it does for deployments).
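As a rough illustration of the Pulumi/Terraform-style behavior described above, here is a toy planner in Python. It is not how either tool is implemented; `Resource` and `plan` are hypothetical names, and the only rule modeled is that a location change cannot be applied in place, so it becomes a delete followed by a create.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical, simplified resource description.
    name: str
    location: str
    sku: str


def plan(current: dict[str, Resource], desired: dict[str, Resource]) -> list[str]:
    """Compare current and desired state and list the actions to take."""
    actions = []
    for name, want in desired.items():
        have = current.get(name)
        if have is None:
            actions.append(f"create {name} in {want.location}")
        elif have.location != want.location:
            # Location is treated as immutable: replace the resource.
            actions.append(f"delete {name} in {have.location}")
            actions.append(f"create {name} in {want.location}")
        elif have != want:
            actions.append(f"update {name} in place")
    for name, have in current.items():
        if name not in desired:
            actions.append(f"delete {name} in {have.location}")
    return actions
```

With a single resource whose location moves from one region to another, this planner schedules a delete in the old region and a create in the new one, which is the destructive path the comment above is worried about.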
Good points @ellismg, but what you are saying makes me inclined to not touch this ball of complexity with a 10-foot pole. Instead, we could just say:
- location is an environment setting, like any other. It just happens to be a required one for many of our templates, so we ask about it up-front.
- How the system reacts to changing location depends on how the deployment is structured/defined, as well as whether you are using Bicep, Terraform, or something else.
And then we just need to figure out if we can establish some conventions for our templates that make @savannahostrowski happy.