pulumi-azure-native
Unable to reliably create VirtualNetworkLink (404 not found error)
What happened?
When creating a new VirtualNetworkLink resource, the operation fails with the following:
azure-native:network:VirtualNetworkLink (vnl-dev-app-blue-hub-we)
error: 1 error occurred:
autorest/azure: Service returned an error. Status=404 Code="ResourceNotFound"
Message="The Resource 'Microsoft.Network/privateDnsZones/app-blue-dev-blue.privatelink.westeurope.azmk8s.io/virtualNetworkLinks/vnl-dev-app-blue-hub-we'
under resource group 'rg-dev-app-cluster-blue-we'
was not found.
For more details please go to https://aka.ms/ARMResourceNotFoundFix"
This is strange for the following reasons:
- The action that is supposed to create the VirtualNetworkLink resource fails with a 404 saying the resource was not found. It seems this should never happen.
- The VirtualNetworkLink resource was created successfully, despite the Pulumi stack failing.
- When manually deleting the created VirtualNetworkLink and running pulumi up again, it works. That is, Pulumi is able to create the resource and complete the pulumi up operation successfully.
Example
Here is the code (TypeScript) used to create the VirtualNetworkLink resource:
import { VirtualNetworkLink } from '@pulumi/azure-native/network';

// privateZone, resourceGroup and hubNetworkId are defined elsewhere in the program.
const vnetLinkName = 'vnl-dev-app-blue-hub-we';
const vnetLink = new VirtualNetworkLink(vnetLinkName, {
  location: 'Global',
  privateZoneName: privateZone.name,
  registrationEnabled: false,
  resourceGroupName: resourceGroup.name,
  virtualNetwork: {
    id: hubNetworkId,
  },
  virtualNetworkLinkName: vnetLinkName,
});
Output of pulumi about
CLI
Version 3.109.0
Go Version go1.22.0
Go Compiler gc
Plugins
NAME VERSION
nodejs unknown
Host
OS Microsoft Windows 11 Pro
Version 10.0.22631 Build 22631
Arch x86_64
This project is written in nodejs: executable='C:\Program Files\nodejs\node.exe' version='v20.11.1'
Additional context
- Using package "@pulumi/azure-native": "2.30.0"
- It seems we have a 100% repro if the Virtual Network and the Private DNS Zone are created together with the VirtualNetworkLink resource as part of the same pulumi up operation: it always fails when all resources need to be created, and it always works on re-run (when the VNet and DNS zone already exist).
- Note that there shouldn't be a race condition, as we reference both the VNet and the DNS zone in the VirtualNetworkLink creation (see code above). So Pulumi is aware of the dependency and should wait for the dependencies to be created.
Contributing
Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).
I believe that the source of the 404 is that we do a read of each resource immediately after creation (this improves consistency with later refresh commands). What's probably happening is that the vnetLink create command returns successfully, but the resource is actually still initializing (possibly due to waiting on some initialization in the zone or vnet, given the difference in behavior when those are already provisioned), and the read fails to find it.
Possibly we could retry this 404 with a backoff to see if it resolves later.
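To make the retry idea concrete, here is a minimal TypeScript sketch of a read that retries only on 404 with exponential backoff. It is an illustration under assumptions, not the provider's actual code: the names readWithRetryOn404 and backoffDelayMs, the HttpError shape with a statusCode field, and the default retry count and base delay are all hypothetical.

```typescript
// Hypothetical error shape: stands in for whatever error the real client
// returns; only the HTTP status code matters for this sketch.
interface HttpError extends Error {
  statusCode?: number;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... for attempt 0, 1, 2, ...
function backoffDelayMs(attempt: number, baseMs: number): number {
  return baseMs * Math.pow(2, attempt);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `read` and retries only when it fails with a 404, on the assumption
// that the resource was just created and is not yet visible to reads.
// Any other error, or a 404 after `maxRetries` retries, is rethrown.
async function readWithRetryOn404<T>(
  read: () => Promise<T>,
  maxRetries = 5, // assumed default, tune to the service's behavior
  baseMs = 500,   // assumed default
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await read();
    } catch (err) {
      const status = (err as HttpError).statusCode;
      if (status !== 404 || attempt >= maxRetries) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt, baseMs));
    }
  }
}
```

Restricting the retry to 404 keeps genuine failures (authorization errors, bad requests) failing fast, and the retry cap means a resource that truly does not exist still surfaces as an error instead of hanging the deployment.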
Thanks a lot @mjeffryes for looking into this. Do you have any thoughts on whether this will get addressed, and by when? It would help us with planning, as this is currently quite an issue in our automation pipeline: we need to manually retry workflows.