[BUG] managed clusters: repeated ARM application registrations fail, even after deleting the app

Open strat-alex opened this issue 2 years ago • 0 comments

Describe the bug I was testing ARM-deployed applications in managed SF clusters. Initially I used an adapted version of the sample PowerShell script from the traefikproxy app, and everything worked. I then deleted everything from the SF cluster and tried to deploy using Bicep. The deployment failed with FABRIC_E_APPLICATION_TYPE_NOT_FOUND, stating that the ApplicationType and version were not found in the cluster. After I updated the version in the package, registration worked.

I have not tested this in a regular SF cluster.

Area/Component: ARM

To Reproduce Steps to reproduce the behavior:

  1. Register a new application (and perform all the other steps) using PowerShell. I used this script:
Connect-ServiceFabricCluster -ConnectionEndpoint @('FQDN:19000') `
    -X509Credential `
    -FindType FindByThumbprint `
    -FindValue 'SOMEVALUE' `
    -StoreLocation CurrentUser `
    -StoreName 'My' `
    -ServerCertThumbprint 'SOMEVALUE2'

$currentVersion = "1.0.2"

# Register and run the Traefik Application
Write-Information "Removing SF application Traefik"
Remove-ServiceFabricApplication -ApplicationName fabric:/traefik -Force

$appTypes = Get-ServiceFabricApplicationType -ApplicationTypeName TraefikType
if ($appTypes) {
    foreach ($appType in $appTypes) {
        Write-Information "Unregistering SF applicationType TraefikType"
        Unregister-ServiceFabricApplicationType -ApplicationTypeName TraefikType -ApplicationTypeVersion $appType.ApplicationTypeVersion -Force
    }
}

$deploy = $false  # set to $true to copy the package, register the type, and create the application
if ($deploy) {
    Write-Information "Copying package"
    Copy-ServiceFabricApplicationPackage -ApplicationPackagePath .\TraefikProxyApp\ #-ApplicationPackagePathInImageStore traefik

    Write-Information "Registering applicationType"
    Register-ServiceFabricApplicationType -ApplicationPathInImageStore TraefikProxyApp
    $p = @{
        ReverseProxy_FetcherEndpoint          = "7777"
        ReverseProxy_HttpPort                 = "8080"
        ReverseProxy_CertificateSearchKeyword = ""
        ClusterEndpoint                       = "https://localhost:19080"
        CertStoreSearchKey                    = "sfmc"
        ClientCertificate                     = ""
        ClientCertificatePK                   = ""
        ReverseProxy_EnableDashboard          = "true"
        #ReverseProxy_PlacementConstraints="NodeType == NT2"
    }
    $p

    Write-Information "New SF app"
    New-ServiceFabricApplication -ApplicationName fabric:/traefik -ApplicationTypeName TraefikType `
        -ApplicationTypeVersion $currentVersion -ApplicationParameter $p
}
  2. Executing this script multiple times works without issues. The application is removed and then re-uploaded, re-registered, and restarted as expected. I did not raise the version number between runs.
  3. Remove the application (through the portal or the script; it doesn't matter which).
  4. Start the Bicep deployment:
var location = 'north europe'
var clusterName = 'MYCLUSTER'

resource managedSFCluster 'Microsoft.ServiceFabric/managedClusters@2021-11-01-preview' existing = {
  name: clusterName
}

resource appType 'Microsoft.ServiceFabric/managedclusters/applicationTypes@2021-11-01-preview' = {
  name: 'TraefikType'
  parent: managedSFCluster
  location: location
  properties: {}
}

resource appTypeVersion 'Microsoft.ServiceFabric/managedclusters/applicationTypes/versions@2021-11-01-preview' = {
  name: '1.0.3'
  parent: appType
  location: location
  properties: {
    appPackageUrl: 'https://FQDN/TraefikProxyApp.1.0.3.sfpkg'
  }
}

resource app 'Microsoft.ServiceFabric/managedclusters/applications@2021-11-01-preview' = {
  name: 'Traefik'
  parent: managedSFCluster
  location: location
  properties: {
    parameters: {
      ReverseProxy_FetcherEndpoint: '7777'
      ReverseProxy_HttpPort: '8080'
      ReverseProxy_CertificateSearchKeyword: ''
      ClusterEndpoint: 'https://localhost:19080'
      CertStoreSearchKey: 'sfmc'
      ClientCertificate: ''
      ClientCertificatePK: ''
      ReverseProxy_EnableDashboard: 'true'
    }
    upgradePolicy: {
      recreateApplication: true
    }
    version: resourceId('Microsoft.ServiceFabric/managedclusters/applicationTypes/versions', clusterName, appType.name, appTypeVersion.name)
  }
}
  5. Watch it fail with the missing version error. You may have to deploy and remove the app twice before the error appears. It looks as though ARM is caching something. Once the error is hit, the only way to deploy is to increase the version.
  6. Increase the version in the sfpkg and update the Bicep template with the matching version numbers.
  7. Upload and registration now succeed.
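
To tell apart "the cluster no longer has the type version" from "ARM believes something different from the cluster", it may help to query the cluster directly just before starting the Bicep deployment. The following is only a minimal sketch: it assumes a session already connected with Connect-ServiceFabricCluster as in the script above, and the variable names `$typeName` and `$version` are illustrative, not part of the original script.

```powershell
# Assumes Connect-ServiceFabricCluster has already succeeded in this session.
$typeName = 'TraefikType'   # application type name used in this report
$version  = '1.0.3'         # version the Bicep template tries to register

# Ask the cluster which versions of the type it currently has registered.
$registered = Get-ServiceFabricApplicationType -ApplicationTypeName $typeName

if ($registered | Where-Object { $_.ApplicationTypeVersion -eq $version }) {
    Write-Output "$typeName $version is still registered in the cluster."
} else {
    Write-Output "$typeName $version is not registered in the cluster."
}
```

If the version is absent from the cluster while the ARM deployment still reports it as missing after the upload, that would be consistent with the caching suspicion above, but it does not confirm it.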

Expected behavior If an application is removed, it should be possible to register and start a previously used applicationType and version again without increasing the version number, just as the PowerShell script allows.

Observed behavior:

        "statusMessage": "{\"status\":\"Failed\",\"error\":{\"code\":\"ResourceOperationFailure\",\"message\":\"The resource operation completed with terminal provisioning state 'Failed'.\",\"details\":[{\"code\":\"FABRIC_E_APPLICATION_TYPE_NOT_FOUND\",\"message\":\"Application type and version not found\",\"details\":[]}]}}",
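
The statusMessage value is itself a JSON document encoded as a string, so the Service Fabric error code is easy to miss. As a small sketch (assuming the line above is wrapped in braces to form a complete JSON object; the variable names are illustrative), it can be decoded in two passes:

```powershell
# The observed line, wrapped in braces so it parses as a JSON object.
# A single-quoted here-string keeps the backslashes and quotes literal.
$raw = @'
{ "statusMessage": "{\"status\":\"Failed\",\"error\":{\"code\":\"ResourceOperationFailure\",\"message\":\"The resource operation completed with terminal provisioning state 'Failed'.\",\"details\":[{\"code\":\"FABRIC_E_APPLICATION_TYPE_NOT_FOUND\",\"message\":\"Application type and version not found\",\"details\":[]}]}}" }
'@

# First pass: parse the outer object. Second pass: parse the embedded string.
$outer = $raw | ConvertFrom-Json
$inner = $outer.statusMessage | ConvertFrom-Json

# Collect the inner error codes reported under error.details.
$codes = $inner.error.details | ForEach-Object { $_.code }
$codes   # FABRIC_E_APPLICATION_TYPE_NOT_FOUND
```

The outer code, ResourceOperationFailure, is only the generic ARM wrapper; the entry under details carries the error returned by the cluster.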

Service Fabric Runtime Version: Cluster version 8.2.1571.9590, SKU Basic, current fabric version 8.2.1571.9590

Environment:

  • Azure
  • OS: WindowsServer 2019-Datacenter-with-Containers
  • Version 8.2.1571.9590

Assignees: /cc @microsoft/service-fabric-triage

strat-alex, Mar 22 '22 09:03