
Connect-ServiceFabricCluster : No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.

Open KyleTheAutomator opened this issue 6 years ago • 31 comments

I've downloaded the Service Fabric SDK for VS 2017 from here: http://www.microsoft.com/web/handlers/webpi.ashx?command=getinstallerredirect&appid=MicrosoftAzure-ServiceFabric-CoreSDK

The initial install on my Windows 10 v1709 workstation (fully patched) completes successfully. The problem manifests when I try to set up a cluster:

C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup
λ  .\DevClusterSetup.ps1

Using Cluster Data Root: C:\SfDevCluster\Data
Using Cluster Log Root: C:\SfDevCluster\Log

The generated json path is C:\Users\kthompson\AppData\Local\Temp\tmp3B1A.tmp.json
Processing and validating cluster config.
Create node configuration succeeded
Starting service FabricHostSvc. This may take a few minutes...

Waiting for Service Fabric Cluster to be ready. This may take a few minutes...
Local Cluster ready status: 4% completed.
Local Cluster ready status: 8% completed.
Local Cluster ready status: 12% completed.
Local Cluster ready status: 17% completed.
Local Cluster ready status: 21% completed.
Local Cluster ready status: 25% completed.
Local Cluster ready status: 29% completed.
Local Cluster ready status: 33% completed.
Local Cluster ready status: 38% completed.
Local Cluster ready status: 42% completed.
Local Cluster ready status: 46% completed.
Local Cluster ready status: 50% completed.
Local Cluster ready status: 54% completed.
Local Cluster ready status: 58% completed.
Local Cluster ready status: 62% completed.
Local Cluster ready status: 67% completed.
Local Cluster ready status: 71% completed.
Local Cluster ready status: 75% completed.
Local Cluster ready status: 79% completed.
Local Cluster ready status: 83% completed.
Local Cluster ready status: 88% completed.
Local Cluster ready status: 92% completed.
Local Cluster ready status: 96% completed.
Local Cluster ready status: 100% completed.
WARNING: Service Fabric Cluster is taking longer than expected to connect.

Waiting for Naming Service to be ready. This may take a few minutes...
No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.
Connect-ServiceFabricCluster : No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS
issue.
At C:\Program Files\Microsoft SDKs\Service Fabric\Tools\Scripts\ClusterSetupUtilities.psm1:620 char:12
+     [void](Connect-ServiceFabricCluster @connParams)
+            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [Connect-ServiceFabricCluster], FabricException
    + FullyQualifiedErrorId : TestClusterConnectionErrorId,Microsoft.ServiceFabric.Powershell.ConnectCluster

I've been pulling my hair out over this for the last couple of days. Here are the things I've tried:

  • Firewall exception on port 19000.
  • Uninstall and re-install of Service Fabric SDK.
  • Repair of vcredist.
  • Executed CleanCluster.ps1 before running DevClusterSetup.ps1
  • Uninstall and re-install of Visual Studio 2017.
  • Excluded applicable folders and processes from antivirus.
  • Ensured PowerShell execution policy is set to Unrestricted.

KyleTheAutomator avatar May 16 '18 15:05 KyleTheAutomator

Can you share the generated json template? C:\Users\kthompson\AppData\Local\Temp\tmp3B1A.tmp.json

mikkelhegn avatar May 16 '18 21:05 mikkelhegn

{
    "name":  "DevCluster",
    "clusterConfigurationVersion":  "1.0.0",
    "apiVersion":  "10-2017",
    "nodes":  [
                  {
                      "nodeName":  "_Node_0",
                      "iPAddress":  "ComputerFullName",
                      "nodeTypeRef":  "NodeType0",
                      "faultDomain":  "fd:/0",
                      "upgradeDomain":  "0"
                  }
              ],
    "properties":  {
                       "diagnosticsStore":  {
                                                "metadata":  "Please replace the diagnostics file share with an actual file share accessible from all cluster machines.",
                                                "dataDeletionAgeInDays":  "3",
                                                "storeType":  "FileShare",
                                                "connectionstring":  "%systemdrive%\\ProgramData\\SF\\DiagnosticsStore"
                                            },
                       "nodeTypes":  [
                                         {
                                             "name":  "NodeType0",
                                             "clientConnectionEndpointPort":  "19000",
                                             "clusterConnectionEndpointPort":  "19002",
                                             "leaseDriverEndpointPort":  "19001",
                                             "serviceConnectionEndpointPort":  "19006",
                                             "httpGatewayEndpointPort":  "19080",
                                             "reverseProxyEndpointPort":  "19081",
                                             "applicationPorts":  {
                                                                      "startPort":  "30001",
                                                                      "endPort":  "31000"
                                                                  },
                                             "isPrimary":  true
                                         }
                                     ],
                       "fabricSettings":  [
                                              {
                                                  "name":  "Setup",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "FabricDataRoot",
                                                                         "value":  "C:\\SfDevCluster\\Data"
                                                                     },
                                                                     {
                                                                         "name":  "FabricLogRoot",
                                                                         "value":  "C:\\SfDevCluster\\Log"
                                                                     },
                                                                     {
                                                                         "value":  "true",
                                                                         "name":  "IsDevCluster"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "Diagnostics",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "ProducerInstances",
                                                                         "value":  "ServiceFabricEtlFile,ServiceFabricPerfCtrFolder"
                                                                     },
                                                                     {
                                                                         "name":  "MaxDiskQuotaInMB",
                                                                         "value":  "10240"
                                                                     },
                                                                     {
                                                                         "name":  "EnableCircularTraceSession",
                                                                         "value":  "true"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "FabricClient",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "HealthReportSendInterval",
                                                                         "value":  "0"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "Failover",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "SendToFMTimeout",
                                                                         "value":  "1"
                                                                     },
                                                                     {
                                                                         "name":  "NodeUpRetryInterval",
                                                                         "value":  "1"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "Federation",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "NodeIdGeneratorVersion",
                                                                         "value":  "V4"
                                                                     },
                                                                     {
                                                                         "name":  "UnresponsiveDuration",
                                                                         "value":  "0"
                                                                     },
                                                                     {
                                                                         "name":  "ProcessAssertExitTimeout",
                                                                         "value":  "86400"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "Hosting",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "EndpointProviderEnabled",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "RunAsPolicyEnabled",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "EnableProcessDebugging",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "DeactivationScanInterval",
                                                                         "value":  "600"
                                                                     },
                                                                     {
                                                                         "name":  "DeactivationGraceInterval",
                                                                         "value":  "2"
                                                                     },
                                                                     {
                                                                         "name":  "ServiceTypeRegistrationTimeout",
                                                                         "value":  "20"
                                                                     },
                                                                     {
                                                                         "name":  "CacheCleanupScanInterval",
                                                                         "value":  "300"
                                                                     },
                                                                     {
                                                                         "name":  "DeploymentRetryBackoffInterval",
                                                                         "value":  "1"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "Management",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "ImageStoreConnectionString",
                                                                         "value":  "ImageStoreConnectionStringPlaceHolder"
                                                                     },
                                                                     {
                                                                         "name":  "ImageCachingEnabled",
                                                                         "value":  "false"
                                                                     },
                                                                     {
                                                                         "name":  "EnableDeploymentAtDataRoot",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "DisableChecksumValidation",
                                                                         "value":  "true"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "PlacementAndLoadBalancing",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "MinLoadBalancingInterval",
                                                                         "value":  "300"
                                                                     },
                                                                     {
                                                                         "name":  "TraceCRMReasons",
                                                                         "value":  "false"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "ReconfigurationAgent",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "IsDeactivationInfoEnabled",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "ServiceApiHealthDuration",
                                                                         "value":  "20"
                                                                     },
                                                                     {
                                                                         "name":  "ServiceReconfigurationApiHealthDuration",
                                                                         "value":  "20"
                                                                     },
                                                                     {
                                                                         "name":  "LocalHealthReportingTimerInterval",
                                                                         "value":  "5"
                                                                     },
                                                                     {
                                                                         "name":  "RAUpgradeProgressCheckInterval",
                                                                         "value":  "3"
                                                                     },
                                                                     {
                                                                         "name":  "RAPMessageRetryInterval",
                                                                         "value":  "0.5"
                                                                     },
                                                                     {
                                                                         "name":  "MinimumIntervalBetweenRAPMessageRetry",
                                                                         "value":  "0.5"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "ServiceFabricEtlFile",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "DataDeletionAgeInDays",
                                                                         "value":  "3"
                                                                     },
                                                                     {
                                                                         "name":  "IsEnabled",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "ProducerType",
                                                                         "value":  "EtlFileProducer"
                                                                     },
                                                                     {
                                                                         "name":  "EtlReadIntervalInMinutes",
                                                                         "value":  "5"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "ServiceFabricPerfCtrFolder",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "DataDeletionAgeInDays",
                                                                         "value":  "3"
                                                                     },
                                                                     {
                                                                         "name":  "IsEnabled",
                                                                         "value":  "true"
                                                                     },
                                                                     {
                                                                         "name":  "ProducerType",
                                                                         "value":  "FolderProducer"
                                                                     },
                                                                     {
                                                                         "name":  "FolderType",
                                                                         "value":  "ServiceFabricPerformanceCounters"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "Trace/Etw",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "Level",
                                                                         "value":  "4"
                                                                     }
                                                                 ]
                                              },
                                              {
                                                  "name":  "TransactionalReplicator",
                                                  "parameters":  [
                                                                     {
                                                                         "name":  "CheckpointThresholdInMB",
                                                                         "value":  "64"
                                                                     }
                                                                 ]
                                              }
                                          ],
                       "addOnFeatures":  [
                                             "DnsService"
                                         ]
                   }
}

KyleTheAutomator avatar May 16 '18 21:05 KyleTheAutomator

@maburlik - I don't see anything obvious from the manifest.

mikkelhegn avatar May 16 '18 21:05 mikkelhegn

I wonder if you are seeing the same issue as reported in microsoft/service-fabric-issues#1056. Would you mind checking:

  1. Whether Fabric.exe process is running or not.
  2. If not running, presence of following errors in Event Log: image

knizkar avatar May 17 '18 09:05 knizkar

Spot on. I see the following in my logs:

Fabric Node open failed with error code = E_ACCESSDENIED

Also seeing:

HostedService: _Node_0 on node id bf865279ba277deb864a976fbf4c200e terminated unexpectedly with code 7167 and process name Fabric.exe

HostedServiceInstance:HostedService/_Node_0_Fabric terminated with exitcode 7167

client-localhost:19000/127.0.0.1:19000: error = 2147943625, failureCount=93. Filter by (type~Transport.St && ~"(?i)localhost:19000") to get listener lifecycle. Connect failure is expected if listener was never started, or listener/its process was stopped before/during connecting.
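The numeric error in the transport line above is an HRESULT printed in decimal. As a rough sketch (the helper name `decode_hresult` and the small lookup table are illustrative assumptions, not part of any Service Fabric tooling), it can be split into its standard fields. 2147943625 is 0x800704C9, which wraps Win32 error 1225 (ERROR_CONNECTION_REFUSED); that matches the log's own note that nothing was listening on port 19000.

```python
# Split a Windows HRESULT into its standard fields (severity, facility, code).
FACILITY_WIN32 = 7

# Illustrative lookup of a few Win32 error codes; not an exhaustive table.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED",
    1225: "ERROR_CONNECTION_REFUSED",
}


def decode_hresult(value: int) -> dict:
    """Decode an HRESULT given as a signed or unsigned 32-bit integer."""
    value &= 0xFFFFFFFF                 # normalise negative (signed) input
    failed = bool(value >> 31)          # bit 31: severity (1 = failure)
    facility = (value >> 16) & 0x7FF    # bits 16-26: facility
    code = value & 0xFFFF               # bits 0-15: error code
    name = WIN32_ERRORS.get(code) if facility == FACILITY_WIN32 else None
    return {
        "hex": f"0x{value:08X}",
        "failed": failed,
        "facility": facility,
        "code": code,
        "name": name,
    }


if __name__ == "__main__":
    print(decode_hresult(2147943625))
```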

KyleTheAutomator avatar May 17 '18 15:05 KyleTheAutomator

One of our primary use cases in evaluating Service Fabric is to use it for containers. Is there documentation on how to configure a dev cluster for containers using self signed tls certs?

KyleTheAutomator avatar May 17 '18 15:05 KyleTheAutomator

Thanks @knizkar - let's track this on microsoft/service-fabric-issues#1056.

@MisterPuffyPants - Regarding setting up a dev cluster with containers: a doc will be posted in the next few days, as this is only officially supported in 6.2. The main thing is to make sure that the Docker service is started when creating the cluster; that enables container support in Service Fabric.

mikkelhegn avatar May 17 '18 18:05 mikkelhegn

Exactly same issue here. Any updates?

medeirosle avatar Jun 25 '18 18:06 medeirosle

Had the same issue; the only thing that helped was going back to 6.2.283/3.1.283.

vitalybibikov avatar Jul 24 '18 10:07 vitalybibikov

Any updates? Still see it in the newest version

vitalybibikov avatar Aug 27 '18 10:08 vitalybibikov

@EvilAvenger: Catching up on this issue, have you gone through the solutions proposed in this issue? https://github.com/Azure/service-fabric-issues/issues/1056

mikkelhegn avatar Aug 27 '18 12:08 mikkelhegn

@MikkelHegn

Yes, I did, and it does not work. Currently the issue is occurring on our deployment machine, so I can't properly test it (as it blocks my team).

The only thing that really helps is installing 6.2.283.9494. (Installing a prior version and then copying the files from 6.2.283 to "C:\Program Files\Microsoft SDKs\Service Fabric" helps as well.)

None of the other versions work, so the issue may have been introduced somewhere in *.301.

What I've tried:

  • Checked that my WinFirewall service is working and is not blocking ports;
  • Checked that "everyone" has write permissions;
  • Checked with netstat that nothing else is listening on port 19000;
  • Execution policy is set to Bypass;
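The port check in the list above can also be scripted. This is a minimal sketch, assuming a plain TCP connect is an acceptable probe; the helper name `port_is_open` is made up for illustration. It reports whether anything accepts connections on a port, which helps separate "another process holds 19000" from "Fabric.exe never started its listener".

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 19000 and 19080 are the client and HTTP gateway ports from the
    # generated cluster config posted earlier in this thread.
    for port in (19000, 19080):
        state = "listening" if port_is_open("127.0.0.1", port) else "not reachable"
        print(f"127.0.0.1:{port} is {state}")
```

A probe like this only shows that something is listening; it does not identify the owning process, which is what netstat is still needed for.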

Event log issues: I can't provide the full event log at the moment because I've reinstalled the service, but I saw several records in it:

  1. FileChangeMonitor failed with E_ACCESSDENIED
  2. FolderACLManager::Install failed with error E_INVALIDARG
  3. GetFileAttributesEx failed with the following error 5

vitalybibikov avatar Aug 28 '18 07:08 vitalybibikov

Thanks for your patience on this one @EvilAvenger. @maburlik for the diagnostics info above, do you have any ideas what might be causing this?

mikkelhegn avatar Aug 28 '18 08:08 mikkelhegn

Also blocked by this now @MikkelHegn . Anyone any closer to figuring out what is going on? I have tried all the workarounds and it's no use.

andrewcoll avatar Sep 08 '18 19:09 andrewcoll

Folks, if the workaround mentioned in microsoft/service-fabric-issues#1056 isn't working for you, can you please share full setup logs from the environment? Maybe you are running into something else here.

(Assuming Windows) The reg key HKLM\SOFTWARE\Microsoft\ServiceFabric\FabricLogRoot should point to the location of the logs. Zip the directory and attach the file here; you can also zip and email it to us (raunakp, or mikhegn at microsoft dot com) if you want.
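Collecting the logs as described above could be scripted roughly as follows. This is a sketch under assumptions: it treats FabricLogRoot as a value under the HKLM\SOFTWARE\Microsoft\ServiceFabric key, and the function names are hypothetical. The registry read is Windows-only; the zip step works on any directory.

```python
import shutil
from pathlib import Path


def read_fabric_log_root() -> str:
    """Read the log root from the registry (Windows only).

    Assumes FabricLogRoot is a value under the ServiceFabric key.
    """
    import winreg  # Windows-only standard library module

    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\Microsoft\ServiceFabric"
    ) as key:
        value, _value_type = winreg.QueryValueEx(key, "FabricLogRoot")
    return value


def zip_log_root(log_root: str, archive_base: str) -> str:
    """Zip the contents of log_root into archive_base + '.zip'; return its path."""
    root = Path(log_root)
    if not root.is_dir():
        raise FileNotFoundError(f"Log root not found: {root}")
    return shutil.make_archive(archive_base, "zip", root_dir=str(root))


# Example (Windows, from an elevated prompt if the logs require it):
#   zip_log_root(read_fabric_log_root(), r"C:\Temp\sf-logs")
```

Write the archive outside the log directory so that it is not swept into itself.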

raunakpandya avatar Sep 10 '18 07:09 raunakpandya

Log (2).zip

Logs attached.

andrewcoll avatar Sep 10 '18 20:09 andrewcoll

Just to give my two cents on this issue: I was also having the same problem with Windows 10 and the latest SDK. I had checked the Windows firewall, removed Webroot AV, reinstalled the SDK multiple times, reverted to older SDKs, checked the folder permissions, changed to the Network Service account, and tried the other solutions proposed in this issue: https://github.com/Azure/service-fabric-issues/issues/1056

The fix for me was quite simple: @JayRidge95 noticed the hostname was being truncated in the event logs. My computer name was longer than the 15-character NetBIOS name limit, so we changed my computer name to be shorter than 15 characters, reinstalled the SDK, and it worked fine.
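A quick way to test a machine for this condition is to compare the host name's length against the 15-character NetBIOS limit. This is a minimal sketch; the function name `exceeds_netbios_limit` is illustrative, and it assumes only the first DNS label of the name matters.

```python
import socket

NETBIOS_MAX_LENGTH = 15  # NetBIOS computer names hold at most 15 characters


def exceeds_netbios_limit(computer_name: str) -> bool:
    """Return True if the host part of the name is longer than 15 characters."""
    short_name = computer_name.split(".", 1)[0]  # drop any DNS suffix
    return len(short_name) > NETBIOS_MAX_LENGTH


if __name__ == "__main__":
    name = socket.gethostname()
    if exceeds_netbios_limit(name):
        print(f"{name!r} is longer than {NETBIOS_MAX_LENGTH} characters")
    else:
        print(f"{name!r} fits within the NetBIOS limit")
```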

Bit of an odd one but it took me about 3 days to get to that point so this might save some people time.

tjackadams avatar Sep 27 '18 10:09 tjackadams

@tjackadams this works like a charm. I just shortened the computer name. I had been stuck on this issue for the last 4 days.

sandipuchdadiya avatar Sep 28 '18 05:09 sandipuchdadiya

@tjackadams thanks. It worked. Dear SF team, can you fix this issue, or at least provide a better error message so the problem and its solution can be identified quickly?

petrformanek avatar Oct 01 '18 10:10 petrformanek

This workaround did not work for me. :( It's still not working.

@raunakpandya is there any update on this?

andrewcoll avatar Oct 08 '18 17:10 andrewcoll

@andrewcoll +1 Not working for me as well

kuvinodms avatar Oct 09 '18 10:10 kuvinodms

@andrewcoll - Have you tried the workaround of setting FabricContainerAppsEnabled to false? If not, can you try it in the ClusterManifestTemplate.json files under %programfiles%\Microsoft SDKs\Service Fabric\ClusterSetup? (There is one file per type of one-box cluster you can bring up.)

Add the following parameter under the Hosting section:

      {
        "name": "FabricContainerAppsEnabled",
        "value": "false"
      }
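For anyone applying this edit to several template files, here is a hedged sketch of doing it programmatically. It assumes the template uses the same properties/fabricSettings layout as the generated JSON posted earlier in this thread, which is not confirmed here; the function name `disable_container_apps` is hypothetical.

```python
import json


def disable_container_apps(manifest: dict) -> bool:
    """Set FabricContainerAppsEnabled to "false" in the Hosting section.

    Returns True if the manifest was changed. Assumes the layout
    properties -> fabricSettings -> [{name, parameters: [{name, value}]}].
    """
    settings = manifest.setdefault("properties", {}).setdefault("fabricSettings", [])
    hosting = next((s for s in settings if s.get("name") == "Hosting"), None)
    if hosting is None:
        hosting = {"name": "Hosting", "parameters": []}
        settings.append(hosting)
    params = hosting.setdefault("parameters", [])
    for param in params:
        if param.get("name") == "FabricContainerAppsEnabled":
            changed = param.get("value") != "false"
            param["value"] = "false"
            return changed
    params.append({"name": "FabricContainerAppsEnabled", "value": "false"})
    return True


if __name__ == "__main__":
    sample = {"properties": {"fabricSettings": [{"name": "Hosting", "parameters": []}]}}
    disable_container_apps(sample)
    print(json.dumps(sample, indent=2))
```

Back up the template before overwriting it, and run CleanCluster.ps1 followed by DevClusterSetup.ps1 afterwards so the cluster is rebuilt from the modified file.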

raunakpandya avatar Oct 09 '18 11:10 raunakpandya

@raunakpandya yes, I tried that, it didn't work either. I attached my logs in a previous comment.

andrewcoll avatar Oct 09 '18 12:10 andrewcoll

Yes, I did look at the logs. Strange. Which JSON file did you modify? Can you attach it? Also, which one-box mode are you trying to bring up (secure/unsecure, 1 box/5 box)?

raunakpandya avatar Oct 09 '18 13:10 raunakpandya

@raunakpandya's answer worked for me. Thanks!!!

abnerescocio avatar Oct 11 '18 19:10 abnerescocio

@tjackadams your solution worked for me. Shorten computer name (was longer than 15 characters). Thank you!

caretro avatar Dec 03 '18 16:12 caretro

FabricContainerAppsEnabled

@raunakpandya could you please explain why disabling this setting solves the issue?

Kassoul avatar Dec 14 '18 14:12 Kassoul

@Kassoul - This has the details: https://github.com/Azure/service-fabric-issues/issues/1056#issuecomment-400413031

By disabling that, the self signed certificate is no longer created.

raunakpandya avatar Dec 14 '18 14:12 raunakpandya

I have seen the same error when trying to start up my local cluster. In my case, the error message 'HostService: <Node> on node id terminated unexpectedly with code 3221225781 and process name Fabric.exe' showed that Fabric.exe was missing a DLL. For me, the issue was that some of the VC++ DLLs had gone missing, and it can be fixed by reinstalling "C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code\vcredist_x64.exe".
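The exit code in that message reads more easily in hexadecimal: 3221225781 is 0xC0000135, the NTSTATUS value STATUS_DLL_NOT_FOUND. A small sketch of the conversion follows; the helper name and the two-entry table are illustrative only.

```python
# A few NTSTATUS values that can appear as Windows process exit codes.
# This table is illustrative, not exhaustive.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}


def describe_exit_code(exit_code: int) -> str:
    """Format a process exit code as hex, with a name when it is in the table."""
    value = exit_code & 0xFFFFFFFF  # normalise signed 32-bit values
    name = NTSTATUS_NAMES.get(value, "unknown")
    return f"0x{value:08X} ({name})"


if __name__ == "__main__":
    print(describe_exit_code(3221225781))
```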

sorawitamorn avatar Sep 06 '21 09:09 sorawitamorn

I have seen the same error when trying to start up my local cluster. In my case, the error message 'HostService: <Node> on node id terminated unexpectedly with code 3221225781 and process name Fabric.exe' showed that Fabric.exe was missing a DLL. For me, the issue was that some of the VC++ DLLs had gone missing, and it can be fixed by reinstalling "C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code\vcredist_x64.exe".

This fixes the issue for me!

seb-emmot avatar Apr 27 '22 08:04 seb-emmot