
sfctl partition restart returns (FMFailoverUnitNotFound) Null

Open sealbus opened this issue 5 years ago • 6 comments

I'm trying to restart a partition using the sfctl partition restart command. It is not working as expected. Here is the debug output:

Command arguments: ['partition', 'restart', '--restart-partition-mode', 'AllReplicasOrInstances', '--service-id', 'IOTApplication/RedisService', '--partition-id', 'b09fb219-248c-244b-b298-27a0db25c053', '--operation-id', '0ebed82a-e90f-470a-aba9-8caf62ae5840', '--debug']
Event: Cli.PreExecute []
Event: CommandParser.OnGlobalArgumentsCreate [<function CLILogging.on_global_arguments at 0x0000017F8E46D1E0>, <function OutputProducer.on_global_arguments at 0x0000017F8E542268>, <function CLIQuery.on_global_arguments at 0x0000017F8E569D08>]
Event: CommandInvoker.OnPreCommandTableCreate []
Event: CommandLoader.OnLoadArguments []
Event: CommandInvoker.OnPostCommandTableCreate []
Event: CommandInvoker.OnCommandTableLoaded []
Event: CommandInvoker.OnPreParseArgs []
Event: CommandInvoker.OnPostParseArgs [<function OutputProducer.handle_output_argument at 0x0000017F8E5422F0>, <function CLIQuery.handle_query_parameter at 0x0000017F8E569D90>]
msrest.service_client : Accept header absent and forced to application/json
msrest.pipeline : Configuring request: timeout=100, verify=True, cert=None
msrest.pipeline : Configuring proxies: ''
msrest.pipeline : Evaluate proxies against ENV settings: True
msrest.pipeline : Configuring redirects: allow=True, max=30
msrest.pipeline : Configuring retry: max_retries=False, backoff_factor=0.8, max_backoff=90
urllib3.connectionpool : Starting new HTTP connection (1): localhost:19081
urllib3.connectionpool : http://localhost:19081 "POST /Faults/Services/IOTApplication/RedisService/$/GetPartitions/b09fb219-248c-244b-b298-27a0db25c053/$/StartRestart?api-version=6.0&OperationId=0ebed82a-e90f-470a-aba9-8caf62ae5840&RestartPartitionMode=AllReplicasOrInstances&timeout=60 HTTP/1.1" 500 60
msrest.exceptions : (FMFailoverUnitNotFound) Null
(FMFailoverUnitNotFound) Null
Traceback (most recent call last):
  File "c:\users\xxxx\appdata\local\programs\python\python36\lib\site-packages\knack\cli.py", line 206, in invoke
    cmd_result = self.invocation.execute(args)
  File "c:\users\xxxx\appdata\local\programs\python\python36\lib\site-packages\sfctl\entry.py", line 81, in execute
    return super(SFInvoker, self).execute(args)
  File "c:\users\xxxx\appdata\local\programs\python\python36\lib\site-packages\knack\invocation.py", line 188, in execute
    cmd_result = parsed_args.func(params)
  File "c:\users\xxxx\appdata\local\programs\python\python36\lib\site-packages\knack\commands.py", line 105, in __call__
    return self.handler(*args, **kwargs)
  File "c:\users\xxxx\appdata\local\programs\python\python36\lib\site-packages\knack\commands.py", line 212, in _command_handler
    result = op(client, **command_args) if client else op(**command_args)
  File "c:\users\xxxx\appdata\local\programs\python\python36\lib\site-packages\azure\servicefabric\service_fabric_client_ap_is.py", line 11507, in start_partition_restart
    raise models.FabricErrorException(self._deserialize, response)
azure.servicefabric.models.fabric_error_py3.FabricErrorException: (FMFailoverUnitNotFound) Null
Performing cluster version check
msrest.pipeline : Configuring request: timeout=100, verify=True, cert=None
msrest.pipeline : Configuring proxies: ''
msrest.pipeline : Evaluate proxies against ENV settings: True
msrest.pipeline : Configuring redirects: allow=True, max=30
msrest.pipeline : Configuring retry: max_retries=3, backoff_factor=0.8, max_backoff=90
urllib3.connectionpool : Starting new HTTP connection (1): localhost:19081
urllib3.connectionpool : http://localhost:19081 "GET /$/GetClusterVersion?api-version=6.4&timeout=60 HTTP/1.1" 200 23

sfctl version: 7.1.0
Service Fabric runtime: 6.4

sealbus, Mar 14 '19 15:03

Thank you for reporting this error! We will take a look and get back to you soon. In the meantime, since you are on Windows, you could try the PowerShell client to unblock yourself.

Christina-Kang, Mar 21 '19 18:03

Thanks for your reply. I run sfctl in a Windows environment, but my Service Fabric cluster is on Linux. According to https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-linux-windows-differences, Restart-ServiceFabricPartition does not work against a Linux Service Fabric cluster.

sealbus, Mar 22 '19 15:03

Hi @sealbus,

This doesn't look like an issue with sfctl itself. Could you please share the Service Fabric traces for your cluster? They are located at C:\SfDevCluster\Log\Traces and have names starting with fabric_traces_... Please also include the approximate time frame in which the operation took place.

Is this a secure local cluster?

Another thing I wanted to double check is the port. I see that yours is set to 19081 - is this intentionally done? By default, we expect it to be 19080 if you are using all default settings.

Thanks!
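To make the port comparison concrete, here is a minimal sketch that rebuilds the request URL in the same shape as the POST line in the debug output above, once for each port. The helper name start_restart_url is hypothetical and is not part of sfctl; the code only builds strings and does not contact the cluster.

```python
from urllib.parse import urlencode


def start_restart_url(endpoint, service_id, partition_id, operation_id,
                      mode="AllReplicasOrInstances", timeout=60):
    """Build a StartRestart URL shaped like the one in the debug log.

    Hypothetical helper for illustration only; sfctl builds this internally.
    """
    # Path layout copied from the POST line in the debug output.
    path = (f"/Faults/Services/{service_id}/$/GetPartitions/"
            f"{partition_id}/$/StartRestart")
    # Query parameters in the same order as they appear in the log.
    query = urlencode({
        "api-version": "6.0",
        "OperationId": operation_id,
        "RestartPartitionMode": mode,
        "timeout": timeout,
    })
    return f"{endpoint.rstrip('/')}{path}?{query}"


if __name__ == "__main__":
    # Print the request for the default port (19080) and the one in the log (19081).
    for port in (19080, 19081):
        print(start_restart_url(
            f"http://localhost:{port}",
            "IOTApplication/RedisService",
            "b09fb219-248c-244b-b298-27a0db25c053",
            "0ebed82a-e90f-470a-aba9-8caf62ae5840",
        ))
```

If the cluster uses the default settings, pointing sfctl at the default port with sfctl cluster select --endpoint http://localhost:19080 and retrying the restart would show whether the port is the cause.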

Christina-Kang, Mar 25 '19 18:03

I get the same issue when running the command: sudo sfctl chaos get

MZDN, May 17 '19 13:05

Thank you, @MZDN, for reporting! Taking a look.

Christina-Kang, May 17 '19 18:05

@MZDN can you share some additional info please? Are you also running sfctl version 7.1.0 and runtime 6.4? Which Python version are you using? Can you also double check that the Fault Analysis Service is enabled on your cluster? It appears in the cluster manifest, under FabricSettings, as Section Name="FaultAnalysisService" with the parameters MinReplicaSetSize and TargetReplicaSetSize. If it is not enabled, can you enable it and try the get command again? Thanks!
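As a rough way to run that check, the sketch below scans a cluster manifest for a FaultAnalysisService section carrying both parameters. The sample manifest is a trimmed-down, hypothetical stand-in with made-up values (a real manifest has many more sections and usually an XML namespace, which the code ignores), and the function name fault_analysis_enabled is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Parameters named in the comment above.
REQUIRED_PARAMS = {"MinReplicaSetSize", "TargetReplicaSetSize"}

# Hypothetical, heavily trimmed manifest; the values are illustrative only.
SAMPLE_MANIFEST = """<ClusterManifest>
  <FabricSettings>
    <Section Name="FaultAnalysisService">
      <Parameter Name="MinReplicaSetSize" Value="1" />
      <Parameter Name="TargetReplicaSetSize" Value="1" />
    </Section>
  </FabricSettings>
</ClusterManifest>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}Section' down to 'Section'."""
    return tag.rsplit("}", 1)[-1]


def fault_analysis_enabled(manifest_xml):
    """Return True if FabricSettings has a FaultAnalysisService section
    that defines both required parameters."""
    root = ET.fromstring(manifest_xml)
    for element in root.iter():
        if local_name(element.tag) != "FabricSettings":
            continue
        for section in element:
            if (local_name(section.tag) == "Section"
                    and section.get("Name") == "FaultAnalysisService"):
                names = {p.get("Name") for p in section
                         if local_name(p.tag) == "Parameter"}
                return REQUIRED_PARAMS <= names
    return False


if __name__ == "__main__":
    print(fault_analysis_enabled(SAMPLE_MANIFEST))
```

To use it against a real cluster, the manifest text would replace SAMPLE_MANIFEST; how the manifest is exported is left out here.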

Christina-Kang, May 20 '19 23:05