protoactor-dotnet

Unable to update pod labels, registration failed on K8S

Open MoienTajik opened this issue 1 year ago • 3 comments

Expected Behavior

  • Deploy successfully on K8S

Actual Behavior

The application throws this exception on startup after upgrading all Proto.Actor packages, including Proto.Cluster.Kubernetes, from version 0.30.0 to 1.0.0-rc2.36:

[17:02:53 WRN]:[Proto.Cluster.Kubernetes.KubernetesProvider] Failed to register service 
k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'UnprocessableEntity'
   at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
   at k8s.Kubernetes.PatchNamespacedPodWithHttpMessagesAsync(V1Patch body, String name, String namespaceParameter, String dryRun, String fieldManager, String fieldValidation, Nullable`1 force, Nullable`1 pretty, IDictionary`2 customHeaders, CancellationToken cancellationToken)
   at k8s.KubernetesExtensions.PatchNamespacedPodAsync(IKubernetes operations, V1Patch body, String name, String namespaceParameter, String dryRun, String fieldManager, String fieldValidation, Nullable`1 force, Nullable`1 pretty, CancellationToken cancellationToken)
   at Proto.Cluster.Kubernetes.KubernetesProvider.RegisterMemberInner()
   at Proto.Utils.Retry.Try(Func`1 body, Int32 retryCount, Int32 backoffMilliSeconds, Int32 maxBackoffMilliseconds, Action`2 onError, Action`1 onFailed, Boolean ignoreFailure)
[17:02:56 INF]:[Proto.Cluster.Kubernetes.KubernetesProvider] [Cluster][KubernetesProvider] Registering service core-sales-7479dbdd94-x5wlx on [K8S-IP]:45889 
[17:02:56 INF]:[Proto.Cluster.Kubernetes.KubernetesProvider] [Cluster][KubernetesProvider] Using Kubernetes namespace: hotel-stage 
[17:02:56 INF]:[Proto.Cluster.Kubernetes.KubernetesProvider] [Cluster][KubernetesProvider] Using Kubernetes port: 45889 
[17:02:56 ERR]:[Proto.Cluster.Kubernetes.KubernetesProvider] [Cluster][KubernetesProvider] Unable to update pod labels, registration failed. Labels : {"cluster.proto.actor/cluster": "Alibaba.Hotel.Cluster", "cluster.proto.actor/port": "45889", "cluster.proto.actor/member-id": "ac98452bdc5b404690080f774d200048", "cluster.proto.actor/kind-alibaba.hotel.provider.contracts/AvailableOrchestratorGrain": "true", "cluster.proto.actor/kind-alibaba.hotel.provider.contracts/ProviderGrain": "true", "cluster.proto.actor/kind-prototopic": "true", "app": "core-sales", "pod-template-hash": "7479dbdd94"} 
k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'UnprocessableEntity'
   at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
   at k8s.Kubernetes.PatchNamespacedPodWithHttpMessagesAsync(V1Patch body, String name, String namespaceParameter, String dryRun, String fieldManager, String fieldValidation, Nullable`1 force, Nullable`1 pretty, IDictionary`2 customHeaders, CancellationToken cancellationToken)
   at k8s.KubernetesExtensions.PatchNamespacedPodAsync(IKubernetes operations, V1Patch body, String name, String namespaceParameter, String dryRun, String fieldManager, String fieldValidation, Nullable`1 force, Nullable`1 pretty, CancellationToken cancellationToken)
   at Proto.Cluster.Kubernetes.KubernetesProvider.RegisterMemberInner()
[17:02:56 WRN]:[Proto.Cluster.Kubernetes.KubernetesProvider] Failed to register service 
k8s.Autorest.HttpOperationException: Operation returned an invalid status code 'UnprocessableEntity'
   at k8s.Kubernetes.SendRequestRaw(String requestContent, HttpRequestMessage httpRequest, CancellationToken cancellationToken)
   at k8s.Kubernetes.PatchNamespacedPodWithHttpMessagesAsync(V1Patch body, String name, String namespaceParameter, String dryRun, String fieldManager, String fieldValidation, Nullable`1 force, Nullable`1 pretty, IDictionary`2 customHeaders, CancellationToken cancellationToken)
   at k8s.KubernetesExtensions.PatchNamespacedPodAsync(IKubernetes operations, V1Patch body, String name, String namespaceParameter, String dryRun, String fieldManager, String fieldValidation, Nullable`1 force, Nullable`1 pretty, CancellationToken cancellationToken)
   at Proto.Cluster.Kubernetes.KubernetesProvider.RegisterMemberInner()
   at Proto.Utils.Retry.Try(Func`1 body, Int32 retryCount, Int32 backoffMilliSeconds, Int32 maxBackoffMilliseconds, Action`2 onError, Action`1 onFailed, Boolean ignoreFailure)

Specifications

  • Proto.Actor Version: 1.0.0-rc2.36
  • Runtime: .NET 6
  • K8S server version: {Major:"1", Minor:"23", GitVersion:"v1.23.7"}

MoienTajik avatar Sep 11 '22 17:09 MoienTajik

This does not look like a valid k8s label key: cluster.proto.actor/kind-alibaba.hotel.provider.contracts/AvailableOrchestratorGrain. According to the spec, only a single slash is allowed, as it separates the optional label prefix from the label name.
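For illustration, here is a minimal sketch of that rule as I understand it from the Kubernetes label syntax: an optional DNS-subdomain prefix (at most 253 characters), a single '/', and a name segment of at most 63 characters that starts and ends with an alphanumeric character. The function name `is_valid_label_key` is made up for this example and is not part of Proto.Actor or the Kubernetes client.

```python
import re

# Name segment: 1 to 63 chars, alphanumeric at both ends,
# with '-', '_' and '.' allowed in between.
NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?")

# Prefix: a DNS subdomain, i.e. lowercase DNS labels separated by dots.
DNS_LABEL = r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?"
PREFIX_RE = re.compile(rf"{DNS_LABEL}(\.{DNS_LABEL})*")


def is_valid_label_key(key: str) -> bool:
    """Hypothetical helper: check a label key against the syntax rules above."""
    parts = key.split("/")
    if len(parts) == 1:
        # No prefix, only a name segment.
        return NAME_RE.fullmatch(parts[0]) is not None
    if len(parts) == 2:
        prefix, name = parts
        return (
            len(prefix) <= 253
            and PREFIX_RE.fullmatch(prefix) is not None
            and NAME_RE.fullmatch(name) is not None
        )
    # More than one '/' can never be valid.
    return False


if __name__ == "__main__":
    # Keys taken from the logged labels in this issue.
    for key in (
        "cluster.proto.actor/kind-prototopic",
        "cluster.proto.actor/kind-alibaba.hotel.provider.contracts/AvailableOrchestratorGrain",
    ):
        print(key, "->", "valid" if is_valid_label_key(key) else "invalid")
```

Run against the labels from the log, this accepts the prototopic key and rejects the namespaced grain key because of its second slash.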

The problem might have been introduced by this PR. I assume alibaba.hotel.provider.contracts is the package name in your proto file.

@mhelleborg can you take a look at this?

@MoienTajik can you try an earlier version?

marcinbudny avatar Sep 11 '22 18:09 marcinbudny

Yes, you're right. It's the package name in our proto contracts.

And here are the generated Cluster Kinds:

AvailableOrchestratorGrainActor.Kind => "alibaba.hotel.provider.contracts/AvailableOrchestratorGrain";

ProviderGrainActor.Kind => "alibaba.hotel.provider.contracts/ProviderGrain";

We didn't change the cluster kinds while upgrading the NuGet packages, so this could be a code generation issue, as mentioned.

I've tested v1.0.0-rc2.31 and everything is fine with this version.

MoienTajik avatar Sep 12 '22 06:09 MoienTajik

The dilemma is where to fix the issue. The root cause is that the namespaced codegen grains use the "name.space/GrainName" format for their kind. This does not work with the current implementation of the K8s provider, because the kind is embedded in a label key that already contains a '/', and a second '/' is not supported.

The simplest fix would be to change the grain kind to use "name.space.GrainName". But this also showcases the limitations of the current label-based provider, so perhaps we should look into whether there is a better way to expose the cluster kinds there.
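As a rough sketch of what that rename would mean for the label keys, the snippet below flattens the kind and measures the resulting name segment. It assumes the key is built as "cluster.proto.actor/kind-" plus the kind, as the logged labels suggest, and that Kubernetes limits the name segment to 63 characters. The helper names are hypothetical and this is not the provider's actual code.

```python
MAX_NAME_LENGTH = 63  # Kubernetes limit on the name segment of a label key
LABEL_PREFIX = "cluster.proto.actor/kind-"  # assumed from the logged labels


def flatten_kind(kind: str) -> str:
    """Replace the '/' between namespace and grain name with '.'."""
    return kind.replace("/", ".")


def kind_label_key(kind: str) -> str:
    """Build the label key for a cluster kind from its flattened name."""
    return LABEL_PREFIX + flatten_kind(kind)


def name_segment_length(label_key: str) -> int:
    """Length of the part after the '/' that ends the label prefix."""
    return len(label_key.split("/", 1)[1])


if __name__ == "__main__":
    # Kinds quoted earlier in this thread.
    for kind in (
        "alibaba.hotel.provider.contracts/ProviderGrain",
        "alibaba.hotel.provider.contracts/AvailableOrchestratorGrain",
    ):
        key = kind_label_key(kind)
        length = name_segment_length(key)
        status = "ok" if length <= MAX_NAME_LENGTH else "too long"
        print(f"{key} ({length} chars in name segment): {status}")
```

With the two kinds from this thread, the flattened ProviderGrain key has a 51-character name segment, while the AvailableOrchestratorGrain key comes out at 64 characters, one over the limit. So under these assumptions the rename alone might not be enough for longer kind names, which fits the point about the limits of the label-based approach.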

mhelleborg avatar Sep 14 '22 08:09 mhelleborg