Orleans.Clustering.Kubernetes

Deleting stale silos that are still marked active after their pod has been removed

Open tomodutch opened this issue 3 years ago • 6 comments

After a pod exits prematurely, it leaves behind a silo entry with the status "Active". When the pod restarts, it scans all silos from etcd and attempts to connect to its predecessor. I've noticed this behaviour several times when running Orleans 3 in production in a single-silo cluster.

tomodutch avatar Jan 17 '21 05:01 tomodutch

The Orleans.Hosting.Kubernetes package does this (see https://dotnet.github.io/orleans/docs/deployment/kubernetes.html). It's still in "beta", but if you can try it and provide feedback, that would be helpful.

ReubenBond avatar Jan 17 '21 05:01 ReubenBond

Thanks! I will try it out in the next few days 😄

tomodutch avatar Jan 17 '21 09:01 tomodutch

@ReubenBond the package you directed @TFarla to doesn't do clustering. Could you please tell me what the benefit of using it is?

turowicz avatar Jan 26 '21 10:01 turowicz

Or should it be used in tandem?

turowicz avatar Jan 26 '21 10:01 turowicz

@turowicz the hosting package does a few things:

  • Configure silos based on the pod's environment (IP, name, ClusterId/ServiceId)
  • Monitor Kubernetes for changes in active pods, so that silos belonging to deleted pods can be removed immediately instead of waiting for health probes to fail. This doesn't remove the need for health probes altogether, but it provides a shortcut when an administrative action has been taken.
  • Kill pods in Kubernetes when they are declared dead by the cluster
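
To make the second point concrete, here is a minimal sketch of the reconciliation idea: compare the membership entries marked "Active" against the pods that currently exist, and declare the leftovers dead. This is only an illustration in Python, not the package's actual implementation. `SiloEntry`, `find_stale_silos`, and `mark_dead` are hypothetical names, and the sketch assumes each silo entry can be matched to a unique pod name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiloEntry:
    """Hypothetical membership-table row; the field names are illustrative."""
    silo_name: str  # assumed to equal the name of the pod hosting the silo
    status: str     # e.g. "Active" or "Dead"


def find_stale_silos(entries, live_pod_names):
    """Return the entries marked Active whose pod no longer exists."""
    live = set(live_pod_names)
    return [e for e in entries if e.status == "Active" and e.silo_name not in live]


def mark_dead(entries, stale):
    """Return a new membership list with the stale entries declared Dead."""
    stale_names = {e.silo_name for e in stale}
    return [
        SiloEntry(e.silo_name, "Dead") if e.silo_name in stale_names else e
        for e in entries
    ]
```

In the scenario from the original report, the predecessor's entry would show up in `find_stale_silos` as soon as its pod disappears from the live list, so a restarted silo would not try to connect to it. A real implementation would get the live pod names by watching the Kubernetes API rather than taking them as an argument.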

It doesn't replace the need for a clustering provider. The sample here shows how to use it and uses Redis for clustering (though something else could be used, of course): https://github.com/ReubenBond/hanbaobao-web

ReubenBond avatar Jan 26 '21 13:01 ReubenBond

@ReubenBond thanks! I'm testing Orleans.Clustering.Kubernetes with Orleans.Hosting.Kubernetes now. It seems like this combo will work well.

turowicz avatar Jan 26 '21 13:01 turowicz