Clarification of software architecture, namely pluginDaemon

Open BorisPolonsky opened this issue 11 months ago • 7 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

Provide a description of requested docs changes

We are the developers of dify-helm. We are migrating pluginDaemon to Kubernetes, but the current documentation gives no indication of its software design. We need clarification on the following topics:

Is this component stateless (e.g. does it support multi-replica deployment)? We cannot tell from the current version of the documentation.

We would like to know whether it's appropriate to declare this component as a Deployment: https://github.com/langgenius/dify/blob/28edbbac0b4f3da7403de1f00b6cf0a8e4c0e24b/docker/docker-compose-template.yaml#L141-L161

BorisPolonsky avatar Jan 20 '25 08:01 BorisPolonsky

I've noticed the following log line from plugin-daemon:

2025/01/25 16:14:09 cluster_lifetime.go:113: [INFO]current node has become the master of the cluster

Does this mean that the roles of multiple plugin-daemon containers are not identical, and that the component should be defined as a StatefulSet instead of a Deployment? @takatost

BorisPolonsky avatar Jan 25 '25 16:01 BorisPolonsky

I’m just a contributor and not a member of the Dify team, but as far as I understand, all the state of each plugin daemon container is stored in Redis. The election of the master node is also handled automatically through Redis. I haven't tested it, but I think multi-replica deployment is probably supported.

This is the repository of the plugin daemon: https://github.com/langgenius/dify-plugin-daemon and here is some relevant code, with comments, that addresses your question: https://github.com/langgenius/dify-plugin-daemon/blob/fc3cba5f2c9e1dc71edc8d6e5eed5814bd149f0f/internal/cluster/cluster_lifetime.go#L11-L40
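To make the idea of a Redis-based election concrete, here is a minimal sketch of one common pattern: a "set if absent, with expiry" lock that whichever node grabs first holds as master. This is an assumption for illustration only and is not the code in cluster_lifetime.go. `FakeRedis`, `try_become_master`, the key name `cluster:master`, and the 10-second TTL are all hypothetical, and the store is an in-memory stand-in so the example runs without a Redis server.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FakeRedis:
    """In-memory stand-in for the few Redis-like operations this sketch needs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry time)

    def get(self, key: str) -> Optional[str]:
        """Return the value if the key exists and has not expired."""
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)
            return None
        return entry[0]

    def set_nx_ex(self, key: str, value: str, ttl: float) -> bool:
        """Set the key only if it is absent, with an expiry (like SET ... NX EX)."""
        if self.get(key) is not None:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def renew_if_owner(self, key: str, value: str, ttl: float) -> bool:
        """Extend the expiry only if the caller still holds the key."""
        if self.get(key) != value:
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True


MASTER_KEY = "cluster:master"  # hypothetical key name
MASTER_TTL = 10.0              # hypothetical lock lifetime in seconds


def try_become_master(store: FakeRedis, node_id: str) -> bool:
    """Return True if this node holds the master lock after the call."""
    if store.set_nx_ex(MASTER_KEY, node_id, MASTER_TTL):
        return True  # lock was free, this node is now master
    return store.renew_if_owner(MASTER_KEY, node_id, MASTER_TTL)
```

In a scheme like this every replica runs the same code and calls `try_become_master` periodically, so no replica needs a fixed identity: if the current master disappears, its lock expires and any other replica can take over.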

kurokobo avatar Jan 25 '25 17:01 kurokobo

Thanks for your hints. It appears to me that both the regular nodes and the master of pluginDaemon identify themselves by IPv4 address. I'm still not sure whether a stable network identifier (e.g. a stable network IP for each pod throughout the whole lifecycle of a StatefulSet) or ordered deployment and termination is mandatory in this scenario.

BorisPolonsky avatar Jan 26 '25 02:01 BorisPolonsky

> I'm still not sure if a stable network identifier (e.g. a stable network IP for each pod throughout the whole lifecycle of StatefulSets) or ordered deployment and termination is mandatory in this scenario.

Hmm, I think it's unnecessary to have such delicate control over the state of the pod since it simply registers its own IP address to Redis every 5 seconds, and it becomes invalid if there are no updates for 10 seconds. Quite simple.
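The register-and-expire behaviour described above can be sketched roughly as follows. This is only an illustration of the timing (a heartbeat every 5 seconds, entries treated as invalid after 10 seconds without an update), not the daemon's actual implementation. `NodeRegistry` and its methods are hypothetical names, and a plain dictionary stands in for Redis.

```python
import time
from typing import Callable, Dict, List

HEARTBEAT_INTERVAL = 5.0  # seconds between registrations, per the comment above
NODE_TIMEOUT = 10.0       # seconds without an update before a node is invalid


class NodeRegistry:
    """Hypothetical in-memory stand-in for the node list kept in Redis."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_seen: Dict[str, float] = {}  # node IP -> time of last heartbeat

    def heartbeat(self, ip: str) -> None:
        """Record that the node with this IP is alive right now."""
        self._last_seen[ip] = self._clock()

    def alive_nodes(self) -> List[str]:
        """Return the IPs whose last heartbeat is newer than NODE_TIMEOUT."""
        now = self._clock()
        return sorted(
            ip for ip, seen in self._last_seen.items() if now - seen < NODE_TIMEOUT
        )
```

Under this model a pod that restarts with a new IP simply appears as a new entry, and the old entry drops out once it stops being refreshed, which is why a stable per-pod address would not be needed.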

I believe that the Dify Team also needs to provide this daemon in a way that can withstand high loads in a SaaS model with HA (perhaps on Kubernetes), so I think they are steering away from architectures that require delicate control by design.

kurokobo avatar Jan 26 '25 04:01 kurokobo

Declared as Deployment. https://github.com/BorisPolonsky/dify-helm/pull/123
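For readers unfamiliar with the distinction, a Deployment manifest for a multi-replica component might look roughly like the sketch below, built as a Python dictionary and printed as JSON (which Kubernetes accepts in place of YAML). This is not the content of the linked PR. The function name, the `plugin-daemon` resource name and labels, and the placeholder image are all assumptions for illustration.

```python
import json
from typing import Any, Dict


def plugin_daemon_deployment(replicas: int, image: str) -> Dict[str, Any]:
    """Build a minimal apps/v1 Deployment manifest (hypothetical names throughout)."""
    labels = {"app": "plugin-daemon"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",  # interchangeable pods, no stable per-pod identity
        "metadata": {"name": "plugin-daemon", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{"name": "plugin-daemon", "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # The image reference is a placeholder, not a real published tag.
    manifest = plugin_daemon_deployment(2, "example.invalid/plugin-daemon:placeholder")
    print(json.dumps(manifest, indent=2))
```

Unlike a StatefulSet, nothing here gives a pod a stable name or ordered startup and shutdown, which matches the conclusion reached in the discussion above.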

BorisPolonsky avatar Feb 17 '25 10:02 BorisPolonsky

I declared this component as a Deployment in the yaml file. https://github.com/Winson-030/dify-kubernetes/commit/8c4b6a04340528a520df1f075c96e2af8dd9b9e3

Winson-030 avatar Mar 01 '25 19:03 Winson-030

Hi, @BorisPolonsky. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • You inquired about the pluginDaemon component's architecture for Kubernetes migration.
  • Kurokobo indicated that the component's state is managed via Redis, suggesting multi-replica deployment support.
  • You confirmed that the component can be supported as a Deployment.
  • Winson-030 updated the Kubernetes configuration to reflect this.

Next Steps:

  • Please confirm if this issue is still relevant to the latest version of the Dify repository. If so, you can keep the discussion open by commenting on the issue.
  • Otherwise, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

dosubot[bot] avatar Apr 01 '25 16:04 dosubot[bot]