dolphinscheduler
[DSIP-9][registry] add raft registration plugin
Purpose of the pull request
This is a project for OSPP 2024. This PR closes #10874.
Brief change log
- Added a new module: /dolphinscheduler-registry/dolphinscheduler-plugins/dolphinscheduler-registry-raft
- Added raft's dependency in dolphinscheduler-bom/pom.xml
- Added dolphinscheduler-registry-raft dependency in dolphinscheduler-registry/dolphinscheduler-registry-all/pom.xml
Verify this pull request
Unit Test Results:
Pseudo-Cluster Deployment Status:
Cluster Test
Test conditions: 2 master nodes, 1 worker node
OS: mac
Steps: Start node1, node2, and node3 in sequence
- [x] Be able to normally elect a Leader
- [x] When the Leader of the cluster crashes, be able to normally elect a new Leader
- [x] The original crashed Leader re-joins the cluster and is automatically downgraded to a Follower
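The third check matches a general Raft rule: a node that receives a message carrying a higher term than its own adopts that term and becomes a Follower. The following is a minimal sketch of that rule only, not the plugin's code; the `Node` class, its fields, and `on_message` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified Raft node state (illustration only).
    name: str
    term: int = 0
    role: str = "follower"  # "follower", "candidate", or "leader"

    def on_message(self, sender_term: int) -> None:
        """Apply Raft's term rule when a message from a peer arrives."""
        if sender_term > self.term:
            # A higher term means a newer election has taken place:
            # adopt that term and step down, whatever the current role is.
            self.term = sender_term
            self.role = "follower"
        # Messages with an equal or lower term do not change the role here.


# node1 led in term 1. While it was away, node2 won the election for term 2.
old_leader = Node("node1", term=1, role="leader")

# The first heartbeat from the new Leader carries term 2.
old_leader.on_message(sender_term=2)
print(old_leader.role, old_leader.term)
```

Whether the returning node restarts from scratch or resumes with stale Leader state, the first message with a higher term leaves it a Follower in the new term.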
Thanks for opening this pull request! Please check out our contributing guidelines. (https://github.com/apache/dolphinscheduler/blob/dev/docs/docs/en/contribute/join/pull-request.md)
If there is no master, can this work? We should elect the leader from the whole cluster rather than only from the masters. I didn't see any design related to this PR; you should provide the new design doc for it.
Hi, this is an OSPP project. My mentor is @zhuxt2015. The design plan for the raft plugin has been discussed; the design link is "Add new registry plugin based on raft". According to the discussed plan, the masters will form a raft cluster, and the leader will only be elected among the masters, just like in ZK. With three master nodes, at least two must be alive to provide service normally.
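The two-of-three figure follows from Raft's majority rule: a cluster of n voting nodes needs floor(n/2) + 1 of them alive to elect a leader and commit writes. A minimal arithmetic sketch; the function names are illustrative and are not part of this PR.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a Raft majority (quorum)."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can be down while a majority is still alive."""
    return cluster_size - majority(cluster_size)


# Print cluster size, quorum size, and tolerated failures for a few sizes.
for n in (1, 2, 3, 5):
    print(n, majority(n), tolerated_failures(n))
```

For three masters this gives a quorum of two and one tolerated failure, which is the case described above.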
Please add MySQL and PostgreSQL cluster tests with the raft registry to CI. You can refer to https://github.com/apache/dolphinscheduler/tree/dev/.github/workflows/cluster-test
If the cluster does not have a master, then the cluster will not work? This is unacceptable.
Just like the zk plugin: if the zk cluster goes down, the ds cluster will also go down.
If so, a big -1 for this plugin. It doesn't help with the SLA, and I cannot find any reason why we wouldn't use the jdbc registry plugin instead of this plugin.
+1. High availability and stability are the first things we should ensure. No new feature should come at the cost of lowering the standard for either.
This plugin does not reduce the stability and availability of the ds cluster, and will maintain the same impact on the cluster as other plugins.
If the cluster does not have a master, then the cluster will not work.
This differs from the zk and jdbc registries, since neither of them causes the worker server and alert server to stop operating normally when the master server is down.
Raft plugin won't do that either.
Based on this PR, this problem exists. Why do you say this?
I'm sorry that I didn't explain it clearly. I mean we will fix this problem.
If the master cluster fails, the worker still uses the original logic. Why is there a problem, and what is it? Looking at the big picture, we also want to get rid of ZooKeeper.
You should get more background on this. The original design of this PR stores data only on the masters; if all masters crash, then every server that relies on the registry will crash. This is why, if the cluster does not have a master, the whole cluster will not work.
This pull request has been automatically marked as stale because it has not had recent activity for 120 days. It will be closed in 7 days if no further activity occurs.
This pull request has been closed because it has not had recent activity. You can reopen it if you want to continue your work, and anyone who is interested is encouraged to continue work on this pull request.