Dead Node
Hi,
I am using this library in my application and I have run into an issue. When I shut down a node and turn it on again, the other nodes don't know about it. Example: I have 2 nodes, 1 and 2. First I start both nodes. Then I turn off node 2 and turn it on again, but node 2 doesn't know node 1 is UP, and node 1 doesn't know node 2 is UP either. Please help me resolve this issue. Thanks for your support.
Questions
- Are you sure your system clocks are in sync?
- How long was the node down for?
- Can you set the logging on both servers to debug and record any relevant output?
- What is your configuration? What are the initial contact points?
We have a unit test which does this with 5 nodes, so it would be interesting to understand whether the same logic works with two nodes.
Currently I am applying gossip to a Spring Boot application. Each node is an instance of Spring Boot. Please give me some advice on applying gossip to a Spring Boot application.
This is my configuration for Application 1 with port 8081.
[{
  "cluster": "",
  "id": "",
  "port": 8081,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "", "id": "", "host": "192.168.1.90", "port": 8084},
    {"cluster": "", "id": "", "host": "192.168.1.90", "port": 8083},
    {"cluster": "", "id": "", "host": "192.168.1.90", "port": 8082}
  ]
}]
This is my configuration for Application 2 with port 8082.
[{
  "cluster": "",
  "id": "",
  "port": 8082,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "", "id": "", "host": "192.168.1.90", "port": 8084},
    {"cluster": "", "id": "", "host": "192.168.1.90", "port": 8083},
    {"cluster": "", "id": "", "host": "192.168.1.90", "port": 8081}
  ]
}]
Let me know if I'm wrong.
Each node needs an id. In your case you can generate a string or a UUID that will persist between restarts.
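One way to get an id that survives restarts is to write a UUID to a small file the first time the node starts and read it back on later starts. The sketch below assumes that approach; the class name NodeIdStore, the method loadOrCreate, and the file location are hypothetical and not part of the gossip library.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper, not part of the gossip library.
class NodeIdStore {

    // Returns the id stored in idFile, or writes a fresh random UUID there if none is stored yet.
    static String loadOrCreate(Path idFile) throws IOException {
        if (Files.exists(idFile)) {
            String stored = new String(Files.readAllBytes(idFile), StandardCharsets.UTF_8).trim();
            if (!stored.isEmpty()) {
                return stored;
            }
        }
        String fresh = UUID.randomUUID().toString();
        Files.write(idFile, fresh.getBytes(StandardCharsets.UTF_8));
        return fresh;
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for a real, per-node location (an assumption for this demo).
        Path idFile = Files.createTempFile("gossip-node", ".id");
        String first = loadOrCreate(idFile);   // file is empty, so a new id is written
        String second = loadOrCreate(idFile);  // simulated restart: the stored id is read back
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

The returned string could then be used as the "id" value for the node, with each node pointing at its own id file.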
My application using
Maybe it doesn't generate an id when this constructor is used:
public GossipService(StartupSettings startupSettings)
    throws InterruptedException, UnknownHostException {
  this(InetAddress.getLocalHost().getHostAddress(), startupSettings.getPort(), "",
      startupSettings.getGossipMembers(), startupSettings.getGossipSettings(), null);
}
Is that right? Version 0.0.3 is different from the latest code on GitHub. Do you have an updated version on Maven?
Yes. This looks like a bug in that version. The id was not required in the original versions, but now it is. Can you please try the trunk version? I will release the current trunk later today.
I reviewed the code. I think the problem is in the StartupSettings class.
RemoteGossipMember member = new RemoteGossipMember(memberJSON.getString("cluster"), memberJSON.getString("host"), memberJSON.getInt("port"), "");
An ID also needs to be generated for RemoteGossipMember.
I have now switched to the latest code, but the error is still there. Please help me resolve this issue. I hope you can make a release today.
Thanks for your support on this issue.
RemoteGossipMember member = new RemoteGossipMember(memberJSON.getString("cluster"), memberJSON.getString("host"), memberJSON.getInt("port"), "");
This code is OK. We would not know the remote id until we connect to that host.
Can you give a stripped-down example of your Spring Boot application?
When I use the latest code, an exception occurs. I don't know why.
Exception in thread "pool-6-thread-1" java.lang.NullPointerException
    at com.google.code.gossip.manager.PassiveGossipThread.run(PassiveGossipThread.java:102)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
This is my configuration for gossip:
[{
  "cluster": "1",
  "id": "1",
  "port": 8081,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "1", "id": "4", "host": "192.168.1.90", "port": 8084, "heartbeat": 0},
    {"cluster": "1", "id": "3", "host": "192.168.1.90", "port": 8083, "heartbeat": 0},
    {"cluster": "1", "id": "2", "host": "192.168.1.90", "port": 8082, "heartbeat": 0}
  ]
}]
This is my sample code. Please help me review it. https://github.com/challengeteamttdh/springbootgossip Thanks. My application has a scheduled task that runs every 20s. It prints a number based on the number of live nodes and the position of this node among them. When a node goes DOWN or comes UP, the other nodes need to know and update the number-printing rule for the system. Thank you very much for your support.
if (memberJSONObject.length() == 5
&& cluster.equals(memberJSONObject.get(GossipMember.JSON_CLUSTER))) {
This is a new piece of code. I will look at this.
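To make the quoted condition easier to reason about, here is a minimal sketch of the same check written against a plain Map instead of the JSON object. The class ClusterFilter, the method accepts, and the key name "cluster" are assumptions for illustration; only the shape of the condition (five fields, matching cluster name) comes from the fragment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the quoted condition; it does not use the library's classes.
class ClusterFilter {

    // Accepts an entry only if it has exactly five fields and its cluster name
    // equals the local one. localCluster must be non-null, as in the quoted condition.
    static boolean accepts(String localCluster, Map<String, Object> member) {
        return member.size() == 5 && localCluster.equals(member.get("cluster"));
    }

    public static void main(String[] args) {
        Map<String, Object> member = new HashMap<String, Object>();
        member.put("cluster", "1");
        member.put("id", "2");
        member.put("host", "192.168.1.90");
        member.put("port", 8082);
        member.put("heartbeat", 0);

        System.out.println(accepts("1", member)); // prints "true": same cluster name
        System.out.println(accepts("2", member)); // prints "false": different cluster name
    }
}
```

Under this reading, an entry is ignored whenever the local cluster name and the entry's cluster name differ, or when the entry does not carry all five fields.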
I found the bug you mentioned. The startup settings code was not setting the cluster name. I am looking at the unit test there because it is suspect. Sorry for the problems. Really cool app; I want to take a deeper look at it. Please try the latest trunk again. Sorry for the issues. The cluster name is a new bit and I do not use the StartupSettings code path!
I updated the code and the exception no longer occurs. But when I start 2 instances of Spring Boot on ports 8081 and 8082, the corresponding gossip.conf files are:
- 8081
[{
  "cluster": "1",
  "id": "1",
  "port": 8081,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "1", "id": "4", "host": "192.168.1.90", "port": 8084, "heartbeat": 0},
    {"cluster": "1", "id": "3", "host": "192.168.1.90", "port": 8083, "heartbeat": 0},
    {"cluster": "1", "id": "2", "host": "192.168.1.90", "port": 8082, "heartbeat": 0}
  ]
}]
- 8082
[{
  "cluster": "2",
  "id": "2",
  "port": 8082,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "2", "id": "4", "host": "192.168.1.90", "port": 8084, "heartbeat": 0},
    {"cluster": "2", "id": "3", "host": "192.168.1.90", "port": 8083, "heartbeat": 0},
    {"cluster": "2", "id": "1", "host": "192.168.1.90", "port": 8081, "heartbeat": 0}
  ]
}]
The nodes still don't know which members are UP. First, I set port 8081 in application.properties, use the gossip.conf for 8081, and start Spring Boot. Second, I set port 8082 in application.properties, use the gossip.conf for 8081, and start Spring Boot. However, the nodes do not know whether the others are UP or DOWN. Please take a look at this. I really like your gossip code and want to integrate it into my application. Please spend some time helping me resolve this issue.
Great. Keep in mind that getMemberList does not include yourself, so in a two-node cluster each node is itself plus a getMemberList() of size 1.
What do you mean? Is my implementation incorrect? What do I need to do to fix this?
The only thing I am saying is that the member list does not include the local member. The local member is assumed.
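Since the member list leaves out the local node, any count or ordering of live nodes has to add the local node back in. The sketch below illustrates that with plain id strings; ClusterView, liveNodeCount, and positionOf are hypothetical names, and the list of remote ids stands in for whatever the library's member list returns.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; it works on plain id strings rather than the library's member objects.
class ClusterView {

    // Total live nodes: the remote members plus the local node, which the member list omits.
    static int liveNodeCount(List<String> remoteLiveIds) {
        return remoteLiveIds.size() + 1;
    }

    // Zero-based position of the local node among all live nodes, ordered by id.
    static int positionOf(String localId, List<String> remoteLiveIds) {
        List<String> all = new ArrayList<String>(remoteLiveIds);
        all.add(localId);
        Collections.sort(all);
        return all.indexOf(localId);
    }

    public static void main(String[] args) {
        List<String> remote = new ArrayList<String>();
        remote.add("1"); // node "2" sees only node "1" in its member list
        System.out.println(liveNodeCount(remote));   // prints "2"
        System.out.println(positionOf("2", remote)); // prints "1"
    }
}
```

A scheduled task could call both methods on each run, so the printed number follows nodes going DOWN or coming UP.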
Do you have any ideas for my application? I don't know how to apply gossip to it. How does one Spring Boot instance learn about another Spring Boot instance?
How do you start two copies of the application?
mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8081'
mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8082'
When I do this they take the same config.
You need to change the port in gossip.conf to match the port of the Spring Boot instance.
This is gossip.conf for port 8081:
[{
  "cluster": "1",
  "id": "1",
  "port": 8081,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "1", "id": "4", "host": "192.168.1.90", "port": 8084, "heartbeat": 0},
    {"cluster": "1", "id": "3", "host": "192.168.1.90", "port": 8083, "heartbeat": 0},
    {"cluster": "1", "id": "2", "host": "192.168.1.90", "port": 8082, "heartbeat": 0}
  ]
}]
Then run mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8081'
This is gossip.conf for port 8082:
[{
  "cluster": "2",
  "id": "2",
  "port": 8082,
  "gossip_interval": 1000,
  "cleanup_interval": 10000,
  "members": [
    {"cluster": "2", "id": "4", "host": "192.168.1.90", "port": 8084, "heartbeat": 0},
    {"cluster": "2", "id": "3", "host": "192.168.1.90", "port": 8083, "heartbeat": 0},
    {"cluster": "2", "id": "1", "host": "192.168.1.90", "port": 8081, "heartbeat": 0}
  ]
}]
Then run mvn spring-boot:run -Drun.jvmArguments='-Dserver.port=8082'
Is it necessary to run both with the same gossip.conf?