go-zero

rpc not discovered after ETCD down and restarted

Open Lewis-liuwei opened this issue 2 years ago • 4 comments

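etcd probably crashed because of heavy service load. After etcd was restarted, the rpc services that were already running did not register themselves with etcd again automatically. Manually restarting all the rpc services fixed it.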

Lewis-liuwei avatar May 23 '22 01:05 Lewis-liuwei

I tried many times but could not reproduce it.

What are the steps to reproduce it?

kevwan avatar Jul 22 '22 12:07 kevwan

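We ran into this as well. Steps to reproduce:

  1. Start a service and let it register with etcd normally. Check the etcd node with etcdctl get xxx_service --prefix=true; the registration is visible.
  2. Cut the network between the service and etcd. The registered data is still there at first, but the node disappears once the lease expires.
  3. Restore the network. The service does not register again, so consumers can never find the node. The only way to get it registered again is to restart the service manually. @kevwan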

laowenyi avatar Oct 14 '22 06:10 laowenyi

Bot detected that the issue body's language is not English and translated it automatically.

We also have this situation here, and the steps to reproduce are as follows:

  1. Start a service and register it with etcd normally. Inspect the etcd node with etcdctl get xxx_service --prefix=true; you can see that the registration succeeded.
  2. Cut off the network between the service and etcd. The registered data is still there at first, but the node is gone after the lease expires.
  3. Restore the network. The service is not registered again at this point, so the consumer never finds the registered node. The only option is to restart the service manually so that it registers again. @kevwan
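
For illustration, here is a minimal sketch of the behavior these steps expect: a publisher that registers again once it learns its lease has expired. It is not go-zero's implementation and does not use the etcd client. Registry, fakeRegistry, Publisher, Tick, the key xxx_service/1 and the address 10.0.0.2:8080 are all hypothetical names made up for this example, and the outage is simulated in memory. With the real etcd client, the equivalent signal would typically be the keep-alive channel closing or a "lease not found" error.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors standing in for what a real registry client might return.
var (
	ErrUnreachable   = errors.New("registry unreachable")
	ErrLeaseNotFound = errors.New("lease not found")
)

// Registry is a hypothetical stand-in for the operations a publisher needs.
type Registry interface {
	Register(key, value string) (leaseID int64, err error)
	KeepAlive(leaseID int64) error
}

// fakeRegistry is an in-memory simulation, used here instead of a real etcd.
type fakeRegistry struct {
	nextLease int64
	leases    map[int64]string // lease ID -> key attached to that lease
	keys      map[string]string
	reachable bool
}

func newFakeRegistry() *fakeRegistry {
	return &fakeRegistry{leases: map[int64]string{}, keys: map[string]string{}, reachable: true}
}

func (r *fakeRegistry) Register(key, value string) (int64, error) {
	if !r.reachable {
		return 0, ErrUnreachable
	}
	r.nextLease++
	r.leases[r.nextLease] = key
	r.keys[key] = value
	return r.nextLease, nil
}

func (r *fakeRegistry) KeepAlive(leaseID int64) error {
	if !r.reachable {
		return ErrUnreachable
	}
	if _, ok := r.leases[leaseID]; !ok {
		return ErrLeaseNotFound
	}
	return nil
}

// expireAll simulates every lease timing out while the network is cut.
func (r *fakeRegistry) expireAll() {
	for id, key := range r.leases {
		delete(r.keys, key)
		delete(r.leases, id)
	}
}

func (r *fakeRegistry) has(key string) bool {
	_, ok := r.keys[key]
	return ok
}

// Publisher keeps one key registered. Tick is meant to be called periodically.
type Publisher struct {
	reg        Registry
	key, value string
	leaseID    int64 // 0 means "not registered"
}

// Tick refreshes the lease, or registers again if the lease no longer exists.
func (p *Publisher) Tick() error {
	if p.leaseID != 0 {
		err := p.reg.KeepAlive(p.leaseID)
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrLeaseNotFound) {
			return err // transient failure: keep the lease ID and retry on the next tick
		}
		p.leaseID = 0 // the lease expired on the server, so the key is gone
	}
	id, err := p.reg.Register(p.key, p.value)
	if err != nil {
		return err
	}
	p.leaseID = id
	return nil
}

func main() {
	reg := newFakeRegistry()
	p := &Publisher{reg: reg, key: "xxx_service/1", value: "10.0.0.2:8080"}

	_ = p.Tick() // step 1: initial registration
	fmt.Println("registered:", reg.has(p.key))

	reg.reachable = false // step 2: network cut, lease expires
	reg.expireAll()
	_ = p.Tick()
	fmt.Println("during outage:", reg.has(p.key))

	reg.reachable = true // step 3: network restored
	_ = p.Tick()
	fmt.Println("after recovery:", reg.has(p.key))
}
```

The point of the sketch is the branch in Tick: a transient error keeps the old lease ID and retries, while "lease not found" triggers a fresh registration. The reports in this thread describe a service that stays unregistered after step 3, which is the case that branch is meant to cover.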

Issues-translate-bot avatar Oct 14 '22 06:10 Issues-translate-bot

I have the same problem. Steps to reproduce:

  1. I have host A and host B.
  2. Host A runs etcd and the rpc service.
  3. Host B runs the same rpc service.
  4. Restart etcd.
  5. Run etcdctl get --prefix "". The rpc service on host A has registered, but the one on host B has not.

geata avatar Aug 02 '23 11:08 geata