Question about ZMQ_HEARTBEAT_* options
Issue description
In my case, the subscriber may wait for the publisher for a long time (a couple of minutes, or even 30 minutes), and during this period the publisher may be offline. When the publisher eventually started publishing messages, I found that the subscriber was still "hung" or dead: it received nothing.
I keep the following system values at their defaults:
tcp_keepalive_time 7200
tcp_keepalive_intvl 75
tcp_keepalive_probes 9
My previous solution was to use a zmq_poll timeout and then reconnect the socket, but I find this approach inelegant.
For certain reasons my subscriber must use zmq_bind rather than zmq_connect. In this case I found that after a while the port cannot be bound any more, and the error is: "[ZMQ Subscriber]: Bind failed reason: Address already in use"
Q1:
Why can the socket no longer bind, even though I have already set LINGER and sleep before reconnecting?
Is this a problem in the underlying system?
Pseudo code:
int reconnect() {
    // close the old socket before creating a new one
    zmq_close(m_receiver);
    // create the socket again
    int linger = 0;
    m_receiver = zmq_socket(m_context, ZMQ_SUB);
    zmq_setsockopt(m_receiver, ZMQ_LINGER, &linger, sizeof(linger)); // important!!!
    if (zmq_bind(m_receiver, m_endpoint.c_str()) < 0) { // this is zmq_bind, not zmq_connect
        return -1;
    }
    return 1;
}
int subscribe() {
    ...
    int poll_flag = zmq_poll(items, 1, timeout);
    if (0 == poll_flag) { // timeout: reconnect the socket!
        reconnect();
        sleep(200ms); // pseudo code: wait 200 ms
    }
    ...
}
I have already read the issues https://github.com/zeromq/libzmq/issues/2763, https://github.com/ignitionrobotics/ros_ign/issues/42, https://github.com/zeromq/pyzmq/issues/1503
So I'm curious about the ZMQ_HEARTBEAT_* options.
Q2: If ZMQ_HEARTBEAT_TIMEOUT is set to, e.g., 5 s, is the connection closed after 5 s, or is it re-established automatically, meaning the ZMQ_HEARTBEAT_* options keep the socket alive all the time? If the connection is closed after 5 s, I need to reconnect the socket myself, right (i.e. close the socket first and then bind it again)?
Environment
libzmq version (commit hash if unreleased): 4.3
OS: Docker in Ubuntu 18.04 or RK3399 chip (Linaro)
Q3: What is the difference between the ZMQ TCP keepalive options and the heartbeat options? Is it just TCP layer versus ZMQ protocol layer?
E.g. if I use the following values, would the connection be reconnected automatically after 1 min?
int v1 = 1;
int v2 = 9;
int v3 = 60;
int v4 = 1;
zmq_setsockopt(socket, ZMQ_TCP_KEEPALIVE, &v1, sizeof(v1));
zmq_setsockopt(socket, ZMQ_TCP_KEEPALIVE_CNT, &v2, sizeof(v2));
zmq_setsockopt(socket, ZMQ_TCP_KEEPALIVE_IDLE, &v3, sizeof(v3));
zmq_setsockopt(socket, ZMQ_TCP_KEEPALIVE_INTVL, &v4, sizeof(v4));
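For reference, this is the arithmetic behind my "1 min" guess. It is a small sketch assuming the usual Linux keepalive semantics: the first probe is sent after the idle time, and the connection is dropped after cnt unanswered probes spaced intvl seconds apart. The function name keepalive_detection_seconds is hypothetical. Note that this only estimates when a dead peer would be detected; whether anything is reconnected afterwards is the open question.

```cpp
// Hypothetical helper: rough worst-case time (in seconds) before TCP
// keepalive declares a silent peer dead, assuming Linux semantics:
// wait idle_s, then send up to probe_cnt probes, intvl_s apart.
constexpr long keepalive_detection_seconds(long idle_s, long intvl_s,
                                           long probe_cnt) {
    return idle_s + intvl_s * probe_cnt;
}
```

With the values above (idle 60, interval 1, count 9) this gives 69 seconds, and with the system defaults (7200, 75, 9) it gives 7875 seconds, i.e. more than two hours.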
+1, very confused about these options.
Did you solve it? @EgalYue