libzmq
Occasional crash in `KERNELBASE.dll`
Issue description
I am currently using CPPZMQ version 4.8.1 and libzmq 4.3.4. The folks at CPPZMQ suggested this would be a better support channel.
If there is a better support channel, please let me know.
My application is running on Windows Server 2016. The latest Windows updates and drivers are installed. I have run System File Checker and it reports no issues.
My application occasionally crashes when attempting to read from the socket. The specific message is:
when calling `int zmq_msg_recv(zmq_msg_t *msg_, void *s_, int flags_)`, where `flags_` is `zmq::recv_flags::none`.
Call stack:
Last source code in the call stack I have (I highlighted the offending line with `>>>`):
ZMQ_NODISCARD
recv_result_t recv(message_t &msg, recv_flags flags = recv_flags::none)
{
>>> const int nbytes =
      zmq_msg_recv(msg.handle(), _handle, static_cast<int>(flags));
    if (nbytes >= 0) {
        assert(msg.size() == static_cast<size_t>(nbytes));
        return static_cast<size_t>(nbytes);
    }
    if (zmq_errno() == EAGAIN)
        return {};
    throw error_t();
}
void listen()
{
    while (true)
    {
        if (!zSocket)
        {
            return;
        }
        try
        {
            chat_message_t message;
>>>         if (!zSocket->recv(message.type, zmq::recv_flags::none))
            {
                send_queue();
                continue;
            }
            int more = zSocket->get(zmq::sockopt::rcvmore);
            if (more)
            {
                std::ignore = zSocket->recv(message.data);
                more = zSocket->get(zmq::sockopt::rcvmore);
                if (more)
                {
                    std::ignore = zSocket->recv(message.packet);
                }
            }
            parse(message);
        }
        catch (zmq::error_t& e)
        {
            // Context was terminated (ETERM = 156384765)
            // Exit loop
            if (!zSocket || e.num() == 156384765)
            {
                return;
            }
            ShowError("Message: %s\n", e.what());
            continue;
        }
    }
}
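For reference, the magic number in the `catch` block can be derived rather than hard-coded. The following is a minimal sketch, assuming the definitions in libzmq 4.3.x's `zmq.h`, where `ETERM` is `ZMQ_HAUSNUMERO + 53`. The names `kZmqHausnumero`, `kEterm`, and `is_context_terminated` are hypothetical and only mirror those macros so the snippet compiles without libzmq; real code would include `<zmq.h>` and compare `e.num()` against `ETERM` directly.

```cpp
// Mirrors the values from zmq.h (assumed from libzmq 4.3.x) so this
// sketch builds without the library. In real code, include <zmq.h>
// and use the ETERM macro instead of these hypothetical constants.
constexpr int kZmqHausnumero = 156384712;   // ZMQ_HAUSNUMERO
constexpr int kEterm = kZmqHausnumero + 53; // ETERM

// The value hard-coded in listen() is the same number.
static_assert(kEterm == 156384765, "ETERM should match the literal used in listen()");

// Hypothetical helper: true when an error number means the context
// was terminated and the receive loop should exit.
inline bool is_context_terminated(int error_number)
{
    return error_number == kEterm;
}
```

With something like this, the condition in the handler could read `if (!zSocket || is_context_terminated(e.num()))`, which keeps the intent visible without the literal.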
Only one thread in the process creates and interacts with the `zSocket`. However, there is a companion process (an entirely separate application) that also has its own `zSocket`.
I'm not sure what steps I can take to tackle this problem. Our server has 20-70 simultaneous users (only three sessions are permitted from the same IP address). ZMQ doesn't really provide any method for tracking the IP address of individual sockets, and it crashes while reading from the socket, so I can't check what about the message might have caused the crash. I'm not sure what other methods I can use to track or log this issue and try to find a pattern or a culprit, and we haven't noticed a pattern in user behavior that might be triggering it.
We get this crash on average 1-2 times a day. We were affected by this exact same crash about five months ago, but it seemed to resolve itself for about 3 months before it cropped up again early last month. It is highly intermittent, but frequent enough to be severely disruptive. The server chugs along quite happily for ~12 hours and then crashes. There doesn't seem to be any specific pattern to the timing of the crash.
Environment
- libzmq version: 4.3.4
- OS: Windows Server 2016 VM running on a Windows Server 2016 machine.
Minimal test code / Steps to reproduce the issue
The crash is highly intermittent. I have no specific steps for reproducing it, beyond running my live server with its real user base until the crash happens.
I am unable to reproduce the issue in a controlled environment.
Running the live application in debug mode is not a viable option.
What's the actual result? (include assertion message & call stack if applicable)
See above
What's the expected result?
Does not intermittently crash.
I'm looking for advice on how to better approach this issue.
Not sure this is still current, but you should consider building the library with debug symbols so the stack trace makes sense.