netmq
Mitigating a FaultException in Mechanism.Encode.
Under some circumstances, the Msg class throws NetMQ.FaultException in the Mechanism.Encode method. The details are in https://github.com/zeromq/netmq/issues/1094
closes #1094
Any chance you could have a look at this PR, @drewnoakes? I know I am not entitled to nag or demand from open-source maintainers, but it would be great to get some updates on the NetMQ core. Right now, I am maintaining my own fork and building from source rather than using the NuGet package to get some of these fixes included.
While this does look like it'd suppress the exception, I'm not sure this is an actual fix. It might just allow the process to continue having skipped data or in some other invalid state, whereafter debugging the problem would probably be harder.
From the linked issue:
I think on some circumstances the code in NetMQ.Core.Utils.YQueue can return a non-initialized message.
Indeed, looking at YQueue you can see it's not null-annotated, and there are a bunch of expectations around how the type is used. I wonder whether you'd be able to run a version of NetMQ with an implementation of YQueue that validates all nullness guarantees, to see if that's what's really going on. A process dump when the exception is thrown should provide insight into what state the application was in when the failure occurred.
https://stackoverflow.com/a/20238046/24874
A dump can be opened in Visual Studio to analyze the state of the process at the time of the crash. The instance of YQueue, YPipe and so on can be inspected to check if it's in a bad state.
I'm sympathetic to the problem here and want to find a solution that addresses the problem fully. It's just that the problem isn't well understood unfortunately.
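For reference, the Windows Error Reporting approach described in the linked answer is configured through the LocalDumps registry key. The following is a minimal sketch, not a verified setup for this service: the executable name MyService.exe and the folder C:\CrashDumps are hypothetical placeholders, and the commands are assumed to run from an elevated command prompt.

```bat
rem Hypothetical per-application key; replace MyService.exe with the real executable name.
set KEY=HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps\MyService.exe

rem Folder where dump files are written (placeholder path).
reg add "%KEY%" /v DumpFolder /t REG_EXPAND_SZ /d "C:\CrashDumps" /f

rem Maximum number of dumps to keep in the folder.
reg add "%KEY%" /v DumpCount /t REG_DWORD /d 10 /f

rem 2 = full dump, which includes the heap so that queue and pipe instances can be inspected.
reg add "%KEY%" /v DumpType /t REG_DWORD /d 2 /f
```

A full dump is chosen here because inspecting YQueue and YPipe instances requires heap contents; a mini dump (DumpType 1) would be smaller but might not contain them.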
Hello @drewnoakes, I am going to collect dump data when the program crashes. However, it runs as a Windows service. Will the Windows Error Reporting method from the link you posted (https://stackoverflow.com/questions/20237201/best-way-to-have-crash-dumps-generated-when-processes-crash/20238046#20238046) work in that case? The post says that Windows collects the data only after you click "Close the program" in the error message that reports the crash.
Hi @drewnoakes. The issue has stopped reproducing, and I was unable to get a process dump of the crash. I am closing this issue for now.