How to detect saga correlation errors
We've had some issues where we mess up the saga correlation and then spend too much time figuring out what went wrong, only for it to turn out to be a bad correlation ID. It seems that if a saga message has no matching saga, the message is simply discarded silently.
Is there a way to enable logging when this happens?
It is actually logged, as you can see here: https://github.com/rebus-org/Rebus/blob/master/Rebus/Sagas/LoadSagaDataStep.cs#L160
BUT it is logged with DEBUG level, so I can definitely understand why you could miss that.
Now that you know this, do you think it will help you another time? Or do you think that Rebus should somehow provide other means for detecting this particular situation?
I have never felt the need for better or different ways of detecting this situation, but I would be very interested in hearing about your thoughts.
Personally I expected Rebus to treat it similarly to when a matching message handler is not found by logging an error message, so it didn't occur to me to enable debug logs. Thinking about it, while most of our saga messages are expected to have correlating saga data, we also have some message types where it's ok not to have correlating saga data.
I'm thinking it might be nice if Rebus supported specifying how to handle missing saga data, maybe a setting per saga or per message type. But I'm also thinking message correlation is the kind of thing that, once it works, it just works - until we make that once-in-a-blue-moon configuration change that breaks it. In our recent case we changed both saga backend and message bus, and the new saga backend's serialization caused correlation to fail. In hindsight it's obvious we should have enabled debug logging.
Rebus 7.0.0-rc4 (which is on NuGet.org now) has a pluggable ICorrelationErrorHandler which makes it possible to customize Rebus' behavior.
Check out TestCorrelationBehavior.cs for an example of how to plug in your own - e.g. to treat a correlation miss as an actual error, simply throw an exception in your implementation.
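As a rough illustration, a handler that turns a correlation miss into an error might look something like the sketch below. The class name ThrowingCorrelationErrorHandler is made up for this example, and the method signature and namespaces are assumptions based on how the interface looks in recent Rebus versions, so please check them against ICorrelationErrorHandler and TestCorrelationBehavior.cs in the version you are using. It needs the Rebus NuGet package to compile.

```csharp
using System;
using System.Threading.Tasks;
using Rebus.Messages;
using Rebus.Pipeline.Receive;
using Rebus.Sagas;

// Hypothetical handler name. The member signature below is an assumption;
// verify it against ICorrelationErrorHandler in your Rebus version.
public class ThrowingCorrelationErrorHandler : ICorrelationErrorHandler
{
    public Task HandleCorrelationError(
        SagaDataCorrelationProperties correlationProperties,
        HandlerInvoker handlerInvoker,
        Message message)
    {
        // Throwing here makes a correlation miss fail like any other handler
        // error, instead of being skipped with only a DEBUG log line.
        var messageType = message.Body?.GetType().Name ?? "(unknown)";

        throw new InvalidOperationException(
            $"Could not find existing saga data for message of type {messageType}");
    }
}

// Registration (assumed, please compare with TestCorrelationBehavior.cs):
//
//   Configure.With(activator)
//       .Transport(t => ...)
//       .Sagas(s => ...)
//       .Options(o => o.Register<ICorrelationErrorHandler>(
//           c => new ThrowingCorrelationErrorHandler()))
//       .Start();
```

If only some message types are required to have matching saga data, the same method could inspect the message body type and throw only for those, leaving the others to be skipped as before.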
I hope this turns out to be useful for you 🙂