Terminate jmxtrans when it runs into an exception
Address one common cause of uncaught exceptions, and also register an uncaught exception handler to cause jmxtrans to exit with a non-zero status rather than getting stuck in an inconsistent state.
Fixes #685.
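For readers following along, here is a minimal sketch of the handler approach described above. It is not the code from this PR: the class name `ExitOnUncaughtException`, the `EXIT_STATUS` value, and the injectable `exitAction` parameter are hypothetical, chosen only so the behavior can be exercised without actually terminating the JVM. The only real API assumed is the JDK's `Thread.setDefaultUncaughtExceptionHandler`.

```java
import java.util.function.IntConsumer;

// Illustrative sketch only; names here are hypothetical, not taken from jmxtrans.
final class ExitOnUncaughtException {

    // Hypothetical exit status; any non-zero value signals failure to the supervisor.
    static final int EXIT_STATUS = 1;

    // Builds a handler that logs the failure and then invokes the given exit action.
    // The exit action is injectable so the handler can be tested without killing the JVM.
    static Thread.UncaughtExceptionHandler handler(IntConsumer exitAction) {
        return (thread, throwable) -> {
            System.err.println("Uncaught exception in thread "
                    + thread.getName() + ": " + throwable);
            exitAction.accept(EXIT_STATUS);
        };
    }

    // Registers the handler JVM-wide, so any thread that dies from an
    // uncaught exception terminates the process with a non-zero status.
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler(handler(System::exit));
    }
}
```

Calling `ExitOnUncaughtException.install()` early in startup would let a process supervisor (systemd, Kubernetes, etc.) notice the failure and restart the process, instead of leaving it running in an inconsistent state.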
:x: Build jmxtrans 265 failed (commit https://github.com/jmxtrans/jmxtrans/commit/b18b51f389 by @afn)
Looks good (minor comments inline). The travis failure seems to be transient (we have a /few/ tests that need /some/ cleaning).
:+1: I've rebased this against ignore-hidden-files so this should merge without conflicts after #710 is merged.
Thanks for the quick review! Once #709, #710, and #712 have been merged, would you mind putting out a new release?
Yep, I can do that. Ping me if I forget!
Is this good to merge, or anything else that needs to be addressed?
This is an important PR. When will it be merged?
Yea, we also have some issues with this problem in our infrastructure. Please make this part of the next release!
This PR is very close to https://github.com/jmxtrans/jmxtrans/pull/731. Does it fix the same problem?
Please merge this! We're running jmxtrans as a sidecar to our kafka & mirrormaker pods, but it often ends up in "zombie mode" after exceptions like those mentioned in this thread. I had to work around it by creating a liveness probe that monitors outgoing network traffic to detect when jmxtrans has stopped writing to our backend (influx). Messy at best!
Is there anything that can be done to help get this over the finish line? We're running into this issue as well.