cassandra-diagnostics
Cassandra Node Diagnostics Tools
Using the Datadog client library breaks complex measurements. When reporting more than 10 measurements, all kinds of strange things happen.
We reverted to one measurement per metric because of Datadog reporter issues. Switch back to complex metric measurements, but group them by package (ThreadPools, Compactions, ClientRequests, ...).
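A minimal sketch of what grouping by package could look like, assuming flat readings keyed by their JMX object name. The class `MetricGrouping`, the method `groupByPackage`, the fallback group `Other`, and the sample object names are all hypothetical and are not part of cassandra-diagnostics. The sketch takes the `type` key of each object name as the package, so one complex measurement could be emitted per group rather than one per metric.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of the project: groups readings by metric package.
final class MetricGrouping {

    /** Groups flat readings (JMX object name -> value) by the "type" key of each name. */
    static Map<String, Map<String, Double>> groupByPackage(Map<String, Double> readings) {
        Map<String, Map<String, Double>> grouped = new TreeMap<>();
        for (Map.Entry<String, Double> reading : readings.entrySet()) {
            ObjectName objectName;
            try {
                objectName = new ObjectName(reading.getKey());
            } catch (MalformedObjectNameException e) {
                throw new IllegalArgumentException("Not a JMX object name: " + reading.getKey(), e);
            }
            String type = objectName.getKeyProperty("type");
            String scope = objectName.getKeyProperty("scope");
            String metric = objectName.getKeyProperty("name");
            // Names without a "type" key land in an assumed catch-all group.
            String group = (type == null) ? "Other" : type;
            // Field name inside the group: "<scope>.<name>", or just "<name>" without a scope.
            String field = (metric == null) ? objectName.getKeyPropertyListString() : metric;
            if (scope != null) {
                field = scope + "." + field;
            }
            grouped.computeIfAbsent(group, g -> new TreeMap<>()).put(field, reading.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Illustrative object names and values only.
        Map<String, Double> readings = new LinkedHashMap<>();
        readings.put("org.apache.cassandra.metrics:type=ThreadPools,path=request,"
                + "scope=CounterMutationStage,name=ActiveTasks", 2.0);
        readings.put("org.apache.cassandra.metrics:type=ThreadPools,path=request,"
                + "scope=CounterMutationStage,name=PendingTasks", 7.0);
        readings.put("org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Timeouts", 1.0);
        groupByPackage(readings)
                .forEach((group, fields) -> System.out.println(group + " -> " + fields));
    }
}
```

Each group stays small, which may keep a single report under the size at which the reporter misbehaves, though that limit would need to be confirmed against the reporter itself.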
Sometimes a diagnostics module will not be initialized on startup. It might be a good idea to have some kind of heartbeat for modules in order to track if all...
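One way such a heartbeat could work, as a rough sketch: every module records a timestamp periodically, and a checker lists modules that never reported or have been silent for too long. The class `ModuleHeartbeats`, its methods, and the module names below are hypothetical and do not exist in the project. Time is passed in explicitly so the logic is easy to test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical heartbeat registry, not part of the project.
final class ModuleHeartbeats {
    private static final long NEVER = Long.MIN_VALUE;

    private final Map<String, Long> lastBeatMillis = new ConcurrentHashMap<>();
    private final long maxSilenceMillis;

    ModuleHeartbeats(Iterable<String> expectedModules, long maxSilenceMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
        // Expected modules start as "never seen", so a module that
        // failed to initialize shows up as silent right away.
        for (String module : expectedModules) {
            lastBeatMillis.put(module, NEVER);
        }
    }

    /** Called by a module to signal that it is alive at the given time. */
    void beat(String module, long nowMillis) {
        lastBeatMillis.put(module, nowMillis);
    }

    /** Returns, in sorted order, modules that never reported or were silent too long. */
    List<String> silentModules(long nowMillis) {
        List<String> silent = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastBeatMillis.entrySet()) {
            long last = entry.getValue();
            if (last == NEVER || nowMillis - last > maxSilenceMillis) {
                silent.add(entry.getKey());
            }
        }
        Collections.sort(silent);
        return silent;
    }

    public static void main(String[] args) {
        // Module names are illustrative only.
        ModuleHeartbeats heartbeats =
                new ModuleHeartbeats(Arrays.asList("metrics", "slow-query"), 60_000L);
        heartbeats.beat("metrics", System.currentTimeMillis());
        System.out.println("Silent modules: "
                + heartbeats.silentModules(System.currentTimeMillis()));
    }
}
```

The list of silent modules could itself be reported as a measurement, so that a module missing after startup becomes visible in the same dashboards as the other diagnostics.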
```
ERROR [metrics-timer] 2017-04-04 08:19:29,732 MetricsCollector.java:133 - Exception while reading attribute Value of type java.lang.Object
javax.management.InstanceNotFoundException: org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=ActiveTasks
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095) ~[na:1.8.0-zing_16.12.3.0]
	...
	at io.smartcat.cassandra.diagnostics.module.metrics.MetricsCollector.collectMeasurements(MetricsCollector.java:124) ~[cassandra-diagnostics-core-1.4.0.jar:na]
	at io.smartcat.cassandra.diagnostics.module.metrics.MetricsModule$MetricsTask.run(MetricsModule.java:66) [cassandra-diagnostics-core-1.4.0.jar:na]
	at java.util.TimerThread.mainLoop(Timer.java:555) [na:1.8.0-zing_16.12.3.0]
...
```
A way to turn connector tracing on and off dynamically needs to be implemented. Exposing that switch through JMX would make it unnecessary to reload the entire diagnostics configuration.
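A possible shape for such a switch, sketched with a standard MBean that exposes a single boolean attribute. The names `TracingToggleSketch`, `ConnectorTracingMBean`, `ConnectorTracing`, and the object name `example.diagnostics:type=ConnectorTracing` are assumptions made for illustration; the project's actual connector and naming may differ. The connector would consult `isTracingEnabled()` before tracing, and an operator could flip the `TracingEnabled` attribute from any JMX client.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.JMException;
import javax.management.JMX;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch, not part of the project.
final class TracingToggleSketch {

    /** Management interface. Standard MBean rules need a public interface named <Class>MBean. */
    public interface ConnectorTracingMBean {
        boolean isTracingEnabled();

        void setTracingEnabled(boolean enabled);
    }

    /** Holds the flag that connector code would check before tracing a call. */
    public static final class ConnectorTracing implements ConnectorTracingMBean {
        private final AtomicBoolean enabled = new AtomicBoolean(false);

        @Override
        public boolean isTracingEnabled() {
            return enabled.get();
        }

        @Override
        public void setTracingEnabled(boolean value) {
            enabled.set(value);
        }
    }

    /** Registers the toggle with the platform MBean server under the given name. */
    static ObjectName register(ConnectorTracing tracing, String objectName) {
        try {
            ObjectName name = new ObjectName(objectName);
            ManagementFactory.getPlatformMBeanServer().registerMBean(tracing, name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException("Could not register " + objectName, e);
        }
    }

    public static void main(String[] args) {
        ConnectorTracing tracing = new ConnectorTracing();
        ObjectName name = register(tracing, "example.diagnostics:type=ConnectorTracing");

        // Simulate an operator flipping the attribute through JMX.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ConnectorTracingMBean remote =
                JMX.newMBeanProxy(server, name, ConnectorTracingMBean.class);
        remote.setTracingEnabled(true);

        System.out.println("Tracing enabled: " + tracing.isTracingEnabled());
    }
}
```

Because the flag lives in an `AtomicBoolean`, a change made through JMX is visible to connector threads immediately, with no configuration reload.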
Use the approach implemented in [reaper](https://github.com/spotify/cassandra-reaper). The idea is to have a repair running on each node.