go.d.plugin
Cassandra module
This PR implements what was defined for issue https://github.com/netdata/netdata/issues/13700
@thiagoftsm I see you are collecting only summary metrics, why?
To simplify the transition for users who are moving from other software to Netdata. Other companies also collect summary metrics.
I very much doubt that. I am pretty sure they collect more than just summary metrics.
Other companies are also collecting summary;
Can you give a link to those companies' collector source code?
cc @shyamvalsan
@ilyam8 the current description in https://github.com/netdata/netdata/issues/13700 leads us to the summary metrics.
@thiagoftsm you get summary metrics on the Cloud overview page.
@thiagoftsm my point is that we would need to delete the summary metrics if we add per-instance metrics, so why add the summary in the first place?
@thiagoftsm @ilyam8 I am not sure I follow this discussion - what is the concern here? what exactly do you mean by "summary" metrics in this context?
Let's take for example this metric
org_apache_cassandra_metrics_table_count
# HELP org_apache_cassandra_metrics_table_count Attribute exposed for management org.apache.cassandra.metrics:name=CompactionBytesWritten,type=Table,attribute=Count
# TYPE org_apache_cassandra_metrics_table_count untyped
org_apache_cassandra_metrics_table_count{keyspace="system_traces",scope="events",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="view_builds_in_progress",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="parent_repair_history",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="repair_history",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="available_ranges",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="built_views",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="indexes",name="LiveDiskSpaceUsed",} 17084.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="role_members",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="available_ranges_v2",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="repair_history",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="role_permissions",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="tables",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="repairs",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="batches",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="role_members",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_traces",scope="events",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="paxos",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="role_permissions",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peer_events_v2",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="view_builds_in_progress",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="types",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="role_members",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="available_ranges_v2",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="transferred_ranges_v2",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peer_events",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_traces",scope="sessions",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="views",name="TotalDiskSpaceUsed",} 16961.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="prepared_statements",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="paxos",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="table_estimates",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="columns",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="triggers",name="TotalDiskSpaceUsed",} 17084.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="functions",name="TotalDiskSpaceUsed",} 17342.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="parent_repair_history",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="IndexInfo",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="indexes",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peer_events_v2",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="prepared_statements",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="network_permissions",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="local",name="LiveDiskSpaceUsed",} 8643.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="size_estimates",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="batches",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="transferred_ranges",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="functions",name="LiveDiskSpaceUsed",} 17342.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peers",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="tables",name="LiveDiskSpaceUsed",} 29641.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="role_permissions",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="triggers",name="LiveDiskSpaceUsed",} 17084.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="repairs",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="functions",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="view_build_status",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="IndexInfo",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="indexes",name="TotalDiskSpaceUsed",} 17084.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="paxos",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="repairs",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="sstable_activity",name="TotalDiskSpaceUsed",} 13230.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="transferred_ranges",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_traces",scope="events",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peer_events",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peers_v2",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="table_estimates",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="resource_role_permissons_index",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="size_estimates",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="view_build_status",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="parent_repair_history",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peer_events",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peer_events_v2",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="transferred_ranges",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="resource_role_permissons_index",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="aggregates",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="prepared_statements",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="columns",name="TotalDiskSpaceUsed",} 35012.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peers_v2",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="transferred_ranges_v2",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="repair_history",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="batches",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="available_ranges",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="built_views",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="table_estimates",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peers",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_distributed",scope="view_build_status",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="views",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peers_v2",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="types",name="LiveDiskSpaceUsed",} 16961.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="local",name="CompactionBytesWritten",} 679.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="roles",name="LiveDiskSpaceUsed",} 5181.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="tables",name="TotalDiskSpaceUsed",} 29641.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="size_estimates",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="local",name="TotalDiskSpaceUsed",} 8643.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="sstable_activity",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="available_ranges_v2",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="resource_role_permissons_index",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="view_builds_in_progress",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="keyspaces",name="TotalDiskSpaceUsed",} 17979.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="keyspaces",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="sstable_activity",name="LiveDiskSpaceUsed",} 13230.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="IndexInfo",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="dropped_columns",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="aggregates",name="TotalDiskSpaceUsed",} 17342.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="columns",name="LiveDiskSpaceUsed",} 35012.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="built_views",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="peers",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_traces",scope="sessions",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="dropped_columns",name="TotalDiskSpaceUsed",} 17822.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="keyspaces",name="LiveDiskSpaceUsed",} 17979.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="compaction_history",name="LiveDiskSpaceUsed",} 27428.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="roles",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="types",name="TotalDiskSpaceUsed",} 16961.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="network_permissions",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="transferred_ranges_v2",name="TotalDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="roles",name="TotalDiskSpaceUsed",} 5181.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="aggregates",name="LiveDiskSpaceUsed",} 17342.0
org_apache_cassandra_metrics_table_count{keyspace="system_traces",scope="sessions",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="compaction_history",name="CompactionBytesWritten",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="compaction_history",name="TotalDiskSpaceUsed",} 27428.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="available_ranges",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="network_permissions",name="LiveDiskSpaceUsed",} 0.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="dropped_columns",name="LiveDiskSpaceUsed",} 17822.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="views",name="LiveDiskSpaceUsed",} 16961.0
org_apache_cassandra_metrics_table_count{keyspace="system_schema",scope="triggers",name="CompactionBytesWritten",} 0.0
Instead of creating a chart per keyspace and scope (with keyspace and scope labels), we create one chart with summary metrics (which would be provided by the ND Cloud overview functionality anyway).
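To make the two options concrete, here is a minimal Go sketch, not the collector's actual code. It parses a few of the exporter lines quoted above and builds both views: one summed value per metric name (the "summary" chart) and one value per keyspace/scope/name (the per-table option). The type `sample` and the functions `parseSamples`, `summarize`, and `perTable` are hypothetical names chosen for this illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// sample is one parsed series line (hypothetical type for this sketch).
type sample struct {
	keyspace, scope, name string
	value                 float64
}

// labelRe matches key="value" pairs inside the {...} label set.
var labelRe = regexp.MustCompile(`(\w+)="([^"]*)"`)

// parseSamples reads text-format lines, skipping comments such as
// "# HELP" and "# TYPE" and any line it cannot parse.
func parseSamples(text string) []sample {
	var out []sample
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		open, end := strings.Index(line, "{"), strings.LastIndex(line, "}")
		if open < 0 || end < open {
			continue
		}
		v, err := strconv.ParseFloat(strings.TrimSpace(line[end+1:]), 64)
		if err != nil {
			continue
		}
		s := sample{value: v}
		for _, m := range labelRe.FindAllStringSubmatch(line[open+1:end], -1) {
			switch m[1] {
			case "keyspace":
				s.keyspace = m[2]
			case "scope":
				s.scope = m[2]
			case "name":
				s.name = m[2]
			}
		}
		out = append(out, s)
	}
	return out
}

// summarize adds up all keyspace/scope series per metric name:
// one dimension per name, regardless of how many tables exist.
func summarize(samples []sample) map[string]float64 {
	sum := make(map[string]float64)
	for _, s := range samples {
		sum[s.name] += s.value
	}
	return sum
}

// perTable keeps one value per keyspace/scope/name, which is what
// per-instance charts would need; it grows with the number of tables.
func perTable(samples []sample) map[string]float64 {
	out := make(map[string]float64)
	for _, s := range samples {
		out[s.keyspace+"."+s.scope+"."+s.name] = s.value
	}
	return out
}

// input holds three lines copied from the exporter output above.
const input = `# TYPE org_apache_cassandra_metrics_table_count untyped
org_apache_cassandra_metrics_table_count{keyspace="system",scope="local",name="LiveDiskSpaceUsed",} 8643.0
org_apache_cassandra_metrics_table_count{keyspace="system_auth",scope="roles",name="LiveDiskSpaceUsed",} 5181.0
org_apache_cassandra_metrics_table_count{keyspace="system",scope="local",name="CompactionBytesWritten",} 679.0
`

func main() {
	samples := parseSamples(input)
	fmt.Println(summarize(samples)["LiveDiskSpaceUsed"]) // 8643 + 5181
	fmt.Println(len(perTable(samples)))                  // one entry per series
}
```

The summary map stays the same size no matter how many keyspaces and tables a node has, while the per-table map grows with every table. That is the trade-off under discussion here.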
OK understood @ilyam8
But would creating charts per keyspace and per scope lead us into the same problem we see on PostgreSQL when there are thousands of tables and indexes? Personally, if there's a way we can capture more info and not suffer performance implications I am all for it.
I think the number of keyspaces is only limited by available memory, but based on feedback from users and the community, the recommended number of keyspaces seems to be <200 or even 150. No idea for scope though.
Also @thiagoftsm there are potentially a LOT of metrics we can collect here, but I think to start with we should limit ourselves to what is mentioned in https://github.com/netdata/netdata/issues/13700. If there is user demand for more, we can address it later. The important thing is to get the collector with basic metrics out to users as soon as we can.
Ah, if collecting only summary metrics is what you think will do for now, then OK.
Now that we have agreed on the metrics, I fixed the issues we had with the last commits.
@thiagoftsm I am going to merge the PR and add adjustments (if needed) after.
And for some reason, you haven't added any readme 🤷‍♂️
Thank you for your help @ilyam8 ! :)
@thiagoftsm I think all the metrics - except one (exceptions - requests for which Cassandra encountered an error) have been added.
Some of the charts need to be organized into sections, but that should be a minor PR.
Let's have a chat about this today.
Hello @shyamvalsan ,
I will take a look, but I remember that I could not find this Exception in the documentation. Do you remember the source of this information?
Which sections? Are you talking about the same sections as in the issue? If yes, we only need to organize families, but this will create a third-level organization in our dashboard.
Best regards!
@thiagoftsm
Exceptions = StorageExceptions I believe. It is the number of internal exceptions caught.
By sections I mean following the organization mentioned in https://github.com/netdata/netdata/issues/13700. The throughput, latency and cache sections are good, but the rest of the charts need to be clubbed together under the following sections: Disk usage, Garbage collection, and Errors, as shown below. This does NOT require a third level.
- Disk usage
  - Load (Disk space used on a node, in bytes)
  - Total disk space used (Disk space used by column family, in bytes)
  - Compaction tasks completed (Total count of completed compaction tasks)
  - Compaction tasks in queue (Total count of pending compaction tasks in queue)
- Garbage collection
  - ParNew count (Number of young-generation collections)
  - ParNew time (Elapsed time of young-generation collection in milliseconds)
  - ConcurrentMarkSweep count (Number of old-generation collections)
  - ConcurrentMarkSweep time (Elapsed time of old-generation collection in milliseconds)
- Errors
  - Exceptions (Requests for which Cassandra encountered an error)
  - Timeout exceptions (Requests not acknowledged within timeout window)
  - Unavailable exceptions (Requests for which required number of nodes was unavailable)
  - Pending tasks (Tasks in queue awaiting a thread for processing)
  - Blocked tasks (Tasks that have not yet been queued for processing)
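As a rough illustration of why this grouping needs only two levels, the Go sketch below assigns each chart from the list a single family string and groups by it. The `chart` struct, the chart IDs, and `groupByFamily` are hypothetical stand-ins for this example, not the module's real chart definitions.

```go
package main

import (
	"fmt"
	"sort"
)

// chart is a simplified, hypothetical stand-in for a collector chart
// definition; only the fields needed to show the grouping are kept.
type chart struct {
	ID     string
	Family string
}

// charts maps every metric in the list above to one of the three
// proposed sections. The IDs are made up for illustration.
var charts = []chart{
	{"load", "disk usage"},
	{"total_disk_space_used", "disk usage"},
	{"compaction_tasks_completed", "disk usage"},
	{"compaction_tasks_pending", "disk usage"},
	{"parnew_count", "garbage collection"},
	{"parnew_time", "garbage collection"},
	{"cms_count", "garbage collection"},
	{"cms_time", "garbage collection"},
	{"exceptions", "errors"},
	{"timeout_exceptions", "errors"},
	{"unavailable_exceptions", "errors"},
	{"pending_tasks", "errors"},
	{"blocked_tasks", "errors"},
}

// groupByFamily returns chart IDs keyed by family: section -> charts,
// which is a two-level layout with no third level involved.
func groupByFamily(cs []chart) map[string][]string {
	groups := make(map[string][]string)
	for _, c := range cs {
		groups[c.Family] = append(groups[c.Family], c.ID)
	}
	return groups
}

func main() {
	groups := groupByFamily(charts)
	families := make([]string, 0, len(groups))
	for f := range groups {
		families = append(families, f)
	}
	sort.Strings(families) // stable output order for printing
	for _, f := range families {
		fmt.Println(f, groups[f])
	}
}
```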
btw it is Cassandra not cassandra 😄

All right, I am working with Ilya to address everything. :)