scylla-manager

Manager has too many metrics

Open amnonh opened this issue 11 months ago • 7 comments

Taken from a cluster with 18 nodes and 540 cores: image

This number of metrics is not useful. It adds too much load for little gain. By default, the number of Manager metrics should grow no faster than the number of nodes.

amnonh avatar Feb 26 '24 13:02 amnonh

@amnonh what is the source of so many metrics? Does it scale with the number of tasks, cores, or something else?

tzach avatar Feb 26 '24 13:02 tzach

All SM backup metrics visible in this picture are labeled by:

  • cluster ID
  • keyspace
  • table
  • host

This would mean that there are about 2381 tables in the mentioned cluster. Is that the case?

Michal-Leszczynski avatar Feb 26 '24 18:02 Michal-Leszczynski

@amnonh @tzach @vladzcloudius @karol-kokoszka so what level of granularity would be ok? Note that sctool progress also returns per host/keyspace/table progress, so maybe it's ok to decrease metric granularity.

Knowing that both backup (per host) and repair (in general) work table by table, maybe it would be OK to drop the host label from those metrics?
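To make the effect of dropping the host label concrete, here is a minimal sketch in plain Python. It does not use the Manager code or any Prometheus client; the sample keys, the byte values, and the `drop_host_label` helper are all hypothetical and only illustrate how per-host series collapse into one series per table.

```python
from collections import defaultdict

# Hypothetical samples keyed by (cluster, keyspace, table, host).
# The values stand in for something additive, e.g. uploaded bytes.
samples = {
    ("c1", "ks1", "t1", "host-a"): 100.0,
    ("c1", "ks1", "t1", "host-b"): 50.0,
    ("c1", "ks1", "t2", "host-a"): 10.0,
    ("c1", "ks1", "t2", "host-b"): 20.0,
}


def drop_host_label(per_host):
    """Sum values across hosts, leaving one series per (cluster, keyspace, table)."""
    per_table = defaultdict(float)
    for (cluster, keyspace, table, _host), value in per_host.items():
        per_table[(cluster, keyspace, table)] += value
    return dict(per_table)


per_table = drop_host_label(samples)
```

In this toy input, four series become two. In general the series count drops by a factor equal to the number of hosts, but it still grows with the number of tables.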

Michal-Leszczynski avatar Mar 01 '24 10:03 Michal-Leszczynski

ping @amnonh , @tzach - what's the verdict here? Let's improve the situation for 3.2.7

mykaul avatar Mar 11 '24 07:03 mykaul

Think about the situation of having a thousand tables on a 60-node cluster; we want to show the repair/backup progress status and pause. Can we do it with ten metrics or fewer? A hundred? If the number of tasks is limited, having it per task is fine. But we must remember that we don't show tasks and tables per user. This level of granularity could be a table in Scylla or a log.
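A rough back-of-the-envelope sketch of that scenario, in Python. The `series_per_metric` function and the granularity names are made up for illustration, and the calculation assumes every table has a series on every node, which may overestimate real deployments.

```python
def series_per_metric(nodes, tables, granularity):
    """Estimate how many time series a single metric name produces.

    granularity is a hypothetical switch:
      "table_and_host": one series per (table, host) pair
      "table":          one series per table
      "host":           one series per host
      "cluster":        one series for the whole cluster
    """
    if granularity == "table_and_host":
        return nodes * tables
    if granularity == "table":
        return tables
    if granularity == "host":
        return nodes
    if granularity == "cluster":
        return 1
    raise ValueError(f"unknown granularity: {granularity}")


# The scenario from this comment: a thousand tables on a 60-node cluster.
per_table_and_host = series_per_metric(60, 1000, "table_and_host")  # 60000
per_host = series_per_metric(60, 1000, "host")                      # 60
per_cluster = series_per_metric(60, 1000, "cluster")                # 1
```

Under these assumptions, only the per-host and per-cluster levels stay within the "proportional to the number of nodes" budget asked for at the top of the issue.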

amnonh avatar Mar 11 '24 09:03 amnonh

The agreement is to disable Manager per-table and per-node metrics by default. In other words, only cluster-level metrics are exposed by default.

tzach avatar Mar 11 '24 10:03 tzach

Optimistically setting this to 3.2.7 - if we miss it, that's OK.

mykaul avatar Mar 13 '24 12:03 mykaul