tempo
Add capacity planning guides
Is your feature request related to a problem? Please describe.
We have good SLO alerts. These alerts are great at telling a user when things are breaking, but we'd like additional alerts that are proactive rather than reactive. Let's build an alert using internal metrics that suggests when to scale ingesters.
Describe the solution you'd like
Spend some time determining a good measure of the amount of work that each ingester is doing and build an alert around it. Perhaps total active traces held in memory?
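To make the idea concrete, here is a minimal sketch of the kind of condition such an alert could evaluate, assuming active traces per ingester turns out to be a usable measure of work. The function name, the `max_traces_per_ingester` budget, and the `headroom` fraction are all hypothetical; the budget would have to be determined empirically per deployment, and none of these are existing Tempo settings.

```python
from statistics import mean


def should_scale_ingesters(live_traces_by_ingester, max_traces_per_ingester, headroom=0.8):
    """Suggest scaling when the average ingester holds more than
    headroom * max_traces_per_ingester active traces.

    Every parameter here is an assumption made for illustration.
    """
    if not live_traces_by_ingester:
        raise ValueError("need at least one ingester sample")
    average = mean(live_traces_by_ingester.values())
    return average > headroom * max_traces_per_ingester


# Hypothetical samples: active traces currently held in memory per ingester.
samples = {"ingester-0": 90_000, "ingester-1": 85_000, "ingester-2": 95_000}
print(should_scale_ingesters(samples, max_traces_per_ingester=100_000))
```

Using the average rather than the maximum is a design choice: it signals that the whole ring is running hot, whereas a single overloaded ingester more likely points to uneven distribution than to a need for more capacity.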
Update: We're looking at creating/updating a GET sizing calculator spreadsheet for capacity planning. The current spreadsheet will need to be redone to account for Parquet anyway.
Customers are requesting this information. We may need this for Tempo 2.0 (for ObsCon).
Given spans/sec and bytes/sec as inputs, we need capacity planning guidelines that can help with cluster sizing.
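A sizing calculator along these lines could start from something like the sketch below. It is only an illustration of the arithmetic: the per-ingester capacity figures, the replication factor, and the target utilization are inputs the operator would have to measure or choose, and the function name and parameters are hypothetical rather than anything Tempo ships.

```python
import math


def estimate_ingesters(
    spans_per_sec,
    bytes_per_sec,
    spans_per_sec_per_ingester,   # assumed: measured capacity of one ingester
    bytes_per_sec_per_ingester,   # assumed: measured capacity of one ingester
    replication_factor=3,         # assumed: each write lands on this many ingesters
    target_utilization=0.75,      # assumed: fraction of capacity to plan for
):
    """Estimate an ingester count from ingest rate, taking whichever
    dimension (spans or bytes) demands more capacity."""
    by_spans = (spans_per_sec * replication_factor) / (
        spans_per_sec_per_ingester * target_utilization
    )
    by_bytes = (bytes_per_sec * replication_factor) / (
        bytes_per_sec_per_ingester * target_utilization
    )
    return max(math.ceil(by_spans), math.ceil(by_bytes))


# Illustrative numbers only, not measured Tempo figures.
count = estimate_ingesters(
    spans_per_sec=100_000,
    bytes_per_sec=50_000_000,
    spans_per_sec_per_ingester=50_000,
    bytes_per_sec_per_ingester=10_000_000,
)
print(count)
```

With these made-up inputs the byte rate is the limiting dimension, which is why taking the maximum of the two estimates matters.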
It would also be good to have a list of the current metrics we can watch to determine this.
This could be part of the capacity calculator that we've had discussions about. @annanay25, would you be willing to update the description to match our current work?
This issue has been automatically marked as stale because it has not had any activity in the past 60 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed after 15 days if there is no new activity. Please apply keepalive label to exempt this Issue.