autoscaling
autoscaling copied to clipboard
agent: Support fetching LFC metrics (but don't use them yet)
Probably best to review each commit separately.
Most of the commits here are just repositioning to make the final commit, which adds LFC metrics collection, as simple as possible.
~~Builds on #892, must not be merged before then.~~
Discussion from the 1-1 meeting: scraping sql_exporter from the autoscaling-agent is the right way to go. I'll add a separate sql_exporter solely for autoscaling metrics. For now, we can just scrape the current one with a reasonable interval.
Discussion with @Omrigan : It's fine to just hard-code the metric names, and that makes it a lot simpler. Some discussion about whether to encapsulate the callbacks for metrics collection into a single struct with func(...) fields or an interface.
New changes:
- Move metric names into the code (rather than config) and remove the generics it required. Diff here:
4b06d09..0bb2b59. - Fixed a bug where we'd always use the "system" metrics port, even for LFC metrics: da48a4cd3f89ef8eaceaf620e7e6faff781aef77 (NOTE: this was from reading the code; I haven't yet tested it)
- Switch from unnamed callbacks to a new
metricsMgr[M]struct to contain them and thekindstring: 5dc98a95675fcb6573278834a688b123eda89149 - Switch from the
emptyMetrics()constructor callback to a generics-based approach: 031cc2d13a74e4c232122053404b6119cae04f47- cc @Omrigan particularly want your feedback on this one. I could go either way on this -- left it as a separate commit so it's easy to revert.
- Deduplicate metrics config parsing: 86cb9f4a277a5357f95920a82d25b074eef3e60a + c1ae8d3b1ed5c2badcf0d3873b939b6be70137a9
edit: And the first commit has been spun off into a separate PR: #948.
Continuing with the changelog approach, if only because it helps me manage the rebasing:
- Reverted 031cc2d13a74e4c232122053404b6119cae04f47 ("tmp: Switch from emptyMetrics() callback to generics"): 60f835b5fc21bc867d8b3a466f1896a22d44f6dc
- Squashed everything back down:
60f835b..666e11f - Rebased onto main: 666e11f02bae311c0ac57dbe129e42f1487262db -> 34afd60dc06f5db3551ae1893054716a65cf5b2e
- Renamed
newMetrics->updateMetrics: cb14dafe0dcefdb61bec02e365c9626af81b2cfa+b9e3ebd1ea51a7d3730b965d3ecd77c0318776f4 - Fix outdated comment on
FromPrometheus: 662613d170363fa9cf240e5a0225e0c2f1c09674 - Renamed the
c/lcontext/logger variables toctx2/logger2: efdda7c819bbfe1a5178b884d9803438b5cebba1 - Remove 5-minute load average from the system metrics names: 589f6b7485a97732d1297610338f7e5b6e1a0833
- Rework system metrics string constants: 66b92893a804fc0f17b84f7c67358d9610573cb1
Plus a couple more changes to #948 based on review here.
Final changes:
- Squash previous commits. Diff here (empty):
66b9289..20bd2ba - Rebase onto main (dropping #948 from this branch). Diff here:
20bd2ba..fb2f910 - Fix compile error from rebase: 689e18d0d66b935f9c0ffba6cae5df2e4f8f8306
- Change
metricsMgr[M any]->metricsMgr[M core.FromPrometheus]: d0991a9d2447ff1c2a37e0cf2241422df51dea52
Waiting on some issues with latest release. Will hold off on merging for now to avoid accidentally adding much larger changes alongside a small fix.