opentelemetry-go
Metrics memory leak v1.12.0/v0.35.0 and up
Description
After upgrading from v1.11.2/v0.34.0 to v1.12.0/v0.35.0, I noticed a never-ending increase in our pods' memory consumption over time. Reverting to the former version brought memory use back to stable. Once v1.13.0/v0.36.0 was released I tried upgrading to that version, but got the same result. Once again, reverting to v1.11.2/v0.34.0 stopped the memory increase.
Refer to the screenshot below that shows memory use of one of our pods on AWS EKS, specifically:
- On 1/31/23 I upgraded to v1.12.0/v0.35.0, and you can instantly see memory start to grow.
- On 2/6/23 I made an unrelated release, which reset memory use but it immediately started growing again.
- On 2/8/23 I downgraded to v1.11.2/v0.34.0, after which memory use stayed stable again.
- On 2/14/23 I upgraded to v1.13.0/v0.36.0, and once again saw memory increase. A few new releases made afterwards temporarily reset memory use, as before.
- On 2/21/23 I downgraded to v1.11.2/v0.34.0, restoring stable memory use.
Environment
- OS: Linux
- Architecture: AWS EKS
- Go Version: 1.19
- opentelemetry-go version: v1.12.0/v0.35.0 and up
Steps To Reproduce
- Upgrade to v1.12.0/v0.35.0 or higher.
- Update metrics from `syncint64` etc. to their corresponding `instrument` versions (migration sketched below).
- See memory grow over time.
Expected behavior
Memory stays stable.
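For illustration, the metrics update mentioned in the reproduction steps was essentially the following kind of change. This is a minimal sketch against the v0.35.0/v0.36.0 API we were using at the time, and the instrument name is made up, not one of our real instruments:

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric/global"
)

func main() {
	ctx := context.Background()
	meter := global.MeterProvider().Meter("example")

	// Before (v0.34.0): instruments came from per-kind providers, e.g.
	//   counter, err := meter.SyncInt64().Counter("health.check.count")
	// After (v0.35.0+): instruments are created directly on the meter.
	counter, err := meter.Int64Counter("health.check.count")
	if err != nil {
		panic(err)
	}

	// Recording is unchanged.
	counter.Add(ctx, 1, attribute.String("path", "/health"))
}
```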
What telemetry are you generating over these time periods? What instruments are you using and what attributes are you using?
Every 5s a health/live check is performed, which uses an int64 counter (+1) and an int64 histogram (+duration). Each function call does the same, with differently named counters/histograms.
The non-health/live check functions add a few constant attributes, such as package name, but those functions are not currently called frequently - perhaps once a day. The health/live check function, which is the one contributing 99.9% of metrics, adds just the request path (either `/health` or `/live`) as an attribute.
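In code terms, that instrumentation is roughly of the following shape. This is a simplified sketch against the v0.35.0/v0.36.0 API, not our actual service code, and the instrument names are illustrative:

```go
package checks

import (
	"context"
	"time"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// RunChecks records one counter increment and one duration sample per check,
// every 5 seconds, with only the request path as a varying attribute.
func RunChecks(ctx context.Context, meter metric.Meter) error {
	counter, err := meter.Int64Counter("health.check.count")
	if err != nil {
		return err
	}
	duration, err := meter.Int64Histogram("health.check.duration")
	if err != nil {
		return err
	}

	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			for _, path := range []string{"/health", "/live"} {
				start := time.Now()
				// ... perform the actual health or liveness check here ...
				attr := attribute.String("path", path)
				counter.Add(ctx, 1, attr)
				duration.Record(ctx, time.Since(start).Milliseconds(), attr)
			}
		}
	}
}
```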
Are you able to obtain and share pprof heap dumps from the process when memory use is elevated?
I'll need a few days to find some time to enable pprof, update the package version, and capture the heap dumps - stay tuned.
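(For reference, exposing the pprof endpoints only requires the standard library handler; a minimal sketch, with an arbitrary listen address:)

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// Heap profiles can then be fetched with, for example:
	//   go tool pprof http://localhost:6060/debug/pprof/heap
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```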
Over the course of some 6 hours I took 5 heap dumps using pprof, see attachment. In those 6 hours only the live and health checks were called every 5 seconds, which resulted in ~4 counters and ~4 histograms being updated with constant attributes (namely, a string containing the suffix of the URL - either /livez or /healthz - and a boolean indicating whether the call was a success - which it always was).
There are some functions worth looking at, it seems: `prometheus.MakeLabelPairs` and `attribute.computeDistinctFixed`.
Image of the above data using `go tool pprof`:
I deployed a fresh instance and will let it run until Monday so I can collect a new sample to give you even more to work with. If there are other pprof extracts you're interested in let me know, and I'll see what I can do.
If you could get data with `-inuse_objects` as well, that could be helpful.
My initial thoughts:
- We should look into whether we can provide a reusable `Sortable` during attribute set collection (sketched below).
- We might need to double check that we are creating and recording Prometheus data efficiently.
- The old SDK's ability to record data for a batch of attributes might need to be re-added (this is something Java kept).
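On the first point, the idea would be roughly as follows (a sketch only, not an actual SDK change): reuse the scratch `Sortable` via a `sync.Pool` so building each attribute set does not allocate a fresh one.

```go
package internal

import (
	"sync"

	"go.opentelemetry.io/otel/attribute"
)

// sortablePool recycles the scratch slice that attribute.NewSetWithSortable
// sorts into, avoiding one allocation per attribute set.
var sortablePool = sync.Pool{
	New: func() any { return new(attribute.Sortable) },
}

// newAttrSet builds an attribute set using a pooled Sortable.
func newAttrSet(kvs ...attribute.KeyValue) attribute.Set {
	s := sortablePool.Get().(*attribute.Sortable)
	defer sortablePool.Put(s)
	return attribute.NewSetWithSortable(kvs, s)
}
```

`attribute.NewSetWithSortable` already accepts a caller-provided `Sortable`, so the change would mostly be about plumbing a pooled value through the aggregation code.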
I'm interested to see what `inuse_objects` returns. That should help determine what things are allocated.
I called the tool with `inuse_objects` followed by `top` to get the highest memory consumers, and got the following:
go tool pprof --inuse_objects https://[host]:[port]/debug/pprof/heap
Fetching profile over HTTP from https://[host]:[port]/debug/pprof/heap
/pprof.main_image.binary.alloc_objects.alloc_space.inuse_objects.inuse_space.001.pb.gz
File: main_image.binary
Type: inuse_objects
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 704254, 99.21% of 709865 total
Dropped 46 nodes (cum <= 3549)
Showing top 10 nodes out of 47
flat flat% sum% cum cum%
496994 70.01% 70.01% 496994 70.01% github.com/prometheus/client_golang/prometheus.MakeLabelPairs
76459 10.77% 80.78% 76459 10.77% strings.(*Builder).grow
32770 4.62% 85.40% 32770 4.62% go.opentelemetry.io/otel/sdk/metric/internal.newBuckets
32768 4.62% 90.02% 32768 4.62% strconv.formatBits
24588 3.46% 93.48% 24588 3.46% go.opentelemetry.io/otel/attribute.computeDistinctFixed
14044 1.98% 95.46% 14044 1.98% github.com/prometheus/client_golang/prometheus.processMetric
10923 1.54% 97.00% 10923 1.54% github.com/prometheus/client_golang/prometheus.populateMetric
6150 0.87% 97.86% 6150 0.87% runtime.allocm
5461 0.77% 98.63% 5461 0.77% runtime.main
4097 0.58% 99.21% 4097 0.58% go.opentelemetry.io/otel/sdk/metric/internal.(*cumulativeHistogram[...]).Aggregation
(pprof)
Another update:
Showing nodes accounting for 1653226, 98.38% of 1680435 total
Dropped 54 nodes (cum <= 8402)
Showing top 10 nodes out of 45
flat flat% sum% cum cum%
988529 58.83% 58.83% 988529 58.83% github.com/prometheus/client_golang/prometheus.MakeLabelPairs
316763 18.85% 77.68% 316763 18.85% strings.(*Builder).grow
73730 4.39% 82.06% 73730 4.39% github.com/prometheus/client_golang/prometheus.(*constHistogram).Write
65537 3.90% 85.96% 65537 3.90% net/textproto.(*Reader).ReadMIMEHeader
65536 3.90% 89.86% 65536 3.90% strconv.formatBits
45059 2.68% 92.54% 45059 2.68% go.opentelemetry.io/otel/sdk/metric/internal.newBuckets
41892 2.49% 95.04% 41892 2.49% go.opentelemetry.io/otel/attribute.computeDistinctFixed
23412 1.39% 96.43% 97145 5.78% github.com/prometheus/client_golang/prometheus.processMetric
21845 1.30% 97.73% 21845 1.30% sort.Strings
10923 0.65% 98.38% 10923 0.65% github.com/prometheus/client_golang/prometheus.populateMetric
This seems to indicate that the memory use is coming from the Prometheus client used by the Prometheus exporter. I'm not sure how much we can do to alleviate that given we do not control that code.
We should double check that we are supporting the optimal label-creation path as best we can.
@dashpole is there anything that seems obvious to you?
Are you able to provide the contents of curling the prometheus endpoint? Reading the implementation, I don't see anything that would obviously be the culprit for this.
I think we're getting close to the solution. Curling the `/metrics` endpoint gave me "http_server" count and histogram metrics repeated thousands upon thousands of times (until I quit the command). See the attached text file where I grabbed the first 1000 records.
The main difference between each repeated block of the "http_server" metrics is in the attributes: the `net_sock_peer_port` keeps increasing. These metrics are not created by my app - they seem to be built-in.
This looks related to https://github.com/open-telemetry/opentelemetry-go/issues/3744
The peer port should not be on a server metric.
What is your current suggestion in my situation? Hold off on updating until #3744 is fixed, or do I need to change something on my end?
You could create a view for the instrument that filters out the attribute. Otherwise, yeah, I think the fix to #3744 is in order to fully remove the high cardinality attribute.
@bthomee Would it be possible for you to provide the setup code you use in your service that creates the exporter and meter provider?
The following extract shows how we set up the exporter and meter provider:
// Create the metric exporter.
logger.Info("Creating metric exporter.")
metricExporter, err := prometheus.New()
if err != nil {
logger.Panics(err)
}
// Create a view that filters out a number of dynamic attributes.
filteredView := metric.NewView(
metric.Instrument{Name: "http.server.*"},
metric.Stream{
AttributeFilter: HttpAttributeFilter,
},
)
// Create a metric provider using the exporter and filtered view.
metricProvider := metric.NewMeterProvider(
metric.WithReader(metricExporter),
metric.WithView(filteredView),
)
// Register the meter provider, so elsewhere we can call `global.MeterProvider()` to access it.
global.SetMeterProvider(metricProvider)
// Wrap the registry by a handler, so that collected metrics can be exported via HTTP.
metricHandler := promhttp.Handler()
The new addition we made, based on @MrAlias' suggestion, is adding that filtered view, which has worked like a charm.
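For anyone landing here with the same problem: the `HttpAttributeFilter` above is just an `attribute.Filter`. A hypothetical version (not our exact implementation) that drops the problematic per-connection attributes could look like this:

```go
package telemetry

import "go.opentelemetry.io/otel/attribute"

// HttpAttributeFilter reports whether an attribute should be kept on the
// http.server.* instruments. Hypothetical example; the real filter may also
// drop other keys.
var HttpAttributeFilter attribute.Filter = func(kv attribute.KeyValue) bool {
	switch kv.Key {
	case "net.sock.peer.port", "net.sock.peer.addr":
		return false // drop the unbounded per-connection attributes
	default:
		return true
	}
}
```

Returning false drops the attribute from the aggregated stream, so the exporter never sees the per-port series.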
Was this resolved in 1.17.0?