specification
specification copied to clipboard
Supporting Linux Perf events with Opentracing
This issue is to discuss a proposal on adding perf events support within a span. Perf events (https://perf.wiki.kernel.org/index.php/Tutorial#Events) are platform side counters which can provide valuable insights on what is happening inside a span and can help us in identifying resource bottlenecks.
We would like to hear your thoughts and suggestions on the same.
We did a small proof-of-concept implementation with opentracing-go and zipkin-go-opentracing to show how the eventual integration may look.
Opentracing-go api will have a PerfString event as an addition to the span creation function. And, the backend Zipkin collector will be responsible for starting, reading, logging and closing the required counter.
To try out the opentracing-go with perfevents, this commit can be used : https://github.com/platform-tracing/opentracing-go/commit/9391ff11ee1865a7727f94d0113b734d1f454dca And, to try out zipkin with perfevents, this commit can be used : https://github.com/platform-tracing/zipkin-go-opentracing/commit/1f9fb2afcfe3412edc30fd4fa22705b495d64403
A "perfevents" library has been added to zipkin-go-opentracing which assists in creating, maintaining and reading the perf events like cpu-cycles, instructions, cache-misses etc. An example usage in a sample application: var eventName opentracing.PerfString eventName = "cpu-cycles" parent := tracer.StartSpan("Parent", eventName)
And, we can see the output as shared in the attachment.

This is fun stuff, thank you for bringing it up.
@hkshaw1990 I think the crux of this is finding a way to record only the performance data that's relevant when the associated Span is "active" or in the foreground. This reminds me of the https://github.com/opentracing-contrib/java-spanmanager work that @sjoerdtalsma did... the tricky thing is that a Span is not necessarily "active" just because it hasn't been finish()ed yet. Have you thought through this aspect of things yet?
Hi @bensigelman, thanks for referring to this point. I agree, that a span is not necessarily active even if it hasn't finished. But, the way the perf events are opened is to associate these counters with the (self) process. And a span is running in the context of a process. So, when a span is inactive, probably due to some I/O call, n/w call etc, then, the process is inactive too. So, the counters won't count for those inactive periods. The active/inactive period can be inferred by looking at the cpu-cycles consumed by the span over a given period of time. I hope that addresses your concerns.
Hi @bensigelman to add to the above comment, I want to add something more here. There are multiple ways on how we can start a perf event within a span.
One is to create the event for a process (as mentioned in the above comment), wherein, the event will be responsible to count only for the process that is being monitored. Whenever, the process schedules out of the cpu, the event stops counting.
There are other useful perf events which can be used to monitor the system, and is not necessarily attached to a single process. These kind of events need root access to be created. What this implies is they will need a different way to be created than the above approach. We will need to look into this to find out a way to monitor these events too.
With that said, the provided commits and git branches in comment no.1 implement the first approach of opening events for the self process (which works as it is, we don't need any special access permissions).
Also, note that there is a certain overhead in creating and reading these events. One simple example that we tried over multiple runs with a simple span with and without monitoring a perf event : Time spent in the span without perf event ~ 8 us (microseconds) Time spent in the span with a perf event ~ 17 us (microseconds).
Thoughts/Suggestions?
@hkshaw1990 sorry for the delay, I'm too busy these days :-/
For languages that support/encourage concurrency (Go, Java, C++, arguably Node.js), process-level measurements are better than nothing but not as helpful as stats for the current thread / goroutine / etc.
As for actually implementing this: it seems as though you were imagining this would be turned on or off on a per-Span basis. Another approach might be to enable it for the entire Tracer, perhaps by creating a Tracer adapter that wraps an existing implementation and adds these tags to all Spans, or perhaps to all Spans that take longer than X µs. I might suggest that instead, as users would presumably be more interested in these numbers for Spans that took too long.
As discussed with @bensigelman (please point out if I missed something) on gitter, here is a brief summary : The perfevents are started at the beginning of the span (if specified) and closed at the end of the span. This gives us the data for that span. However, another span (or spans) can be started on the same machine by the root span (for which some perfevents are created). This will cause the perfevents to also count for the new spans and account that events data in the parent span. This will be consistent with the time (latency) taken by the span but it doesn't give a precise perfevents data for that span (since, there might be double counting). Another suggestion from @bensigelman was to log the perfevents data only when a span crosses a certain threshold latency. That way the users won't be overwhelmed by the amount of data they log. Rather, they can restrict it to spans which cause real delays.
The above suggestions are being planned to be taken care of in the subsequent versions. Have sent some basic PRs to support perfevents to these repos : https://github.com/opentracing/opentracing-go/pull/135 https://github.com/openzipkin/zipkin-go-opentracing/pull/50 A new repo has been created (thanks to @bensigelman) to track the perfevents package : https://github.com/opentracing-contrib/perfevents and here is the first PR to support basic hardware events : https://github.com/opentracing-contrib/perfevents/pull/1