aerospike-client-csharp

Add tracing

Open ogxd opened this issue 2 years ago • 5 comments

Context

Tracing is a very powerful concept for modern backends and has become a basic building block of observability, just as logs and metrics are today.

In dotnet, distributed tracing was pushed forward in .NET 6 with the introduction of the System.Diagnostics.Activity APIs in the base class library. See this article and the official documentation.

Instrumentation was progressively introduced for the most common protocols over the last two years, starting with HttpClient and followed by Grpc.Net.Client (see an example from the source code).
Given the critical place Aerospike has in many backends, it would be more than welcome for this client to implement the Activity APIs.

How

The Activity API is quite bare-bones and leaves a lot of freedom as to how to use it. OpenTelemetry (part of the CNCF) proposes naming conventions that are already widely adopted by the industry (Grpc.Net.Client uses them).

  • The basic idea is to create a new Activity for every I/O operation (with retries, this means one Activity per attempt).
  • Tags must be added to carry information such as: which node was called? What kind of operation is it? What is the key? Is it the first attempt? ...
  • Activity.Current uses AsyncLocal. This is an important detail for understanding how the API works under the hood.
  • In case of error, the status of the ongoing Activity must be set to error.
  • When there is no listener, ActivitySource.StartActivity(...) returns null. This means there is essentially no performance cost when no trace/listener is active (it makes no difference for people who don't use tracing).
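
The points above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR or the client's actual API: the source name `Example.Aerospike.Client`, the `TracedClient` helper, its parameters, and the `example.*` tag names are all hypothetical, and the `db.*` / `net.peer.name` tags only loosely follow the OpenTelemetry database conventions, whose exact names vary between versions of the spec. It assumes .NET 6 or later (for `Activity.SetStatus`).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TracedClient
{
    // Hypothetical source name; listeners subscribe to a source by name.
    public static readonly ActivitySource Source =
        new ActivitySource("Example.Aerospike.Client");

    // Wraps a single I/O attempt in its own Activity.
    // The parameters are illustrative, not the real client's signature.
    public static async Task<T> TraceAttemptAsync<T>(
        string operation, string nodeName, string key, int attempt, Func<Task<T>> io)
    {
        // StartActivity returns null when no listener samples this source,
        // so every access below is null-conditional.
        using var activity = Source.StartActivity(operation, ActivityKind.Client);
        activity?.SetTag("db.system", "aerospike");   // assumed tag value
        activity?.SetTag("db.operation", operation);
        activity?.SetTag("net.peer.name", nodeName);
        activity?.SetTag("example.key", key);         // hypothetical tag name
        activity?.SetTag("example.attempt", attempt); // hypothetical tag name
        try
        {
            return await io();
        }
        catch (Exception ex)
        {
            // Mark the ongoing Activity as failed before it is stopped.
            activity?.SetStatus(ActivityStatusCode.Error, ex.Message);
            throw;
        }
    }

    // Each retry goes through TraceAttemptAsync again, so it gets a new Activity.
    public static async Task<T> WithRetriesAsync<T>(
        string operation, string nodeName, string key, int maxAttempts, Func<Task<T>> io)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await TraceAttemptAsync(operation, nodeName, key, attempt, io);
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow and loop: the next attempt is traced separately.
            }
        }
    }
}
```

Without a listener, `TracedClient.Source.StartActivity(...)` returns null and the `SetTag` / `SetStatus` calls are skipped. An application (or an exporter such as the OpenTelemetry SDK) opts in by registering an `ActivityListener` whose `ShouldListenTo` matches the source name and whose `Sample` callback returns, for example, `ActivitySamplingResult.AllData`.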

ogxd, Mar 29 '23 15:03

See PR #80

ogxd, Mar 29 '23 15:03

We like this idea. We will put it in our next release, pending our testing of it.

shannonklaus, Mar 30 '23 16:03

Awesome!
Here is an example of what it looks like with the PR (we have been testing this fork on our end).
[image]
This UI is from Grafana Tempo, but it would look similar in any other trace visualization tool that supports the OpenTelemetry semantic conventions (Datadog APM, Jaeger, LightStep, Zipkin, ...).

ogxd, Mar 31 '23 12:03

That's pretty cool. Any plans to include this, @shannonklaus?

guillaumed-unity, Oct 17 '23 17:10

Yes, I am still planning on including this at some point. It has been outprioritized by other features, but it is still on my radar.

shannonklaus, Oct 17 '23 18:10