elasticsearch-ruby
elasticsearch-ruby copied to clipboard
Adding tracing?
Hi! I'm working on instrumenting this library so that it can generate OpenTelemetry trace spans (or likely instrumenting -transport or -api, depending on how things go 😄 ). I was hoping for a little guidance from this repo's maintainers!
Some background: for a lot of gems that we instrument in opentelemetry-ruby, we just monkey patch the library's classes directly. Check out Bunny as an example (here's where we do the monkey patching). The monkey patching strategy works very well when there's a short list of choke points to instrument.
The logical choke point in elasticsearch-ruby and its related gems is perform_request, AFAICT. This method sends an HTTP request to the Elasticsearch instance. Within perform_request method, we know the path that’s being requested, the request method, the request params, and the body of the request. This makes for a great trace! The only downside is that there’s no semantically meaningful operation name available (e.g. “create_index” or “get_doc”). If possible, we want the span name to be a meaningful, low-cardinality value (see spec):
The span name SHOULD be set to a low cardinality value representing the statement executed on the database. It MAY be a stored procedure name (without arguments), DB statement without variable arguments, operation name, etc.
As I see it, there's a few ways to generate a semantically meaningful span name:
- We could monkey patch
perform_request, then use regexes to detect commonly used operations in the path arg and basically build up a hash of{semantically_meaningful_operation_name: regex, semantically_meaningful...}and match the request path against it. This is essentially what the Python folks have done. - We could lean into Elasticsearch's REST-iness and name the span something like
Elasticsearch: GET /myindex/_doc/:id - We could monkey patch each method within
elasticsearch-apito add a trace. This seems pretty cumbersome to implement and would likely be a maintenance nightmare. - We could add tracing capabilities to
elasticsearch-rubyand/or its constituent gems. Ruby's graphql gem is a great example of this. This tracing could basically be a noop for all users except those that have opted in.
😅 phew that was a lot! Curious if any of these options make more sense than others to this library's maintainers/people that know more about Elasticsearch then me. I could likely contribute a patch for the last option, if the repo would accept it 😄 .
Any thoughts here @picandocodigo? Happy to keep things contained in opentelemetry-ruby (i.e. do option 1 or 2) but if option 4 is workable, that would probably be cleaner 😄
Hi @plantfansam, Sorry, I've had quite a busy couple of weeks! Let me look into this in detail and talk with the team and I'll get back to you next week, thanks for your patience!
Hi @plantfansam, Sorry, I've had quite a busy couple of weeks! Let me look into this in detail and talk with the team and I'll get back to you next week, thanks for your patience!
No problem! Appreciate you taking a look 😄!
Hey @plantfansam, I thought I'd chime in here, after having worked on the instrumentation of this gem for the Elastic APM ruby agent.
Here is our implementation
Our instrumentation needs to be updated as it currently only supports < v8.0 of the Elasticsearch ruby client. But perhaps it's helpful to see how we used the method and path to give a name to the span.
Emily, it's been awhile! I would be great if the Elastic team could consider contributing an instrumentation adapter to the OpenTelemetry project.
With the release of elasticsearch-ruby 8.11.0, native support for OpenTelemetry is available in the client, written by @estolfo. It follows the Semantic Conventions for Elasticsearch. There's more information on this in Using OpenTelemetry. Thanks @plantfansam!