opentelemetry-ruby-contrib icon indicating copy to clipboard operation
opentelemetry-ruby-contrib copied to clipboard

feat: Support prepend SQL comment for PG instrumentation

Open keiko713 opened this issue 1 year ago • 14 comments

Currently, extract_operation method is simply extracting the first word before the space as an operation (as commented, this was borrowed from opentelemetry-js-contrib's pg instrument).

This works okay for most of queries, however, when Active Record Query Logs (known as marginalia) is used with the option of ActiveRecord::QueryLogs.prepend_comment = true, the query will start with a query comment and this extracting logic fails, making a span name as simply database name and leaving the operation empty.

This PR adds a simple logic to ignore any prepend query comment (with using /* comment */ way, not supporting -- comment type comment). As Active Record Query Logs became part of Rails, we may see more and more folks use this likely with prepend_comment = true option, so it would be nice to support this with some minimal change.

Fixes https://github.com/open-telemetry/opentelemetry-ruby-contrib/issues/452

keiko713 avatar Oct 12 '23 11:10 keiko713

CLA Signed

The committers listed above are authorized under a signed CLA.

  • :white_check_mark: login: keiko713 / name: Keiko Oda (676425c5300a6c2eb70f8b61df9e654226781599)
  • :white_check_mark: login: arielvalentin / name: Ariel Valentin (d9bf28137d8d6c4bfefc4f691ad7f69c11ddc921, 031fdcaaf85d6b1cb8a6551af4b7db94d38d7802)
  • :white_check_mark: login: kaylareopelle / name: Kayla Reopelle (she/her) (ca71e53f0ea46069a38b5daff6b625abcbf04b31, 515df0d6dd7af29bf9630821a51ff62f9a164b25, 063ccad4a3e18c3df3b8437cf50f6f8c1f5006b3)

I'm reading the specification for what db.operation should be, and wondering if this regex could be considered as "client-side parsing".

https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/database/#call-level-attributes

When setting this to an SQL keyword, it is not recommended to attempt any client-side parsing of db.statement just to get this property, but it should be set if the operation name is provided by the library being instrumented. If the SQL statement has an ambiguous operation, or performs more than one operation, this value may be omitted.

Given that it's a small simple one, I hope that it's okay to tweak like this (this really helps readability of the tracing so I'd love to have this personally), but I understand if that's not the case. I'd appreciate any inputs!

keiko713 avatar Oct 13 '23 05:10 keiko713

Thanks for your contribution.

We've got work already underway to address this problem in a separate PR https://github.com/open-telemetry/opentelemetry-ruby-contrib/pull/529

Would you be interested in helping us do some exploratory testing there?

arielvalentin avatar Oct 13 '23 12:10 arielvalentin

Ariel, thank you for sharing that PR. I wasn't aware of it, and I'm excited to see that there is already a PR for this! I'm more than happy to assist in any way I can.

It appears that this PR is quite big and has a lot of history. You mentioned the need for some exploratory testing, but could you please provide more details about how I can help for this? Thanks!

keiko713 avatar Oct 17 '23 11:10 keiko713

Hi @keiko713. The main outstanding issue with that PR is exactly what you've addressed here. See https://github.com/open-telemetry/opentelemetry-ruby-contrib/pull/529#discussion_r1355160854. I'd really appreciate it if you could port your fix to that PR. 🙏

fbogsany avatar Oct 17 '23 15:10 fbogsany

Thanks for the pointer, @fbogsany ! I took a look that PR deeper and now have several questions but let me see if I can keep it short.

First, let's see if I understand that PR properly:

  • The main motivation of that PR is to move shared sql behavior to helper gems (like the title said), mainly with the obfuscation feature. In that PR, it's extracted as opentelemetry-helpers-sql-obfuscation and both MySQL and Postgres (potentially more moving forward?) are using it to obfuscate SQL statements db.statement
    • This is a good refactoring, though I think this part is unrelated to my PR
  • In addition to the opentelemetry-helpers-sql-obfuscation helper gem, that PR also adds opentelemetry-helpers-mysql, which is used by two MySQL clients: mysql2 and trilogy
    • This helper contains the method somewhat related to my PR, OpenTelemetry::Helpers::MySQL.extract_statement_type
    • This extract_statement_type method takes a query and returns the statement type such as select or update
      • @fbogsany has feedback for the logic of this method (more accurately, the regexp the method is using) as described in this and this PR comments

Now, coming back to my PR and the current implementation of Postgres client pg instrumentation:

  • With pg instrumentation, currently it's using sql.to_s.split[0].to_s.upcase to get a statement type (aka db.operation)
    • This is different from what MySQL is doing, where searching a query against the list of "query names"
    • Postgres also has the list of query names-ish, SQL_COMMANDS, and the statement type (operation) will only be used as a part of span name when it's in SQL_COMMANDS (otherwise ignored)

From here, my questions would be:

  • Regarding the improvement of OpenTelemetry::Helpers::MySQL.extract_statement_type method (QUERY_NAME_REGEX) in #529 (to support query comment including query names), I'm happy to work on it. However, that PR is being worked on by @kaylareopelle and I'm unsure if I could step on her toe.
  • #529 doesn't actually fix the issue #452 for Postgres, but we could use a similar logic MySQL is using to get a statement type with Postgres too. In that case, one option is to update #529 to have it, and the other option is to change this PR to have it. Former has the same concern as above, but what would work better for the community?

I'm sorry for many questions and a long comment! 🙈

keiko713 avatar Oct 18 '23 06:10 keiko713

👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this pull request will be closed eventually by the stale bot

github-actions[bot] avatar Nov 18 '23 01:11 github-actions[bot]

Hi @keiko713, I'm so sorry about the delay in responding to this PR. Thanks for your patience, your great questions, and this awesome fix.

Though the work is thematically similar to #529, as you correctly stated, the OpenTelemetry::Helpers::MySQL.extract_statement_type only applies to MySQL adapters. In my opinion, this PR should not be blocked by #529 and can be merged straightaway.

The MySQL helpers gem created in 529 is intended to reduce duplication between the trilogy and mysql2 instrumentation. Since we only instrument one Postgres gem, pg, we don't need to abstract this code out into a separate gem, nor should we try to combine it with the MySQL helpers gem.

I've also added functionality to #529 to make MySQL statement extraction respect Active Record Query Logs, so both MySQL and Postgres instrumentation via OpenTelemetry should now be compatible with this feature.

kaylareopelle avatar Nov 20 '23 19:11 kaylareopelle

I don't really understand this need. Why would you enable this comment in app to remove it later during instrumentation? :thinking:

simi avatar Nov 20 '23 23:11 simi

👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this pull request will be closed eventually by the stale bot

github-actions[bot] avatar Dec 22 '23 01:12 github-actions[bot]

I don't really understand this need. Why would you enable this comment in app to remove it later during instrumentation? 🤔

@simi, the way I understand it, the comment isn't exactly being removed by the instrumentation.

Instead, the comment is ignored when trying to identify what statement type is in the query (ex. SELECT). If the comment is not ignored, the wrong statement type will be chosen, making the span name less precise.

The prepended comment will still be included in the statement if the user configured it to be included by using the default :db_statement or configuring :db_statement to :include.

kaylareopelle avatar Jan 02 '24 20:01 kaylareopelle

@kaylareopelle ahh, thanks for explanation. Now it makes sense. I'm ok to move this forward. Once there will be universal SQL helpers gem, this can be moved in there.

simi avatar Jan 11 '24 04:01 simi

👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this pull request will be closed eventually by the stale bot

github-actions[bot] avatar Mar 10 '24 01:03 github-actions[bot]

@keiko713 - I think the only thing holding this PR back is to move the comment_regex to a constant.

Once that's fixed, we should be clear to merge. Thanks for your contribution and for your patience with the review process!

kaylareopelle avatar Mar 12 '24 23:03 kaylareopelle

👋 This pull request has been marked as stale because it has been open with no activity. You can: comment on the issue or remove the stale label to hold stale off for a while, add the keep label to hold stale off permanently, or do nothing. If you do nothing this pull request will be closed eventually by the stale bot

github-actions[bot] avatar May 02 '24 01:05 github-actions[bot]