opentelemetry-js-contrib
opentelemetry-js-contrib copied to clipboard
perf(instrumentation-pg): do not split query to determine operation name
Which problem is this PR solving?
OTEL allocates too much
Short description of the changes
To determine span name from query text currently query is split by whitespace. With typical SQL statement being many tens of words long this is wasteful. Replaced the logic with finding the index of first whitespace. This can be optimize further to avoid double slicing for COMMIT case, but this turns out to have a lot of indexing and conditions gymnastics
curious, do we care about the case WITH ...? Because the name of the operation would be just WITH, which doesn't seem very useful
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 90.38%. Comparing base (
dfb2dff) to head (81dc59d). Report is 135 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #2029 +/- ##
==========================================
- Coverage 90.97% 90.38% -0.60%
==========================================
Files 146 147 +1
Lines 7492 7506 +14
Branches 1502 1573 +71
==========================================
- Hits 6816 6784 -32
- Misses 676 722 +46
| Files | Coverage Δ | |
|---|---|---|
| ...node/opentelemetry-instrumentation-pg/src/utils.ts | 97.22% <100.00%> (+0.25%) |
:arrow_up: |
curious, do we care about the case
WITH ...? Because the name of the operation would be justWITH, which doesn't seem very useful
I personally do not find all this logic and span names very helpful in my work, I will create issue, seems better to discuss it via that
This issue is discussed in semantic conventions for database spans here:
The span name SHOULD be set to a low cardinality value representing the statement executed on the database. It MAY be a stored procedure name (without arguments), DB statement without variable arguments, operation name, etc. Since SQL statements may have very high cardinality even without arguments, SQL spans SHOULD be named the following way, unless the statement is known to be of low cardinality: <db.operation.name> <db.namespace>.<db.collection.name>, provided that db.operation.name and db.collection.name are available. If db.collection.name is not available due to its semantics, the span SHOULD be named <db.operation.name> <db.namespace>.
It is not recommended to attempt any client-side parsing of db.query.text just to get these properties, they should only be used if the library being instrumented already provides them. When it's otherwise impossible to get any meaningful span name, db.namespace or the tech-specific database name MAY be used.
So I support removing any parsing of the query to be compliant with spec and to not introduce any extensive processing.
Perhaps the span name should be in this case the dbName?