brave icon indicating copy to clipboard operation
brave copied to clipboard

Propagate trace ID to database queries

Open Diagoras opened this issue 5 years ago • 18 comments

I initially opened this issue against Spring Sleuth, but after some discussion we decided it belonged over here instead. Let me know if that's wrong, though.

We've found that when troubleshooting slow queries it can sometimes be difficult to definitively link a given query back to the operation that initiated it. This is especially true for DBAs that troubleshoot problems using a database-first perspective, where they hunt down slow queries using the database's profiling tools instead of application logs.

Do you think it makes sense to prepend the trace ID as a comment on queries? That would allow database queries to be tied back to a trace ID, which in turn would link them back to the operation that produced them. For example:

-- X-B3-TraceId: b763b53c5c6ba343
SELECT * FROM foo WHERE baz = true;

Obviously, if there's a better way to associate metadata with a query than a literal comment then I'd advocate going in that direction. Please let me know what you think, and thanks for maintaining this library.

Diagoras avatar Mar 19 '19 18:03 Diagoras

Brave currently relies on https://github.com/p6spy/p6spy to trace database calls (some discussion #824 took place to see if https://github.com/r2dbc/r2dbc-proxy would be a better fit but that's beside the point). I guess such functionality would belong more there, providing a hook for brave to call and set the trace/span id. https://github.com/p6spy/p6spy/tree/master/src/main/java/com/p6spy/engine/wrapper provides a bunch of wrappers, so it seems doable to modify the statement. Have you checked that prepending a comment is portable across databases and will not break any queries ?

Definitely a nice idea !

jorgheymans avatar Mar 19 '19 19:03 jorgheymans

I would probably advise against using X-B3- as that is a bit too specific to a propagation format which might not be in use. If this is using correlation, I'd suggest traceId and spanId like our MDC integration. On that note, I suppose I agree with the implementation suggestion of @jorgheymans (using a database wrapper)

codefromthecrypt avatar Mar 19 '19 22:03 codefromthecrypt

Have you checked that prepending a comment is portable across databases and will not break any queries ?

I have not, but that's definitely a good idea! My naive view was that the comment syntax was consistent across databases and couldn't break anything. But you're right, that's something we'd want to make sure of before making this change.

Diagoras avatar Mar 20 '19 19:03 Diagoras

one funny thing is that in some areas comments themselves end up as statements (I've seen this in mysql). Which means we have to be careful not to trace the comments :P

At least in the beginning, and possibly forever, I think this should be an opt-in filter based approach so that sites can practice getting it right. It might be small enough to copy/paste until such experience accrues.

codefromthecrypt avatar Mar 21 '19 00:03 codefromthecrypt

Any value in not doing a generic approach like using the comment field, but rather trying to hook into the specific auditing mechanisms of each database type directly ? Just thinking out loud ...

jorgheymans avatar Mar 25 '19 09:03 jorgheymans

@jorgheymans Yeah, that would be awesome! I mentioned in my initial comment that if there was a better way to tie metadata (like a trace ID) to a SQL query then I'm all for going with that approach.

What specific auditing mechanism were you thinking of, here? I'm mostly familiar with Postgres which (AFAIK) doesn't have one, but if others do then that would be a really good integration.

Diagoras avatar Apr 04 '19 16:04 Diagoras

mysql binlog can log sql comment

brothertao avatar Apr 24 '20 14:04 brothertao

@brothertao can you give a java example using mysql interceptor? I think this is a neat idea

codefromthecrypt avatar Apr 24 '20 23:04 codefromthecrypt

@adriancole i use mybatis plugin to interceptor dml sql,add trace id to sql comment u also can intercept sql in jdbc api i think

brothertao avatar Dec 16 '20 04:12 brothertao

if you have a sample project please link it back. It is possible we can do this, but we'd have to choose what to add. Probably traceId=abcd because this decoupled from propagation. Ex in logs we use the traceId key.

I've found something recent about a system who uses yaml for payload because it permits comments. They apparently use comments for traceIds also. I will try to find it.

In any case this seems a correlation feature and without any special field (JDBC wouldn't know but maybe mysql does? no idea), a practice of opt-in on comments might make sense. Probably as you said a filter as people will likely be careful to not add comments to.. comment statements :D

but stepping back, please post an example of what works for you. I think this is basically special-casing what can already be done with filters.

codefromthecrypt avatar Dec 16 '20 04:12 codefromthecrypt

   // add traceid to sql comment
    String sql = boundSql.getSql();

    if (mappedStatement.getSqlCommandType() != SqlCommandType.SELECT) {
      sql = binlogTraceService.generateTraceComment(sql);
    }

brothertao avatar Dec 16 '20 06:12 brothertao

@brothertao what is the literal SQL you would generate.. curious

codefromthecrypt avatar Dec 16 '20 06:12 codefromthecrypt

like this:

/*__trace_start__{"traceId":"159aad5c8bc089","userId":0}__trace_end__*/ insert into 
 msg_log (`id`,`data`) values (?,?)

brothertao avatar Dec 16 '20 08:12 brothertao

thanks @brothertao and how do you use this data on the other side? grep? or something different?

codefromthecrypt avatar Dec 16 '20 08:12 codefromthecrypt

like this:

https://dba.stackexchange.com/questions/101651/show-sql-statements-from-mysql-binlog-gtid

brothertao avatar Dec 16 '20 08:12 brothertao

Some database systems create and cache query plans based on the query matching and it's not certain they're sophisticated enough or willing to take the performance hit to do a semantic match skipping comments that are dynamic. Just something to keep in mind.

alecl avatar Dec 24 '20 13:12 alecl

right, if someone did something like this it would have to be opt-in. even our cassandra propagation is opt-in. https://github.com/openzipkin/brave-cassandra/blob/master/cassandra-driver/src/main/java/brave/cassandra/driver/CassandraClientTracing.java#L140

I'm setting this help-wanted as personally won't be doing this feature

codefromthecrypt avatar Dec 24 '20 23:12 codefromthecrypt

Some database systems create and cache query plans based on the query matching and it's not certain they're sophisticated enough or willing to take the performance hit to do a semantic match skipping comments that are dynamic. Just something to keep in mind.

Just trace update, insert, delete

brothertao avatar Dec 25 '20 02:12 brothertao