Sqlite explain graph
Improve diagnostic log message of sqlite explain processing to illustrate HOW the explain planner arrives to the output data types it finds. Then add a few simple fixes found with the improved diagnostic view.
Example command to get the dot format graphviz:
RUST_LOG="sqlx::explain=trace" DATABASE_URL=sqlite://tests/sqlite/sqlite.db cargo test --features sqlite,runtime-tokio,tls-rustls,chrono -- --nocapture update
Then the graphviz markup can be copy-pasted into the graphviz renderer/editor of choice. The process isn't elegant, but it works.
Warning: this only actually works when running a program which directly uses the 'describe' operation (such as sqlx's sqlite unit tests). Logger output is not available when compiling a library USING sqlx (the output is ignored - probably by rustc itself?).
Does your PR solve an issue?
Fixes test cases derived from #2939 & #1923 (which involves breaking changes)
Note: do not merge until benchmarked. I noticed the sqlite describe 'recursive' unit test taking a bit of time, so it's quite possible this branch has a performance regression.
@tyrelr you should be able to convert this to a draft PR so we don't merge it by accident.
It does indeed slow things down. I'll need to profile it at some point when I have time see where it was (hoping that it's not the tracking of de-duplicated branches).
group 02f196b4 4f04859e
----- -------- --------
insert/fks 1.00 386.3±16.65µs ? ?/sec 1.04 401.1±8.99µs ? ?/sec
insert/returning 1.00 377.1±15.80µs ? ?/sec 1.08 408.9±4.65µs ? ?/sec
select/recursive 1.00 9.7±0.51ms ? ?/sec 2.17 21.0±0.11ms ? ?/sec
select/trivial 1.00 227.5±11.83µs ? ?/sec 1.00 227.9±10.64µs ? ?/sec
The slow down was from data captured by the logger even when logging was disabled. Fixing that, the performance is comparable to before when not logging (and arguably improved).
group 02f196b4 4f04859e 77d4f88a
----- -------- -------- --------
insert/fks 1.04 386.3±16.65µs ? ?/sec 1.08 401.1±8.99µs ? ?/sec 1.00 370.4±8.18µs ? ?/sec
insert/returning 1.05 377.1±15.80µs ? ?/sec 1.14 408.9±4.65µs ? ?/sec 1.00 357.8±5.12µs ? ?/sec
select/recursive 1.36 9.7±0.51ms ? ?/sec 2.95 21.0±0.11ms ? ?/sec 1.00 7.1±0.06ms ? ?/sec
select/trivial 1.02 227.5±11.83µs ? ?/sec 1.02 227.9±10.64µs ? ?/sec 1.00 224.0±10.42µs ? ?/sec
Performance is comparable after some code cleanup
group 02f196b4 98bc9076
----- -------- --------
insert/fks 1.01 372.6±4.78µs ? ?/sec 1.00 369.3±5.90µs ? ?/sec
insert/returning 1.02 362.1±5.47µs ? ?/sec 1.00 353.8±3.54µs ? ?/sec
select/recursive 1.29 9.3±0.08ms ? ?/sec 1.00 7.2±0.02ms ? ?/sec
select/trivial 1.00 227.7±9.29µs ? ?/sec 1.00 227.1±10.82µs ? ?/sec
This is really cool, but does the graph generation always run? Can it be made optional, maybe with a specific describe*() variant, or maybe checking if the sqlx::explain span is enabled at trace? Or maybe instead of logging the graphviz output, it could save it to a file with a path given by an environment variable?
Fixes test cases derived from https://github.com/launchbadge/sqlx/issues/2939 & https://github.com/launchbadge/sqlx/issues/1923 (which involves breaking changes)
So to clarify this is a breaking PR?
This is really cool, but does the graph generation always run?
The generation is tied to the logging level of "sqlx::explain". So both the capture of the data it needs, and the calculations/formatting of the output are behind log_level checks.
Unfortunately, based on my recollection it IS possible to 'enable logging' during compilation time... using RUST_LOG="sqlx::explain=trace" (or RUST_LOG="sqlx=trace", or RUST_LOG="trace") makes the code TRY to log, but I believe rustc discards STDERR/STDOUT from proc macro's, so enabling logging doesn't actually mean you get to SEE the logs...
Can it be made optional, maybe with a specific
describe*()variant, or maybe checking if thesqlx::explainspan is enabled attrace?
That makes sense, I've changed it to require trace instead of the statement log level.
Logging is a very poor location for this. The primary situation this graph will be useful is when a query result type doesn't match developer expectations, and their crate is now failing compilation. As-is, that means either stubbing out a bunch of code, or creating a new crate using exactly the same sqlx version, query, and database to call describe with trace logging enabled...
Or maybe instead of logging the graphviz output, it could save it to a file with a path given by an environment variable?
A single file-path might get weird, since a single compilation will hit hundreds of queries.
But long-term, I would like to store this 'query result type evidence' alongside the offline query results. Then enhance sqlx-cli with an 'inspect' command to show types and the evidence. Of course, I have NO IDEA if any other database types have any helpful evidence information they might want to provide. But that's a huge change, so just tweaking the logs was an attempt to avoid boiling the ocean.
So to clarify this is a breaking PR?
Yes, the PR as a whole is a breaking change. With a bit of effort, it could be split into 2 PR's to land the strictly-logging-related stuff without breaking changes, and the analysis fixes with breaking changes. But in my mind the features is so niche & relevant to query analysis correctness, that it made sense to just leave them tied together. Especially given the size of the change 'just to get diagnostics'