cockroach
cockroach copied to clipboard
sql: correct error when logging timed out and cancelled queries
Note to reviewers: only the latest commit should be reviewed in this PR.
Setting a query timeout or cancellation error occurs in execStmtInOpenState, while logging occurs in dispatchToExecutionEngine (called on by the former). This previously resulted in the incorrect pg error code being surfaced when logging cancelled queries. This commit moves statement logging from dispatchToExecutionEngine to execStmtInOpenState so that we have the correct error information at the time of logging.
Preserving pausible portal logging behaviour:
There are some nuances in logging when we have a pausable portal that
must be preserved. For pausible portals, logging was triggered as a cleanup
action in execStmtInOpenState by appending it to a list of cleanup fns
in dispatch. This is now accomplished by performing logging in the
processCleanupFunc helper within execStmtInOpenState.
Epic: none Fixes: https://github.com/cockroachdb/cockroach/issues/122722
Release note (bug fix): Fixes a bug in logging where error code was misreported for cancelled queries. Affected channels: SQL_PERF (slow query logs), SQL_EXEC (sql exec logs).
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
pkg/sql/conn_executor_exec.go line 853 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
on the main issue, i had a similar instinct: https://github.com/cockroachdb/cockroach/issues/122722#issuecomment-2073150240
i feel like it would be less disruptive, which is especially relevant here since we'd want this fix backported. (so it should be the smallest diff we can achieve.) i'm happy to discuss the challenges with that here or in person.
@rafiss Ah yeah I ended up doing both here (moving the logging to execStmt and also moving the error setting logic). Logging already existed in execStmtInOpenState for statements that don't get dispatched to the execution engine. I was not sure if the order of operations in execStmtInOpenState was significant w.r.t logging so I kept logging where it currently was and moved the error writing there. Perhaps moving the logging to happen after query cancellation is innocuous though, and then we can keep the error writing logic as is. I'm going to try that and see if anything breaks :)
pkg/sql/conn_executor_exec.go line 853 at r4 (raw file):
Previously, xinhaoz (Xin Hao Zhang) wrote…
@rafiss Ah yeah I ended up doing both here (moving the logging to execStmt and also moving the error setting logic). Logging already existed in execStmtInOpenState for statements that don't get dispatched to the execution engine. I was not sure if the order of operations in execStmtInOpenState was significant w.r.t logging so I kept logging where it currently was and moved the error writing there. Perhaps moving the logging to happen after query cancellation is innocuous though, and then we can keep the error writing logic as is. I'm going to try that and see if anything breaks :)
There are a few things that make moving logging within execStmtInOpenState to occur after the error setting more difficult than the current solution of moving the error setting to the logging.
Currently the order of operations is (omitting ops irrelevant to this situation): log, finish instrumentation helper (which finishes the tracing span used for some logging operations), set the query cancellation and timeout errors + cancel the query (these last 2 things happen in the same deferred function).
The order above needs to be preserved as far as I've been able to tell since logging relies on the I.H. not being cleaned up, and parts of the instrumentation helper cleanup rely on the query's context still being active.
It seems to me like it's easier to move the error setting logic, but I might be missing some context - @rafiss is there a specific reason it needs to occur right before query cancellation? I also noticed setting a query timeout error is not in the processCleanupFn for pausible portals - what's important there in terms of the order of operations that need to be preserved for pausible portals? I've currently moved it to be within the processCleanupFn for logging but can mirror the old behaviour with some additional comments if given the context.
@rafiss @dhartunian This is ready for another look. I kept the logging where it is for the reasons above. Ended up extracting the error handling stuff into its own function, trying to preserve all of the existing logic but just moving it to happen before logging in execStmtInOpenState. LMK if you have any questions want to talk about this further in person.
TFTR!! bors r+
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
- The backport branch might have already existed.
- There was a merge conflict.
- The backport branch contained merge commits.
You might need to create your backport manually using the backport tool.
error creating merge commit from eac197136e3b75b18cd7739c9c6bccbc18d2a46d to blathers/backport-release-23.1-124371: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 23.1.x failed. See errors above.
error creating merge commit from eac197136e3b75b18cd7739c9c6bccbc18d2a46d to blathers/backport-release-23.2-124371: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 23.2.x failed. See errors above.
error creating merge commit from eac197136e3b75b18cd7739c9c6bccbc18d2a46d to blathers/backport-release-24.1-124371: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 24.1.x failed. See errors above.
:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Will this be getting a backport?