release-22.1: opt: fix node-crashing panics with correlated With exprs
Backport 2/2 commits from #88396.
/cc @cockroachdb/release
opt: prevent apply-join panics from crashing nodes
Previously, it was possible for the execution of an apply-join to crash
a node due to an uncaught optimizer panic when calling the
planRightSideFn closure. This closure is invoked for every input row
to the apply-join. It replaces variables in the expression on the right
side of the join with constants using Factory.CopyAndReplace, which
can panic. The panic-catching logic in Optimizer.Optimize does not catch
this panic because the closure is invoked during execution, outside the
context of Optimizer.Optimize.
This commit copies the panic-catching logic of Optimizer.Optimize to
the apply-join's planRightSideFn closure to ensure that any panics are
caught.
Release note (bug fix): A bug has been fixed that could cause nodes to crash in rare cases when executing apply-joins in query plans.
opt: fix transitive references to With exprs in RHS of apply-join
This commit fixes an error that could occur when the RHS of an apply-join referenced a With expression transitively through another With expression. The optimizer could not access the relational properties of the transitively referenced With expression because that expression had not been added to the metadata. The commit fixes the issue by adding all With expressions to the metadata if any With expressions are referenced.
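The following toy model illustrates why adding only the directly referenced With expressions is not enough. Every name in it (`WithID`, `withExpr`, `metadata`, `resolve`, `buildMetadata`) is hypothetical and does not mirror the optimizer's real types; it also assumes that references between With expressions are acyclic.

```go
package main

import "fmt"

// WithID identifies a With expression (hypothetical type).
type WithID int

// withExpr records which other With expressions its body references.
type withExpr struct {
	refs []WithID
}

// metadata stands in for the per-query metadata consulted when looking
// up the properties of a With expression.
type metadata map[WithID]withExpr

// resolve checks that id, and everything it references transitively,
// can be found in md.
func resolve(md metadata, id WithID) error {
	w, ok := md[id]
	if !ok {
		return fmt.Errorf("with expression %d not found in metadata", id)
	}
	for _, ref := range w.refs {
		if err := resolve(md, ref); err != nil {
			return err
		}
	}
	return nil
}

// buildMetadata builds fresh metadata for the right side of the join.
// With addAll unset it copies only the directly referenced With
// expressions; with addAll set it copies every With expression as soon
// as at least one is referenced.
func buildMetadata(all metadata, direct []WithID, addAll bool) metadata {
	md := metadata{}
	if len(direct) == 0 {
		return md
	}
	if addAll {
		for id, w := range all {
			md[id] = w
		}
		return md
	}
	for _, id := range direct {
		md[id] = all[id]
	}
	return md
}

func main() {
	// With 2 references With 1; the right side references only With 2.
	all := metadata{1: {}, 2: {refs: []WithID{1}}}
	direct := []WithID{2}

	fmt.Println(resolve(buildMetadata(all, direct, false), 2)) // lookup of With 1 fails
	fmt.Println(resolve(buildMetadata(all, direct, true), 2))  // succeeds
}
```

Copying everything is the simplest way to guarantee that transitive references resolve, at the cost of occasionally adding With expressions the right side never uses.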
Fixes #87733
Release note (bug fix): A bug has been fixed that caused errors in rare cases when executing queries with correlated WITH expressions. This bug was present since correlated WITH expressions were introduced in v21.2.0.
Release justification: This fixes a rare bug that can crash nodes.
Thanks for opening a backport.
Please check the backport criteria before merging:
- [ ] Patches should only be created for serious issues or test-only changes.
- [ ] Patches should not break backwards-compatibility.
- [ ] Patches should change as little code as possible.
- [ ] Patches should not change on-disk formats or node communication protocols.
- [ ] Patches should not add new functionality.
- [ ] Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
- [ ] There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
- [ ] The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
- [ ] New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
- [ ] The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.
Add a brief release justification to the body of your PR to justify this backport.
Some other things to consider:
- What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
- Will this work in a cluster of mixed patch versions? Did we test that?
- If a user upgrades a patch version, uses this feature, and then downgrades, what happens?
TFTRs!