vitess icon indicating copy to clipboard operation
vitess copied to clipboard

DISTINCT bug

Open systay opened this issue 1 year ago • 1 comments

Description

Adds a failing end to end test

Vitess Results:
[VARCHAR("test") DECIMAL(2)]
[VARCHAR("test") DECIMAL(1.5000)]
[VARCHAR("d") DECIMAL(2.0000)]
[VARCHAR("d") DECIMAL(2)]
[VARCHAR("c3") DECIMAL(4.0000)]
[VARCHAR("c3") DECIMAL(4)]
[VARCHAR("c") DECIMAL(4)]
[VARCHAR("c") DECIMAL(4.0000)]
[VARCHAR("b") DECIMAL(3.0000)]
[VARCHAR("b") DECIMAL(3)]
[VARCHAR("Abc") DECIMAL(2.0000)]
[VARCHAR("Abc") DECIMAL(2)]
[VARCHAR("a") DECIMAL(1.5000)]
[VARCHAR("a") DECIMAL(2)]


MySQL Results:
[VARCHAR("test") DECIMAL(2.0000)]
[VARCHAR("test") DECIMAL(1.5000)]
[VARCHAR("d") DECIMAL(2.0000)]
[VARCHAR("c3") DECIMAL(4.0000)]
[VARCHAR("c") DECIMAL(4.0000)]
[VARCHAR("b") DECIMAL(3.0000)]
[VARCHAR("Abc") DECIMAL(2.0000)]
[VARCHAR("a") DECIMAL(2.0000)]
[VARCHAR("a") DECIMAL(1.5000)]

We are doing DISTINCT on these results, and it seems to me like we don't consider DECIMAL(4.0000) equal to DECIMAL(4).

Related Issue(s)

Checklist

  • [ ] "Backport to:" labels have been added if this change should be back-ported to release branches
  • [ ] If this change is to be back-ported to previous releases, a justification is included in the PR description
  • [ ] Tests were added or are not required
  • [ ] Did the new or modified tests pass consistently locally and on CI?
  • [ ] Documentation was added or is not required

Deployment Notes

systay avatar Feb 23 '24 09:02 systay

Review Checklist

Hello reviewers! :wave: Please follow this checklist when reviewing this Pull Request.

General

  • [ ] Ensure that the Pull Request has a descriptive title.
  • [ ] Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • [ ] Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • [ ] Apply the release notes (needs details) label if users need to know about this change.
  • [ ] New features should be documented.
  • [ ] There should be some code comments as to why things are implemented the way they are.
  • [ ] There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • [ ] Is this flag really necessary?
  • [ ] Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • [ ] Each item in Jobs should be named in order to mark it as required.
  • [ ] If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • [ ] Protobuf changes should be wire-compatible.
  • [ ] Changes to _vt tables and RPCs need to be backward compatible.
  • [ ] RPC changes should be compatible with vitess-operator
  • [ ] If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • [ ] vtctl command output order should be stable and awk-able.

vitess-bot[bot] avatar Feb 23 '24 09:02 vitess-bot[bot]

Needs backporting to 18.0 and 19.0, presumably? Caused by https://github.com/vitessio/vitess/pull/15192?

GrahamCampbell avatar Feb 26 '24 13:02 GrahamCampbell

Are there similar issues with group by, or is this just an issue with distinct?

GrahamCampbell avatar Feb 26 '24 13:02 GrahamCampbell

@systay OK here's a more-or-less working fix for this (which includes several related cleanups, particularly removing evalengine APIs which we don't need anymore). You'll see that some plans have changed: we're not asking for weight_string to MySQL in several union cases. I think that's a good thing but I'd like to review it with you next week.

vmg avatar Feb 28 '24 11:02 vmg

Codecov Report

Attention: Patch coverage is 90.50633% with 15 lines in your changes are missing coverage. Please review.

Project coverage is 65.72%. Comparing base (6c73053) to head (ee0f9b4). Report is 2 commits behind head on main.

Files Patch % Lines
go/vt/vtgate/engine/concatenate.go 93.75% 3 Missing :warning:
go/vt/vtgate/engine/distinct.go 86.36% 3 Missing :warning:
go/vt/vtgate/evalengine/eval_numeric.go 66.66% 2 Missing :warning:
...o/vt/vtgate/planbuilder/operators/union_merging.go 71.42% 2 Missing :warning:
go/vt/vtgate/semantics/table_collector.go 77.77% 2 Missing :warning:
go/vt/vtgate/evalengine/api_type_aggregation.go 95.23% 1 Missing :warning:
go/vt/vtgate/evalengine/compiler.go 95.00% 1 Missing :warning:
go/vt/vtgate/evalengine/eval.go 95.00% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15340      +/-   ##
==========================================
- Coverage   65.72%   65.72%   -0.01%     
==========================================
  Files        1563     1562       -1     
  Lines      194027   193952      -75     
==========================================
- Hits       127529   127468      -61     
+ Misses      66498    66484      -14     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Mar 07 '24 17:03 codecov[bot]

Fixing this bug was way more involved than we initially expected! The PR is green now and the planner changes have been reviewed with @systay. Basically, we were not properly performing type aggregation nor coercion between the two sides of UNION queries. The results for queries like that were only correct when the two sides of the union have the same types (something which is quite common, which is why we hadn't found out yet), but with different types the results are off.

vmg avatar Mar 08 '24 09:03 vmg

Updated title and description with an actual list of changes for this PR. :sparkle:

vmg avatar Mar 08 '24 09:03 vmg

Are there similar issues with group by, or is this just an issue with distinct?

We didn't find any such issues, but you are correct in suspecting DISTINCT. It also needs to compare values. ORDER BY that is performed on the vtgate side would also be exposed to the same underlying problem.

This fix solved the issue in the underlying evalengine component, so it should apply to DISTINCT, GROUP BY and ORDER BY.

systay avatar Mar 08 '24 10:03 systay

@GrahamCampbell :point_up: what @systay said -- sorry I missed your question! The GitHub UI collapsed it. :sweat_smile:

vmg avatar Mar 08 '24 11:03 vmg

Needs backporting to 18.0 and 19.0, presumably? Caused by #15192?

This is not related to those changes and I don't think we need to backport here.

dbussink avatar Mar 08 '24 13:03 dbussink

Aye this is unrelated to any previous issue, and it's not blocking any users or customers afaict. The report was filed by @systay himself and he found the problem by fuzzing.

vmg avatar Mar 08 '24 14:03 vmg