velox

Add ability to verify expression fuzzer runs on a subset of rows

Open bikramSingh91 opened this issue 4 months ago • 2 comments

Summary: Currently, the expression fuzzer has a phase where it re-runs the rows that did not throw an error, to ensure evaluation is consistent for them. To achieve this, it wraps the inputs in a dictionary that points only to that subset of rows. This changes the encoding of the inputs, which can cause different eval paths to be taken between phases. To address this and ensure the same paths are taken in each evaluation phase, this change introduces the ability for the expression verifier to verify only a subset of the input rows. The aforementioned fuzzer phase can now specify just the non-error rows and keep the original inputs unchanged.
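To illustrate the difference between the two approaches, here is a minimal, self-contained sketch. It does not use the Velox API: `ToyVector`, `wrapInDictionary`, and `evalPlusOne` are hypothetical stand-ins invented for this example, and the "eval path" is modeled simply as a string recording the input encoding. The point is only that wrapping in a dictionary changes the encoding the evaluator sees, while passing a row selection does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT Velox types.
enum class Encoding { kFlat, kDictionary };

struct ToyVector {
  Encoding encoding;
  std::vector<int> values;           // base values
  std::vector<std::size_t> indices;  // used only when dictionary-encoded

  std::size_t size() const {
    return encoding == Encoding::kFlat ? values.size() : indices.size();
  }

  int valueAt(std::size_t row) const {
    return encoding == Encoding::kFlat ? values[row] : values[indices[row]];
  }
};

// Old approach: wrap the (flat) input in a dictionary that points only to the
// rows we want to re-run. The encoding of the input changes as a side effect.
ToyVector wrapInDictionary(
    const ToyVector& input,
    const std::vector<std::size_t>& rows) {
  assert(input.encoding == Encoding::kFlat);
  return ToyVector{Encoding::kDictionary, input.values, rows};
}

struct EvalResult {
  std::string path;         // which (toy) eval path was taken
  std::vector<int> output;  // one slot per input row; unselected rows stay 0
};

// Toy evaluator for "x + 1". It records the path implied by the input
// encoding and evaluates only the rows marked as selected.
EvalResult evalPlusOne(
    const ToyVector& input,
    const std::vector<bool>& selected) {
  assert(selected.size() == input.size());
  EvalResult result;
  result.path = input.encoding == Encoding::kFlat ? "flat" : "dictionary";
  result.output.assign(input.size(), 0);
  for (std::size_t row = 0; row < input.size(); ++row) {
    if (selected[row]) {
      result.output[row] = input.valueAt(row) + 1;
    }
  }
  return result;
}
```

In this sketch, re-running rows 0 and 2 of a four-row flat input through `wrapInDictionary` makes the evaluator report the `"dictionary"` path, whereas calling `evalPlusOne` on the original input with a selection of `{true, false, true, false}` still reports `"flat"` and produces the same values for the selected rows.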

Follow-up: After this change, it would also be relevant to store the input selectivity vector. A subsequent change will add this ability and make the corresponding changes to the ExpressionRunner.

Differential Revision: D64366745

bikramSingh91, Oct 15 '24 21:10