calcite icon indicating copy to clipboard operation
calcite copied to clipboard

[CALCITE-3913] Test correctness using formal verification (Qi Zhou)

Open qizhou92 opened this issue 5 years ago • 9 comments

Add dependency for z3, add two simple tests to test if native library work. I am new to this process, and I am not quite sure how much code I should push in commit to reviewing. Welcome any feedback or propose, given the relatively large number of codes that I need to implement this function. The instruction for building a native library for z3 can be found here: https://github.com/Z3Prover/z3. This instruction should be easy to follow. Please using the z3-4.41 version to build the native library. Add symbolic columns, and convert simple rex node to symbolic columns

qizhou92 avatar Oct 29 '20 23:10 qizhou92

I guess maybe @vlsi can give a good review ~

danny0405 avatar Oct 30 '20 03:10 danny0405

@qizhou92 , thanks for the PR.

I am not quite sure how much code I should push in commit to reviewing

That differs on a case-by-case basis.

For z3 integration, I would suggest having at least one Calcite-specific test implemented. In other words, currently, all the added z3 tests verify only that z3 loads.

It would be nice if you could add a test that verifies if the two RexNodes are equivalent or not (or whatever was your intention with z3).

vlsi avatar Oct 30 '20 09:10 vlsi

Thank you @vlsi for your review. I update the code based on your comments. I move the Smtlib.test file out of the verifier folder so that I can reuse some of the testing infra there. I create two test cases to checking if two RexNodes are equivalent. Do you think it is the right path?

qizhou92 avatar Nov 02 '20 22:11 qizhou92

@qizhou92 , this looks nice. Have you seen RexFuzzer, RexProgramFuzzyTest, org.apache.calcite.rex.RexSimplify#verify ?

I wonder if z3 verifications could be integrated to RexSimplify#verify. Of course, it would be limited to int or something, but it still might be valuable.

An alternative option is to add a dedicated case to RexProgramFuzzyTest which would generate int-only randomized expressions (e.g. via RexFuzzer), and verify if RexSimplfy does not corrupt them.

vlsi avatar Nov 02 '20 23:11 vlsi

@vlsi , Yes, it can be used for RexSimplify#verify as the first use cases. I will add more codes to support SPJ RelNode, aggregate node, union node, outer join node, step by step.

qizhou92 avatar Nov 03 '20 00:11 qizhou92

Hi, I implement the function that verifies the equivalence of two rex node, including numerical value, and a boolean value. I use some test cases previously in RexProgromTest to verify the correctness of RexSimplify. I can keep working support more constructors such as CASE, or more types, such as DATE. another question I have is I found there is a class is called RexImplicationChecker, Z3 can do a better job to check the implication between two RexNode. For example, the current RexImplicationChecker seems cannot verify that Not (a <=30 ) ==> a > 10. I am wondering If that would be a good use for z3.

qizhou92 avatar Nov 12 '20 21:11 qizhou92

I guess we want to keep RexImplicationChecker, so we don't want to use slow approaches like z3 there. On the other hand, z3 might be helpful for testing RexImplicationChecker.

vlsi avatar Nov 12 '20 21:11 vlsi

Thank you ! It might be a stupid question. but what is the time requirement in general for checking RexNode implication, like ~10 milliseconds? ~1 milliseconds? ~0.1 milliseconds? Just want to get a sense.

qizhou92 avatar Nov 13 '20 02:11 qizhou92

but what is the time requirement in general for checking RexNode implication, like ~10 milliseconds? ~1 milliseconds? ~0.1 milliseconds? Just want to get a sense.

The expectation is that both RexSimplify and RexImplicationChecker have linear complexity with respect to the input subtree size (e.g. the number of nodes).

vlsi avatar Nov 13 '20 10:11 vlsi