lean-mlir

lean-mlir copied to clipboard

Reame
Issues

chore: drop custom bitvector notation

Open alexkeizer opened this issue 3 months ago • 12 comments

This PR removes the custom notation we had for signed & unsigned comparisons of bitvectors.

We still need to regenerate the extracted goals

Aug 29 '25 13:08 alexkeizer

This looks like a great idea. But why do we have so much snakemake here?

Aug 29 '25 13:08 tobiasgrosser

This looks like a great idea. But why do we have so much snakemake here?

Because I picked the wrong place to start my branch, it's now rebased

Aug 29 '25 13:08 alexkeizer

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Aug 29 '25 14:08 github-actions[bot]

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Aug 29 '25 14:08 github-actions[bot]

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Aug 29 '25 14:08 github-actions[bot]

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /code/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Aug 29 '25 14:08 github-actions[bot]

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /code/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Aug 29 '25 14:08 github-actions[bot]

This breaks CI, but is fine for me. As I am currently on leave, maybe @bollu can review this. This patch seems to go in a direction he is also interested in.

Aug 30 '25 18:08 tobiasgrosser

Yes, happy to help here @alexkeizer , as this would help me create cleaner goal states as well

Aug 31 '25 10:08 bollu

@bollu With our fix this is now building properly, could you run the extraction scripts and update the goals files to make the extract goals CI happy?

Sep 01 '25 11:09 alexkeizer

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /home/runner/work/lean-mlir/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Sep 01 '25 12:09 github-actions[bot]

bv_decide solved 0 theorems. bitwuzla solved 0 theorems. bv_decide found 0 counterexamples. bitwuzla found 0 counterexamples. bv_decide only failed on 0 problems. bitwuzla only failed on 0 problems. both bitwuzla and bv_decide failed on 0 problems. In total, bitwuzla saw 0 problems. In total, bv_decide saw 0 problems. ran rg 'LeanSAT provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla provided a counter' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'LeanSAT proved' | wc -l, this file found 0, rg found 0, SUCCESS ran rg 'Bitwuzla proved' | wc -l, this file found 0, rg found 0, SUCCESS The InstCombine benchmark contains 4520 theorems in total. Saved dataframe at: /code/lean-mlir/bv-evaluation/raw-data/InstCombine/instcombine_ceg_data.csv all_files_solved_bitwuzla_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_rw_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_bb_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_sat_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratt_times_stddev avg: nan | stddev: nan all_files_solved_bv_decide_lratc_times_stddev avg: nan | stddev: nan mean of percentage stddev/av: nan%

Sep 01 '25 12:09 github-actions[bot]