coq
coq copied to clipboard
[dune] [test-suite] Port the test suite to Dune
This is an in-progress PR to port the test suite to Dune. Main design principles are:
- use the standard
runtestinfrastructure, and declare all the dependencies correctly so the tests can be run incrementally - as we did for
coq_dune, we have a small OCaml program to generate the rules, which we promote. At some point Dune will provide a better workflow for this.
TODO:
- [x] parsing of headers with specific options
- [ ] cram tests, (probably don't need any just yet)
- [ ] log aggregation / generation / report
- [ ] documentation
- [ ] linting files (are all test files touched?)
- [x] fix running test suite from clean build
Finding the META file is straight up broken without a full build (How can we depend on this?)Currently many important deps are put in the rule generation rule. The test-suite should run independently of that, so the correct plugins need to be sprinkled as dependencies in various areas.Output coqtop complains that the stdlib isn't available even though the -boot flag is passed.
- [x] makefile
- Makefile has test-suite target (together with test alias)
make testwill start building all the tests- [x] promote target for updating all tests?
Note that this is a work-in-progress and significant changes and refactoring are expected, submitting for review and coordination.
Closes #15186 , closes #13328
Bugs to review: #6118 #5990 #5668
Test suite:
- [x] bugs
- [x] complexity
- [x] coqchk
- [x] coqdoc
- [x] coq-makefile
- [x] coqwc
- [x] failure
- [x] ide
- [x] interactive
- [x] ltac2
- [x] micromega
- [x] misc
- [x] modules
- [x] output
- [x] mostly done
- [ ] Fix Load.v which seems to be a coqdep bug
- [x] output-coqchk
- [x] output-coqtop
- [x] output-failure
- [x] output-modulo-time
- [x] prerequisite
- [x] primitive
- [x] ssr
- [x] stm
- [x] success
- [x] tools
- [x] unit-tests
- [x] mostly done
- [ ] Fix proof_diffs_test - Something dramatic has changed with the tokenization
- [x] vio
@jfehrle I see the proof_diffs unit tests are failing, do you have an idea what's going on?
cram tests?
These would be quite handy to understand & document the quirks of coqdoc, coqdep, coqc, etc.
I see the proof_diffs unit tests are failing, do you have an idea what's going on?
What are you looking at? I don't see evidence of such a problem in test-suite:base. That step gets an unrelated fatal error.
If you see it again, would you do a run that preserves the test output?
The only thing I can think of that's distinctive about the diff tests is that some of them check for the color highlighting in the output, which includes escape characters.
What are you looking at? I don't see evidence of such a problem in
test-suite:base. That step gets an unrelated fatal error.
Likely because the ounit setup is buggy , even if the proof_diff test fails, it will exit with 0 O_O , quite a bug I'd say?
We have reproduced the failures independently:
TEST SUCCEEDED: diff_str empty
TEST SUCCEEDED: diff_str white space
TEST SUCCEEDED: diff_str add/remove
TEST FAILED: tokenize_string/diff_mode in lexer ('()' is a single token
expected: true but got: false)
TEST SUCCEEDED: shorten_diff_span failure from #8922
TEST SUCCEEDED: diff_pp/add_diff_tags add/remove
TEST SUCCEEDED: diff_pp/add_diff_tags a span with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags diff tags outside existing tags
TEST SUCCEEDED: diff_pp/add_diff_tags existing tagged values with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags multiple tokens in pp
TEST SUCCEEDED: diff_pp/add_diff_tags token spanning multiple Ppcmd_strs
TEST SUCCEEDED: diff_pp/add_diff_tags empty string preserved
TEST SUCCEEDED: diff_hyps simple diffs
TEST SUCCEEDED: diff_hyps compacted
TEST SUCCEEDED: diff_hyps compacted with join
TEST SUCCEEDED: diff_hyps compacted with split
*** Ran 16 tests, with 15 successes and 1 failures ***
==========> FAILURE <==========
test-suite/unit-tests/printing/proof_diffs_test.ml...Error!
The test works correctly when I run it from master locally using ounit2 2.2.3 (reported by opam) and ocaml 4.0.6 with make. When I force a test in that file to fail, it is reported correctly. For the specific failure you see, a print statement shows that () is indeed a single token. Very curious.
assert_equal ~msg:"FORCE FAILURE" true false;
List.iter (fun s ->
Printf.printf "s is `%s`\n" s;
assert_equal ~msg:("'" ^ s ^ "' is a single token") ~printer:string_of_bool true (List.mem s toks))
[ "(*"; "()"; ":="]
TEST FAILED: diff_str empty (FORCE FAILURE
not equal)
TEST SUCCEEDED: diff_str white space
TEST SUCCEEDED: diff_str add/remove
TEST SUCCEEDED: tokenize_string/diff_mode in lexer
TEST SUCCEEDED: shorten_diff_span failure from #8922
TEST SUCCEEDED: diff_pp/add_diff_tags add/remove
TEST SUCCEEDED: diff_pp/add_diff_tags a span with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags diff tags outside existing tags
TEST SUCCEEDED: diff_pp/add_diff_tags existing tagged values with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags multiple tokens in pp
TEST SUCCEEDED: diff_pp/add_diff_tags token spanning multiple Ppcmd_strs
TEST SUCCEEDED: diff_pp/add_diff_tags empty string preserved
TEST SUCCEEDED: diff_hyps simple diffs
TEST SUCCEEDED: diff_hyps compacted
TEST SUCCEEDED: diff_hyps compacted with join
TEST SUCCEEDED: diff_hyps compacted with split
*** Ran 16 tests, with 15 successes and 1 failures ***
==========> FAILURE <==========
unit-tests/printing/proof_diffs_test.ml...Error!
/home/proj/coq/test-suite$ make -j 12
TEST unit-tests/printing/proof_diffs_test.ml
findlib: [WARNING] Package coq has multiple definitions in /home/proj/coq/META.coq, /home/jfehrle/.opam/4.06.0/lib/coq/META
FAILED unit-tests/printing/proof_diffs_test.ml.log
s is `(*`
s is `()`
s is `:=`
make report
make[1]: Entering directory '/home/proj/coq/test-suite'
BUILDING SUMMARY FILE
FAILURES
misc/printers.sh...Error!
unit-tests/printing/proof_diffs_test.ml...Error!
To print details of failed tests, rerun with environment variable PRINT_LOGS=1
eg "make report PRINT_LOGS=1" from the test suite directory"
See README.md in the test suite directory for more information.
Makefile:216: recipe for target 'report' failed
make[1]: *** [report] Error 1
make[1]: Leaving directory '/home/proj/coq/test-suite'
Makefile:135: recipe for target 'all' failed
make: *** [all] Error 2
Very interesting, could be due to the OCaml version? We'll research more.
The "needs: rebase" label was set more than 30 days ago. If the PR is not rebased in 30 days, it will be automatically closed.
This is not looking bad, current failures:
coqc test-suite/modules/PO.v.log (exit 1)
(cd _build/default/test-suite/modules && ../../../install/default/bin/coqc -coqlib ../.. -R ../prerequisite TestSuite PO.v &> _build/default/test-suite/modules/PO.v.log
coqc test-suite/modules/pliczek.v.log (exit 1)
(cd _build/default/test-suite/modules && ../../../install/default/bin/coqc -coqlib ../.. -R ../prerequisite TestSuite pliczek.v &> _build/default/test-suite/modules/pliczek.v.log
coqc test-suite/bugs/opened/bug_3889.v.log (exit 129)
(cd _build/default/test-suite/bugs/opened && ../../../../install/default/bin/coqc -coqlib ../../.. -R ../../prerequisite TestSuite bug_3889.v &> _build/default/test-suite/bugs/opened/bug_3889.v.log
Interesting.
Also, the micromega tests are not compatible with the dune cache, as it forbids .csdp.cache to be writable. That's a minor problem I think.
We need to port some tests to the cram stuff tho.
- 3389 reports an anomaly which we don't accept
- pliczek and PO.v have some special dep stuff
What to do with this PR is a good question, generating the dune file will be supported in dune 3.0, and in fact we can do better as we can delegate to dune running coqdep [which will make generation much faster], on the other hand this is quite useful already, but requires manual syncing of the generated files; maybe I will add temporary bootstrap until we can use dune 3.0
I'm afraid I got very unclear feedback on what should it take to get this PR merged.
I'm thus halting work here until someone volunteers to shepherd this.
My suggestion in the Coq Call was to get this as complete as possible, add cheat-sheet level documentation and then ask people to try it before its merged. My concern is that the PR will remove the current test-suite support and that omissions, bugs or lack of documentation in the replacement may significantly disrupt other developers.
The "needs: rebase" label was set more than 30 days ago. If the PR is not rebased in 30 days, it will be automatically closed.
Why do we have to generate the rules ourselves? This seems like a waste of time. Wouldn't it be better to just push for the feature of dune that allows us to do incremental tests (for coq files too)?
Can we explicitly prevent coqdep from being called here? I don't know how easy it would be to get coqdep to accept this. cc @SkySkimmer who wrote the test AFAIK.
I guess we should fix coqdep, for now we can special case it indeed.
Why do we have to generate the rules ourselves? This seems like a waste of time. Wouldn't it be better to just push for the feature of dune that allows us to do incremental tests (for coq files too)?
Can you say more? We do need to generate the rules because otherwise how would Dune know how to do the tests?
This PR was not rebased after 30 days despite the warning, it is now closed.
@ejgallego Please resurrect this.
I just submitted a big big deadline but have another large one in the horizon, can't promise anything after the 17th , feel free to take over, actually it is not so far from being ready.
This PR was not rebased after 30 days despite the warning, it is now closed.
Coqbot closed this again because it still has the "Stale" label. Clear that and it's good for 30 days.
Or rebase, of course.
I am currently rebasing, but it is not so easy, there are a lot of big changes that happened in the mean time. I am optimistic though.
@Alizter I had started a rebase too before I saw your message, let me see if I have more luck.
Be careful if you push to the branch, as we need to reopen this first, then push, otherwise github famously won't let us re-open.
Ok, there are a couple of problems, but this seems to be OK for now.
@ejgallego OK thanks. How do I fix the META stuff? I ended up changing back to "ltac_plugin" in my own attempts.
otherwise github famously won't let us re-open.
I hope someone has reported this as a GitHub issue. It seems like something that's both impactful and likely easy to fix.
It's not the end of the world to force push to a closed PR, there are workarounds: https://gist.github.com/robertpainsi/2c42c15f1ce6dab03a0675348edd4e2c GItHub just checks the commit hash from when the PR was closed.
It's not the end of the world to force push to a closed PR, there are workarounds
It's great that there is a workaround, but if no one reports a problem then it will never get better. I make it a point to report bugs in any software I use. And I find simple problems are addressed fairly often. For example, several bugs I reported on Android (on my phone) were fixed, though it's surprising that any obvious bugs remain in a product that's used by 100s of millions of users.
@jfehrle Sorry I should have also mentioned the opened issue: https://github.com/isaacs/github/issues/361. That issue has been open for 7 years and receives a lot of attention. It seems that the developers just don't care to fix this soon. So the workaround is the next best solution.
The job library:ci-fiat_crypto_legacy has failed in allow failure mode ping @JasonGross