coq icon indicating copy to clipboard operation
coq copied to clipboard

[dune] [test-suite] Port the test suite to Dune

Open ejgallego opened this issue 4 years ago • 35 comments

This is an in-progress PR to port the test suite to Dune. Main design principles are:

  • use the standard runtest infrastructure, and declare all the dependencies correctly so the tests can be run incrementally
  • as we did for coq_dune , we have a small OCaml program to generate the rules, which we promote. At some point Dune will provide a better workflow for this.

TODO:

  • [x] parsing of headers with specific options
  • [ ] cram tests, (probably don't need any just yet)
  • [ ] log aggregation / generation / report
  • [ ] documentation
  • [ ] linting files (are all test files touched?)
  • [x] fix running test suite from clean build
    • Finding the META file is straight up broken without a full build (How can we depend on this?)
    • Currently many important deps are put in the rule generation rule. The test-suite should run independently of that, so the correct plugins need to be sprinkled as dependencies in various areas.
    • Output coqtop complains that the stdlib isn't available even though the -boot flag is passed.
  • [x] makefile
    • Makefile has test-suite target (together with test alias)
    • make test will start building all the tests
    • [x] promote target for updating all tests?

Note that this is a work-in-progress and significant changes and refactoring are expected, submitting for review and coordination.

Closes #15186 , closes #13328

Bugs to review: #6118 #5990 #5668

Test suite:

  • [x] bugs
  • [x] complexity
  • [x] coqchk
  • [x] coqdoc
  • [x] coq-makefile
  • [x] coqwc
  • [x] failure
  • [x] ide
  • [x] interactive
  • [x] ltac2
  • [x] micromega
  • [x] misc
  • [x] modules
  • [x] output
    • [x] mostly done
    • [ ] Fix Load.v which seems to be a coqdep bug
  • [x] output-coqchk
  • [x] output-coqtop
  • [x] output-failure
  • [x] output-modulo-time
  • [x] prerequisite
  • [x] primitive
  • [x] ssr
  • [x] stm
  • [x] success
  • [x] tools
  • [x] unit-tests
    • [x] mostly done
    • [ ] Fix proof_diffs_test - Something dramatic has changed with the tokenization
  • [x] vio

ejgallego avatar Nov 12 '20 20:11 ejgallego

@jfehrle I see the proof_diffs unit tests are failing, do you have an idea what's going on?

ejgallego avatar Nov 12 '20 20:11 ejgallego

cram tests?

These would be quite handy to understand & document the quirks of coqdoc, coqdep, coqc, etc.

rgrinberg avatar Nov 12 '20 21:11 rgrinberg

I see the proof_diffs unit tests are failing, do you have an idea what's going on?

What are you looking at? I don't see evidence of such a problem in test-suite:base. That step gets an unrelated fatal error.

If you see it again, would you do a run that preserves the test output?

The only thing I can think of that's distinctive about the diff tests is that some of them check for the color highlighting in the output, which includes escape characters.

jfehrle avatar Nov 13 '20 20:11 jfehrle

What are you looking at? I don't see evidence of such a problem in test-suite:base. That step gets an unrelated fatal error.

Likely because the ounit setup is buggy , even if the proof_diff test fails, it will exit with 0 O_O , quite a bug I'd say?

We have reproduced the failures independently:

TEST SUCCEEDED: diff_str empty
TEST SUCCEEDED: diff_str white space
TEST SUCCEEDED: diff_str add/remove
TEST FAILED: tokenize_string/diff_mode in lexer ('()' is a single token
expected: true but got: false)
TEST SUCCEEDED: shorten_diff_span failure from #8922
TEST SUCCEEDED: diff_pp/add_diff_tags add/remove
TEST SUCCEEDED: diff_pp/add_diff_tags a span with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags diff tags outside existing tags
TEST SUCCEEDED: diff_pp/add_diff_tags existing tagged values with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags multiple tokens in pp
TEST SUCCEEDED: diff_pp/add_diff_tags token spanning multiple Ppcmd_strs
TEST SUCCEEDED: diff_pp/add_diff_tags empty string preserved
TEST SUCCEEDED: diff_hyps simple diffs
TEST SUCCEEDED: diff_hyps compacted
TEST SUCCEEDED: diff_hyps compacted with join
TEST SUCCEEDED: diff_hyps compacted with split
*** Ran 16 tests, with 15 successes and 1 failures ***
==========> FAILURE <==========
    test-suite/unit-tests/printing/proof_diffs_test.ml...Error!

ejgallego avatar Nov 13 '20 20:11 ejgallego

The test works correctly when I run it from master locally using ounit2 2.2.3 (reported by opam) and ocaml 4.0.6 with make. When I force a test in that file to fail, it is reported correctly. For the specific failure you see, a print statement shows that () is indeed a single token. Very curious.

  assert_equal ~msg:"FORCE FAILURE" true false;

  List.iter (fun s ->
    Printf.printf "s is `%s`\n" s;
    assert_equal ~msg:("'" ^ s ^ "' is a single token") ~printer:string_of_bool true (List.mem s toks))
    [ "(*"; "()"; ":="]
TEST FAILED: diff_str empty (FORCE FAILURE
not equal)
TEST SUCCEEDED: diff_str white space
TEST SUCCEEDED: diff_str add/remove
TEST SUCCEEDED: tokenize_string/diff_mode in lexer
TEST SUCCEEDED: shorten_diff_span failure from #8922
TEST SUCCEEDED: diff_pp/add_diff_tags add/remove
TEST SUCCEEDED: diff_pp/add_diff_tags a span with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags diff tags outside existing tags
TEST SUCCEEDED: diff_pp/add_diff_tags existing tagged values with spaces
TEST SUCCEEDED: diff_pp/add_diff_tags multiple tokens in pp
TEST SUCCEEDED: diff_pp/add_diff_tags token spanning multiple Ppcmd_strs
TEST SUCCEEDED: diff_pp/add_diff_tags empty string preserved
TEST SUCCEEDED: diff_hyps simple diffs
TEST SUCCEEDED: diff_hyps compacted
TEST SUCCEEDED: diff_hyps compacted with join
TEST SUCCEEDED: diff_hyps compacted with split
*** Ran 16 tests, with 15 successes and 1 failures ***
==========> FAILURE <==========
    unit-tests/printing/proof_diffs_test.ml...Error!
/home/proj/coq/test-suite$ make -j 12
TEST      unit-tests/printing/proof_diffs_test.ml
findlib: [WARNING] Package coq has multiple definitions in /home/proj/coq/META.coq, /home/jfehrle/.opam/4.06.0/lib/coq/META
FAILED    unit-tests/printing/proof_diffs_test.ml.log
s is `(*`
s is `()`
s is `:=`
make report
make[1]: Entering directory '/home/proj/coq/test-suite'
BUILDING SUMMARY FILE
FAILURES
    misc/printers.sh...Error!
    unit-tests/printing/proof_diffs_test.ml...Error!
To print details of failed tests, rerun with environment variable PRINT_LOGS=1
eg "make report PRINT_LOGS=1" from the test suite directory"
See README.md in the test suite directory for more information.
Makefile:216: recipe for target 'report' failed
make[1]: *** [report] Error 1
make[1]: Leaving directory '/home/proj/coq/test-suite'
Makefile:135: recipe for target 'all' failed
make: *** [all] Error 2

jfehrle avatar Nov 13 '20 21:11 jfehrle

Very interesting, could be due to the OCaml version? We'll research more.

ejgallego avatar Nov 13 '20 22:11 ejgallego

The "needs: rebase" label was set more than 30 days ago. If the PR is not rebased in 30 days, it will be automatically closed.

coqbot-app[bot] avatar Jul 28 '21 14:07 coqbot-app[bot]

This is not looking bad, current failures:

   coqc test-suite/modules/PO.v.log (exit 1)
(cd _build/default/test-suite/modules && ../../../install/default/bin/coqc -coqlib ../.. -R ../prerequisite TestSuite PO.v &> _build/default/test-suite/modules/PO.v.log
        coqc test-suite/modules/pliczek.v.log (exit 1)
(cd _build/default/test-suite/modules && ../../../install/default/bin/coqc -coqlib ../.. -R ../prerequisite TestSuite pliczek.v &> _build/default/test-suite/modules/pliczek.v.log
        coqc test-suite/bugs/opened/bug_3889.v.log (exit 129)
(cd _build/default/test-suite/bugs/opened && ../../../../install/default/bin/coqc -coqlib ../../.. -R ../../prerequisite TestSuite bug_3889.v &> _build/default/test-suite/bugs/opened/bug_3889.v.log

Interesting.

Also, the micromega tests are not compatible with the dune cache, as it forbids .csdp.cache to be writable. That's a minor problem I think.

We need to port some tests to the cram stuff tho.

ejgallego avatar Nov 08 '21 22:11 ejgallego

  • 3389 reports an anomaly which we don't accept
  • pliczek and PO.v have some special dep stuff

What to do with this PR is a good question, generating the dune file will be supported in dune 3.0, and in fact we can do better as we can delegate to dune running coqdep [which will make generation much faster], on the other hand this is quite useful already, but requires manual syncing of the generated files; maybe I will add temporary bootstrap until we can use dune 3.0

ejgallego avatar Nov 08 '21 22:11 ejgallego

I'm afraid I got very unclear feedback on what should it take to get this PR merged.

I'm thus halting work here until someone volunteers to shepherd this.

ejgallego avatar Nov 10 '21 17:11 ejgallego

My suggestion in the Coq Call was to get this as complete as possible, add cheat-sheet level documentation and then ask people to try it before its merged. My concern is that the PR will remove the current test-suite support and that omissions, bugs or lack of documentation in the replacement may significantly disrupt other developers.

jfehrle avatar Nov 10 '21 17:11 jfehrle

The "needs: rebase" label was set more than 30 days ago. If the PR is not rebased in 30 days, it will be automatically closed.

coqbot-app[bot] avatar Jan 14 '22 02:01 coqbot-app[bot]

Why do we have to generate the rules ourselves? This seems like a waste of time. Wouldn't it be better to just push for the feature of dune that allows us to do incremental tests (for coq files too)?

Alizter avatar Jan 20 '22 20:01 Alizter

Can we explicitly prevent coqdep from being called here? I don't know how easy it would be to get coqdep to accept this. cc @SkySkimmer who wrote the test AFAIK.

I guess we should fix coqdep, for now we can special case it indeed.

Why do we have to generate the rules ourselves? This seems like a waste of time. Wouldn't it be better to just push for the feature of dune that allows us to do incremental tests (for coq files too)?

Can you say more? We do need to generate the rules because otherwise how would Dune know how to do the tests?

ejgallego avatar Jan 21 '22 13:01 ejgallego

This PR was not rebased after 30 days despite the warning, it is now closed.

coqbot-app[bot] avatar Feb 14 '22 02:02 coqbot-app[bot]

@ejgallego Please resurrect this.

Alizter avatar Mar 01 '22 22:03 Alizter

I just submitted a big big deadline but have another large one in the horizon, can't promise anything after the 17th , feel free to take over, actually it is not so far from being ready.

ejgallego avatar Mar 01 '22 22:03 ejgallego

This PR was not rebased after 30 days despite the warning, it is now closed.

coqbot-app[bot] avatar Mar 02 '22 02:03 coqbot-app[bot]

Coqbot closed this again because it still has the "Stale" label. Clear that and it's good for 30 days.

jfehrle avatar Mar 02 '22 03:03 jfehrle

Or rebase, of course.

jfehrle avatar Mar 02 '22 03:03 jfehrle

I am currently rebasing, but it is not so easy, there are a lot of big changes that happened in the mean time. I am optimistic though.

Alizter avatar Mar 02 '22 10:03 Alizter

@Alizter I had started a rebase too before I saw your message, let me see if I have more luck.

ejgallego avatar Mar 02 '22 11:03 ejgallego

Be careful if you push to the branch, as we need to reopen this first, then push, otherwise github famously won't let us re-open.

ejgallego avatar Mar 02 '22 11:03 ejgallego

Ok, there are a couple of problems, but this seems to be OK for now.

ejgallego avatar Mar 02 '22 11:03 ejgallego

@ejgallego OK thanks. How do I fix the META stuff? I ended up changing back to "ltac_plugin" in my own attempts.

Alizter avatar Mar 02 '22 11:03 Alizter

otherwise github famously won't let us re-open.

I hope someone has reported this as a GitHub issue. It seems like something that's both impactful and likely easy to fix.

jfehrle avatar Mar 02 '22 17:03 jfehrle

It's not the end of the world to force push to a closed PR, there are workarounds: https://gist.github.com/robertpainsi/2c42c15f1ce6dab03a0675348edd4e2c GItHub just checks the commit hash from when the PR was closed.

Alizter avatar Mar 02 '22 21:03 Alizter

It's not the end of the world to force push to a closed PR, there are workarounds

It's great that there is a workaround, but if no one reports a problem then it will never get better. I make it a point to report bugs in any software I use. And I find simple problems are addressed fairly often. For example, several bugs I reported on Android (on my phone) were fixed, though it's surprising that any obvious bugs remain in a product that's used by 100s of millions of users.

jfehrle avatar Mar 02 '22 22:03 jfehrle

@jfehrle Sorry I should have also mentioned the opened issue: https://github.com/isaacs/github/issues/361. That issue has been open for 7 years and receives a lot of attention. It seems that the developers just don't care to fix this soon. So the workaround is the next best solution.

Alizter avatar Mar 02 '22 22:03 Alizter

The job library:ci-fiat_crypto_legacy has failed in allow failure mode ping @JasonGross

coqbot-app[bot] avatar Mar 18 '22 17:03 coqbot-app[bot]