
Verify benchmarks compile and execute in the gate

Open JaroslavTulach opened this issue 3 years ago • 5 comments

[ci no changelog needed]

Pull Request Description

Execution of sbt runtime/bench doesn't seem to be part of the gate. As a result, a change to the Enso language syntax, standard libraries, runtime & co. can break the benchmark suite without being noticed. Integrating such a PR causes unnecessary disruption to others using the benchmarks.

Let's make sure verification of the benchmarks (e.g. that they compile and can execute without error) is part of the CI.

Important Notes

Currently the gate will fail. The fix is being prepared in a parallel PR, #3639. Once the two PRs are combined, the gate will succeed again.

Checklist

Please include the following checklist in your PR:

  • All code has been tested:
    • [x] This PR makes sure the sbt runtime/bench test is executed

JaroslavTulach avatar Aug 10 '22 03:08 JaroslavTulach

Hello Michal,

> I think I'll need to integrate this into the build script.

Yes, having benchmarks run by the official build script would be good. On the other hand, we need a safety net in the CI to warn us when the benchmarks are broken. We don't need to do the patching locally: when the check fails, one can run just the single benchmark with the benchOnly *AtomBenchmarks* command anyway.
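For reference, both commands mentioned in this thread are issued from the sbt shell (the sbt:enso> prompt shown here is illustrative):

```
sbt:enso> runtime/bench
sbt:enso> benchOnly *AtomBenchmarks*
```

The first runs the whole runtime benchmark suite; the second re-runs only the benchmark group whose name matches the given pattern.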

Btw, I've noticed there is a benchmark.yml - why isn't it being executed? If it were, it could have caught the https://github.com/enso-org/enso/pull/3639 problem...

> The approach of "patch sources by sed, run, delete results" feels really hacky,

Right. It is.

> not to mention the Windows issues.

I've told the actions to run just on Ubuntu. I only need to verify that the benchmarks run without error - it's enough to do that on one system; there is no need to have this CI check running on Windows & co.

> Would it be possible to introduce e.g. an ENSO_BENCHMARK_ITERATION_LIMIT environment variable and handle it on the side of the benchmarking code?

Update: In the end, ac2b589 manages to do the change with the -Dbench.compileOnly=true Java system property.
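The system-property approach can be sketched roughly as follows. This is a minimal, hypothetical helper - the class and method names are illustrative, not the actual Enso code - showing how benchmark code might honor -Dbench.compileOnly=true by collapsing the run to a single iteration:

```java
public class BenchConfig {
    // True when the suite is launched with -Dbench.compileOnly=true.
    // Boolean.getBoolean reads the named system property and parses it.
    static boolean isCompileOnly() {
        return Boolean.getBoolean("bench.compileOnly");
    }

    // In compile-only mode, run a single iteration just to verify the
    // benchmark executes at all; otherwise use the configured count.
    static int iterations(int configured) {
        return isCompileOnly() ? 1 : configured;
    }

    public static void main(String[] args) {
        // Simulate passing -Dbench.compileOnly=true on the command line.
        System.setProperty("bench.compileOnly", "true");
        System.out.println(iterations(100));
    }
}
```

Running the class as shown prints 1, since the property is set before the iteration count is computed.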

> Also, why are you deleting results after run?

The results are supposed to be deleted before running the benchmarks. I don't need a performance comparison; I just want to verify that the benchmarks can execute without problems - whether fast or slow doesn't matter. Having the file bench-report.xml around (for any reason) would cause the new results to be compared against the previous ones. That's why I want to make sure the file is gone.
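As a sketch, the pre-run cleanup amounts to deleting the report file before the benchmarks start. The file name bench-report.xml comes from the discussion above; its exact location in the repository, and the class name used here, are assumptions:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CleanBenchReport {
    public static void main(String[] args) throws IOException {
        // Remove any stale report so the benchmark tooling cannot compare
        // the new run against results left over from a previous one.
        Path report = Path.of("bench-report.xml");
        Files.deleteIfExists(report);
        // Confirm the file is gone before the benchmarks run.
        System.out.println(Files.exists(report));
    }
}
```

Files.deleteIfExists is used rather than delete so the cleanup also succeeds when no previous report exists.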

Feel free to integrate your changes into this PR directly.

JaroslavTulach avatar Aug 10 '22 03:08 JaroslavTulach

I see Verify Benchmarks failed with:

[info] *** 23 TESTS FAILED ***
[error] Failed tests:
[error] 	org.enso.interpreter.bench.RegressionTest
[error] (runtime / Bench / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 225 s (03:45), completed Aug 10, 2022, 5:01:33 AM

Error: Process completed with exit code 1.

i.e. the check does exactly what we want! Once #3639 is integrated by @4e6, we can integrate this one and have a verification of our benchmarks.

JaroslavTulach avatar Aug 10 '22 05:08 JaroslavTulach

The Verify Benchmarks run isn't yet "Required". What needs to be done, @mwu-tow, to make it "Required"?

JaroslavTulach avatar Aug 10 '22 05:08 JaroslavTulach

Latest Verify Benchmarks run continues to fail. Great! Let's wait for #3639 to get integrated and then try again.

JaroslavTulach avatar Aug 10 '22 11:08 JaroslavTulach

OK, #3639 is merged. Let's re-run the Verify benchmarks gate.

JaroslavTulach avatar Aug 10 '22 13:08 JaroslavTulach

Was the verify benchmarks check started at all after e4f2769? Where can I find it among the checks?

JaroslavTulach avatar Aug 11 '22 03:08 JaroslavTulach

> Was the verify benchmarks check started at all after https://github.com/enso-org/enso/commit/e4f2769cd82f3f7f97562a8c9ac94ae5ff87d648? Where can I find it among the checks?

No, that workflow was removed. I ended up integrating the check into the other engine checks within the Engine CI workflow.

mwu-tow avatar Aug 11 '22 04:08 mwu-tow

Benchmarks are now run with -Dbench.compileOnly=true under the backend ci-check command (which is what Engine CI invokes). This makes the most sense to me: that job already builds the benchmarks, and executing them once adds very little overhead. Both the CI check and the full-blown Benchmark workflow now upload the benchmark report as a CI artifact.

mwu-tow avatar Aug 11 '22 06:08 mwu-tow