Some CI tests are flaky or genuinely failing
Some CI tests seem to be flaky and fail intermittently.
- Failure in ctest.
Test project /home/runner/work/lpython/lpython
Start 1: test_stacktrace
1/2 Test #1: test_stacktrace .................. Passed 0.00 sec
Start 2: test_lpython
2/2 Test #2: test_lpython .....................Subprocess aborted***Exception: 0.86 sec
[doctest] doctest version is "2.4.8"
[doctest] run with "--help" for options
0 0 0 0 0 0 0 0 0 0
test_lpython: /home/runner/micromamba/envs/lp/include/llvm/IR/DataLayout.h:656: uint64_t llvm::StructLayout::getElementOffset(unsigned int) const: Assertion `Idx < NumElements && "Invalid element idx!"' failed.
===============================================================================
/home/runner/work/lpython/lpython/src/lpython/tests/test_llvm.cpp:1556:
TEST CASE: PythonCompiler classes
/home/runner/work/lpython/lpython/src/lpython/tests/test_llvm.cpp:1556: FATAL ERROR: test case CRASHED: SIGABRT - Abort (abnormal termination) signal
===============================================================================
[doctest] test cases: 55 | 54 passed | 1 failed | 17 skipped
[doctest] assertions: 486 | 486 passed | 0 failed |
[doctest] Status: FAILURE!
50% tests passed, 1 tests failed out of 2
Total Test time (real) = 0.87 sec
The following tests FAILED:
2 - test_lpython (Subprocess aborted)
Errors while running CTest
Error: Process completed with exit code 8.
- Failure in reference tests
compiler_tester.tester.RunException: Testing with reference output failed.
runtime_errors/test_assert_01.py * run_dbg
The JSON metadata differs against reference results
Reference JSON: tests/reference/run_dbg-test_assert_01-2f34744.json
Output JSON: tests/output/run_dbg-test_assert_01-2f34744.json
Omitting 9 identical items
Differing items:
{'stderr_hash': '32b0a24f111e577fe4fc5b3f4a5994b951e34dde7986b3fb750c5f5e'} != {'stderr_hash': '4811af471c73572b285e9ea01c8689abdd3cb32c717b3cd4876d2669'}
{'returncode': 134} != {'returncode': 1}
Diff against: tests/reference/run_dbg-test_assert_01-2f34744.stderr
1,7c1,2
< File "tests/runtime_errors/test_assert_01.py", line 1
< def test():
< File "tests/runtime_errors/test_assert_01.py", line 4
< test()
< File "tests/runtime_errors/test_assert_01.py", line 2
< assert False
< AssertionError
---
> *** buffer overflow detected ***: terminated
> Aborted (core dumped)
Error: Process completed with exit code 1.
The two tests above fail intermittently. I will probably comment them out for now to get the CI passing (which will unblock merging PRs). We can then fix them iteratively in subsequent PRs.
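The returncode pair in the reference-test diff above (134 versus 1) is worth decoding. Assuming the test runner launches the binary through a POSIX shell, a status above 128 conventionally means 128 plus a signal number, so 134 would correspond to signal 6 (SIGABRT), which matches the "Aborted (core dumped)" line. A minimal sketch, where describe_returncode is a hypothetical helper and not part of the test suite:

```python
import signal


def describe_returncode(code: int) -> str:
    """Describe an exit status as a POSIX shell would report it.

    Assumption: statuses above 128 follow the shell convention of
    128 + signal number. This is not LPython's own tooling.
    """
    if code > 128:
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # The number does not map to a signal known on this platform.
            return f"killed by unknown signal {signum}"
    return f"exited with status {code}"


print(describe_returncode(134))
print(describe_returncode(1))
```

Read this way, the two return codes mean an abort by signal in one run and an ordinary non-zero exit (the expected AssertionError path) in the other.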
Failure in reference tests
It seems that all the run_with_dbg reference tests are currently failing or flaky. Following the same steps as the CI, they pass for me locally.
integration_tests/test_str_01.py also fails with the same error as above (*** buffer overflow detected ***: terminated).
I think these are real failures that we need to fix.
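Since these tests pass locally but fail on CI, one way to separate a flaky failure from a deterministic one is to run the same command repeatedly and tally the return codes. A minimal sketch, where tally_returncodes and the placeholder command are hypothetical and not part of the LPython test tooling:

```python
import subprocess
import sys
from collections import Counter


def tally_returncodes(cmd, runs=20):
    """Run cmd `runs` times and count how often each return code occurs."""
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True)
        counts[result.returncode] += 1
    return counts


if __name__ == "__main__":
    # Placeholder command that always exits with status 1; substitute the
    # actual test binary or script under investigation.
    demo_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(tally_returncodes(demo_cmd, runs=5))
```

A single return code across all runs suggests a deterministic result in that environment, while a mix of codes suggests flakiness.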
I think a fix for this buffer overflow error might be the same as https://github.com/lfortran/lfortran/pull/6003.
The CI failures started when GCC was updated on the CI. The following PRs fixed them incrementally:
- https://github.com/lfortran/lfortran/pull/5983
- https://github.com/lfortran/lfortran/pull/6003
- https://github.com/lfortran/lfortran/pull/6004
https://github.com/lcompilers/lpython/blob/7eb2bea75234ee7a99158871175bc0bb7df63fb1/src/libasr/runtime/lfortran_intrinsics.c#L2306-L2308
It does look like the same issue for test_str_01.