superlu
Github workflow: Segmentation fault
I was skimming over the Checks tab of my recent pull request and in the Tests section I saw a couple of Segmentation fault (core dumped) messages. This error is also present in all other pull requests running the workflow.
The first, obvious problem: a test that is producing an error should make the workflow check fail.
But I can also reproduce this error, for example with dlinsolx on Windows:
./dlinsolx -l 100000000 < ../../EXAMPLE/g20.rua
While this fails on Windows, on Arch Linux the above command succeeds, but the test command fails with a segmentation fault:
./d_test -t "SP" -s 5 -l 100000000 -f ../../EXAMPLE/g20.rua
Can anybody reproduce this?
I looked into the configuration of the CMake tests, and besides being overly complex, they also seem to be fundamentally flawed (meaning they aren't testing anything).
First I added a simple check to runtest.cmake:
# execute the test command that was added earlier.
execute_process( COMMAND "${TEST}"
OUTPUT_FILE "${OUTPUT}"
RESULT_VARIABLE RET )
if(NOT RET EQUAL 0)
message("Error: ${RET}")
endif()
[...]
which prints Error: permission denied. This is because in TESTING/CMakeLists.txt the command set(TEST_LOC ${CMAKE_CURRENT_BINARY_DIR}) sets TEST_LOC to a directory, so
add_test( ${testName}_SP "${CMAKE_COMMAND}"
-DTEST=${TEST_LOC} -t "SP" -s ${s} -l ${l} -f ${TEST_INPUT}
[...]
will try to execute the directory and not the actual test executable inside the directory. Simplifying add_test to
add_test(
NAME ${testName}_SP
COMMAND ${target} -t "SP" -s ${s} -l ${l} -f "${TEST_INPUT}")
then reveals the segfault:
Test project /projects/superlu/build/Testing
Start 1: s_test_9_2_0_LA
1/24 Test #1: s_test_9_2_0_LA .................. Passed 0.02 sec
Start 2: s_test_9_2_10000000_LA
2/24 Test #2: s_test_9_2_10000000_LA ........... Passed 0.02 sec
Start 3: s_test_19_2_0_LA
3/24 Test #3: s_test_19_2_0_LA ................. Passed 0.03 sec
Start 4: s_test_19_2_10000000_LA
4/24 Test #4: s_test_19_2_10000000_LA .......... Passed 0.03 sec
Start 5: s_test_2_0_SP
5/24 Test #5: s_test_2_0_SP .................... Passed 0.06 sec
Start 6: s_test_2_10000000_SP
6/24 Test #6: s_test_2_10000000_SP ............. Passed 0.07 sec
Start 7: d_test_9_2_0_LA
7/24 Test #7: d_test_9_2_0_LA .................. Passed 0.02 sec
Start 8: d_test_9_2_10000000_LA
8/24 Test #8: d_test_9_2_10000000_LA ........... Passed 0.02 sec
Start 9: d_test_19_2_0_LA
9/24 Test #9: d_test_19_2_0_LA ................. Passed 0.03 sec
Start 10: d_test_19_2_10000000_LA
10/24 Test #10: d_test_19_2_10000000_LA .......... Passed 0.03 sec
Start 11: d_test_2_0_SP
11/24 Test #11: d_test_2_0_SP .................... Passed 0.06 sec
Start 12: d_test_2_10000000_SP
12/24 Test #12: d_test_2_10000000_SP .............***Exception: SegFault 0.01 sec
Start 13: c_test_9_2_0_LA
13/24 Test #13: c_test_9_2_0_LA .................. Passed 0.02 sec
Start 14: c_test_9_2_10000000_LA
14/24 Test #14: c_test_9_2_10000000_LA ........... Passed 0.03 sec
Start 15: c_test_19_2_0_LA
15/24 Test #15: c_test_19_2_0_LA ................. Passed 0.06 sec
Start 16: c_test_19_2_10000000_LA
16/24 Test #16: c_test_19_2_10000000_LA .......... Passed 0.06 sec
Start 17: c_test_2_0_SP
17/24 Test #17: c_test_2_0_SP .................... Passed 0.12 sec
Start 18: c_test_2_10000000_SP
18/24 Test #18: c_test_2_10000000_SP .............***Exception: SegFault 0.01 sec
Start 19: z_test_9_2_0_LA
19/24 Test #19: z_test_9_2_0_LA .................. Passed 0.03 sec
Start 20: z_test_9_2_10000000_LA
20/24 Test #20: z_test_9_2_10000000_LA ........... Passed 0.03 sec
Start 21: z_test_19_2_0_LA
21/24 Test #21: z_test_19_2_0_LA ................. Passed 0.07 sec
Start 22: z_test_19_2_10000000_LA
22/24 Test #22: z_test_19_2_10000000_LA .......... Passed 0.07 sec
Start 23: z_test_2_0_SP
23/24 Test #23: z_test_2_0_SP .................... Passed 0.15 sec
Start 24: z_test_2_10000000_SP
24/24 Test #24: z_test_2_10000000_SP ............. Passed 0.16 sec
92% tests passed, 2 tests failed out of 24
Total Test time (real) = 1.23 sec
The following tests FAILED:
12 - d_test_2_10000000_SP (SEGFAULT)
18 - c_test_2_10000000_SP (SEGFAULT)
Errors while running CTest
Two more observations on the actual error:
The problem occurs in both debug and release mode, and it doesn't seem to behave deterministically. While most of the time I get the segfault, sometimes the tests finish but produce garbage solutions:
[...]
dgssvx:fact= 3, trans= 1, equed=B, n=400, imat=0, test(1)= 6.2187e+09
dgssvx:fact= 3, trans= 1, equed=B, n=400, imat=0, test(2)= 3.0712e+10
dgssvx:fact= 3, trans= 1, equed=B, n=400, imat=0, test(4)= 3.8735e+09
dgssvx:fact= 3, trans= 0, equed=B, n=400, imat=0, test(1)= 1.9755e+14
dgssvx:fact= 3, trans= 0, equed=B, n=400, imat=0, test(2)= 5.4097e+13
dgssvx:fact= 3, trans= 0, equed=B, n=400, imat=0, test(4)= 2.2462e+13
dgssvx:fact= 3, trans= 1, equed=B, n=400, imat=0, test(1)= 6.2187e+09
dgssvx:fact= 3, trans= 1, equed=B, n=400, imat=0, test(2)= 3.0712e+10
dgssvx:fact= 3, trans= 1, equed=B, n=400, imat=0, test(4)= 3.8735e+09
DGE driver: 92 out of 144 tests failed to pass the threshold
EDIT:
To make ctest recognize this as a test failure, the drivers (cdrive.c etc.) should not return 0, but
return nfail == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
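A minimal sketch of that idea, not the actual driver code: the helper names count_failures and exit_status_for are hypothetical, and the rule that a ratio at or above the threshold counts as a failure is an assumption. The point is only that the failure count has to reach the process exit status, because that is what ctest looks at.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: count the test ratios that reach the threshold.
 * Assumption: a ratio at or above the threshold is a failed test. */
int count_failures(const double *ratios, int n, double thresh)
{
    assert(ratios != NULL && n >= 0);

    int nfail = 0;
    for (int i = 0; i < n; ++i) {
        if (ratios[i] >= thresh) {
            ++nfail;
        }
    }
    return nfail;
}

/* Map the failure count to an exit status, so that a test runner
 * such as ctest sees a non-zero code when any test failed. */
int exit_status_for(int nfail)
{
    return nfail == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A driver's main would then end with return exit_status_for(nfail); instead of an unconditional return 0;.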
I think it would be best to open 3 separate issues:
- The Github workflow should fail when a test fails (shouldn't matter if an actual test condition fails or a segfault occurs)
- The CMake test setup needs to be fixed (addressed in PR #112 )
- The actual cause of the segfault needs to be investigated
Here's what valgrind has to say about it:
/projects/superlu/build/TESTING$ valgrind ./d_test -t "SP" -s 5 -l 5000000 -f ../../EXAMPLE/g20.rua
==11462== Memcheck, a memory error detector
==11462== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==11462== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==11462== Command: ./d_test -t SP -s 5 -l 5000000 -f ../../EXAMPLE/g20.rua
==11462==
.. test sparse matrix in file: ../../EXAMPLE/g20.rua
g20, symm permuted by SYMMMD SYM
==11462== Conditional jump or move depends on uninitialised value(s)
==11462== at 0x12BCC2: relax_snode (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11C973: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462== Conditional jump or move depends on uninitialised value(s)
==11462== at 0x116E87: user_bcopy (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12541C: dexpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x124EC2: dLUMemXpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12300E: dcolumn_bmod (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11CFA0: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462== Use of uninitialised value of size 8
==11462== at 0x116E6C: user_bcopy (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12541C: dexpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x124EC2: dLUMemXpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12300E: dcolumn_bmod (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11CFA0: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462== Use of uninitialised value of size 8
==11462== at 0x116E73: user_bcopy (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12541C: dexpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x124EC2: dLUMemXpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12300E: dcolumn_bmod (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11CFA0: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462== Invalid read of size 1
==11462== at 0x116E6C: user_bcopy (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12541C: dexpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x124EC2: dLUMemXpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12300E: dcolumn_bmod (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11CFA0: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462== Address 0x4f2503f is 1 bytes before a block of size 5,000,000 alloc'd
==11462== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==11462== by 0x116CAD: superlu_malloc (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x1094F3: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462== Invalid write of size 1
==11462== at 0x116E73: user_bcopy (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12541C: dexpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x124EC2: dLUMemXpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12300E: dcolumn_bmod (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11CFA0: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462== Address 0x4f2503f is 1 bytes before a block of size 5,000,000 alloc'd
==11462== at 0x48407B4: malloc (vg_replace_malloc.c:381)
==11462== by 0x116CAD: superlu_malloc (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x1094F3: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462==
==11462== More than 10000000 total errors detected. I'm not reporting any more.
==11462== Final error counts will be inaccurate. Go fix your program!
==11462== Rerun with --error-limit=no to disable this cutoff. Note
==11462== that errors may occur in your program without prior warning from
==11462== Valgrind, because errors are no longer being displayed.
==11462==
==11462==
==11462== Process terminating with default action of signal 11 (SIGSEGV)
==11462== Bad permissions for mapped region at address 0x4B13FFF
==11462== at 0x116E73: user_bcopy (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12541C: dexpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x124EC2: dLUMemXpand (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x12300E: dcolumn_bmod (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x11CFA0: dgstrf (in /projects/superlu/build/TESTING/d_test)
==11462== by 0x10A1B2: main (in /projects/superlu/build/TESTING/d_test)
==11462==
==11462== HEAP SUMMARY:
==11462== in use at exit: 5,169,304 bytes in 37 blocks
==11462== total heap usage: 46 allocs, 9 frees, 5,185,088 bytes allocated
==11462==
==11462== LEAK SUMMARY:
==11462== definitely lost: 0 bytes in 0 blocks
==11462== indirectly lost: 0 bytes in 0 blocks
==11462== possibly lost: 0 bytes in 0 blocks
==11462== still reachable: 5,169,304 bytes in 37 blocks
==11462== suppressed: 0 bytes in 0 blocks
==11462== Rerun with --leak-check=full to see details of leaked memory
==11462==
==11462== Use --track-origins=yes to see where uninitialised values come from
==11462== For lists of detected and suppressed errors, rerun with: -s
==11462== ERROR SUMMARY: 10000000 errors from 6 contexts (suppressed: 0 from 0)
So, the relevant part is
Invalid write of size 1
at 0x116E73: user_bcopy
by 0x12541C: dexpand
by 0x124EC2: dLUMemXpand
by 0x12300E: dcolumn_bmod
by 0x11CFA0: dgstrf
by 0x10A1B2: main
Address 0x4f2503f is 1 bytes before a block of size 5,000,000 alloc'd
So I think I tracked down the origin of the problem. Not sure what the correct fix would be, though.
In dmemory.c, in the function dexpand
https://github.com/xiaoyeli/superlu/blob/29ea08a6deb67efc3be92068e60cb3605ef3f1fc/SRC/dmemory.c#L573-L577
at the time of calling, expanders[type + 1] is not initialized (where type = LUSUP).
That is due to
https://github.com/xiaoyeli/superlu/blob/29ea08a6deb67efc3be92068e60cb3605ef3f1fc/SRC/superlu_enum_consts.h#L37-L38
and dLUMemInit only initializing the first four positions of expanders
https://github.com/xiaoyeli/superlu/blob/29ea08a6deb67efc3be92068e60cb3605ef3f1fc/SRC/dmemory.c#L243-L250
EDIT:
I only debugged d_test. The same will most likely be the cause in c, s and z versions of the code.
I was wondering about the test type != USUB in
https://github.com/xiaoyeli/superlu/blob/29ea08a6deb67efc3be92068e60cb3605ef3f1fc/SRC/dmemory.c#L573-L577
Maybe the whole problem originates in a change of MemType made in
https://github.com/xiaoyeli/superlu/commit/52fc55d0397e382f46bdc4fb77445d0e2f4181ea#diff-4964d63c55baaf45c54e2d8b0485e230848b1a9da1d7c9fa40bdc3f77442c08d
Before that change the order was
typedef enum {LUSUP, UCOL, LSUB, USUB, LLVL, ULVL} MemType;
so testing for USUB would have been correct. But the order changed to
typedef enum {USUB, LSUB, UCOL, LUSUP, LLVL, ULVL, NO_MEMTYPE} MemType;
and maybe that was just missed in other places of the code, like dexpand.
So testing for type != LUSUP might be the correct fix. But I don't have enough insight into the SuperLU implementation details to be sure :-)
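To illustrate the reasoning, here is a small self-contained sketch. It is not SuperLU code: the enumerators are prefixed with OLD_ and NEW_ so that both orderings can live in one file, the helper guard_is_safe is hypothetical, and the constant 4 encodes the observation above that only the first four slots of expanders are initialized.

```c
#include <assert.h>

/* The two orderings quoted above; prefixes are added for illustration only. */
enum OldMemType { OLD_LUSUP, OLD_UCOL, OLD_LSUB, OLD_USUB, OLD_LLVL, OLD_ULVL };
enum NewMemType { NEW_USUB, NEW_LSUB, NEW_UCOL, NEW_LUSUP, NEW_LLVL, NEW_ULVL,
                  NEW_NO_MEMTYPE };

/* Assumption taken from the analysis above: slots 0..3 are initialized. */
enum { N_INITIALIZED = 4 };

/* Hypothetical check: for the four expandable types 0..3, does the guard
 * "type != excluded" keep every access to slot type + 1 inside the
 * initialized range? Returns 1 if so, 0 otherwise. */
int guard_is_safe(int excluded)
{
    assert(excluded >= 0 && excluded < N_INITIALIZED);

    for (int type = 0; type < N_INITIALIZED; ++type) {
        if (type != excluded && type + 1 >= N_INITIALIZED) {
            return 0; /* this type would touch an uninitialized slot */
        }
    }
    return 1;
}
```

Under these assumptions, guard_is_safe(OLD_USUB) and guard_is_safe(NEW_LUSUP) both return 1, while guard_is_safe(NEW_USUB) returns 0. In other words, the unchanged test type != USUB lets type = LUSUP through with the new ordering, which matches the uninitialized expanders[type + 1] seen in the debugger.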
I just tested replacing type != USUB with type != LUSUP. Though this prevents the segfault, it does not prevent some of the tests from failing (assuming the return value fix for the test drivers mentioned in https://github.com/xiaoyeli/superlu/issues/108#issuecomment-1666602452 is applied).
The following tests FAILED:
6 - s_test_2_10000000_SP (Failed)
12 - d_test_2_10000000_SP (Failed)
18 - c_test_2_10000000_SP (Failed)
24 - z_test_2_10000000_SP (Failed)
I guess this is as far as I can go without digging into the memory management details of SuperLU.
I can reproduce the issue. Your analysis looks good, and I am convinced that this was introduced by the commit you mentioned! Skimming through the commit, the changes to the enum are not motivated anywhere and thus are most probably wrong.
@xiaoyeli What do you think? Can we (partially) revert 52fc55d? Or do you know which pieces are needed from superlu_dist to fix these examples?
Resolved it in Master.
Alright. Now the remaining two issues mentioned above https://github.com/xiaoyeli/superlu/issues/108#issuecomment-1666605067 should be addressed.
Regarding the GitHub workflow: since the CMake build script works pretty well, I'd suggest installing CMake and then using it to build and test. Something along these lines (not tested):
- uses: actions/checkout@v3
- name: Configure
  run: cmake -B build
- name: Build
  run: cmake --build build --parallel
- name: Test
  run: ctest --test-dir build --output-on-failure
Btw, if you look at the test output of the Github workflow, you still see a bunch of segfaults.
I strongly suggest that you fix the test setup, so the workflow reflects those problems.
I just tested the cmake workflow here https://github.com/wo80/superlu/commit/e494d2ac8c1bb17d475432273cdf8b60ba6f391a and all tests are passing.
@xiaoyeli Please let me know if you want me to merge this into #112
EDIT: I applied the change suggested in https://github.com/xiaoyeli/superlu/issues/108#issuecomment-1666602452 (see https://github.com/wo80/superlu/commit/53794fa76ae7f92c619f5b7940cc08ffa8daae1b) and this makes the tests fail. The segfault is also still present.
After merging the upstream changes, the segfault seems to be fixed. But now the "LA" d_tests fail with
Subprocess aborted***Exception: 0.15 sec
dgstrf info 1
dgstrf info 1
dgstrf info 19
double free or corruption (out)
see https://github.com/wo80/superlu/actions/runs/6146404541/job/16675708872
Failing tests:
7/24 Test #7: d_test_9_2_0_LA ..................Subprocess aborted***Exception
8/24 Test #8: d_test_19_2_0_LA .................Subprocess aborted***Exception
9/24 Test #9: d_test_2_0_SP ....................Passed
10/24 Test #10: d_test_9_2_10000000_LA ...........Subprocess aborted***Exception
11/24 Test #11: d_test_19_2_10000000_LA ..........Subprocess aborted***Exception
12/24 Test #12: d_test_2_10000000_SP .............Passed
Valgrind output:
valgrind --track-origins=yes --leak-check=full ./d_test -t "LA" -n 9 -s 2 -l 0
Memcheck, a memory error detector
Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
Command: ./d_test -t LA -n 9 -s 2 -l 0
dgstrf info 1
dgstrf info 1
dgstrf info 9
Invalid read of size 8
at 0x10C6BA: dgst01
by 0x10B4CA: main
Address 0x5158f78 is 8 bytes before a block of size 72 alloc'd
at 0x48407B4: malloc (vg_replace_malloc.c:381)
by 0x117DAA: superlu_malloc
by 0x126CBE: doubleCalloc
by 0x10C19D: dgst01
by 0x10B4CA: main
[...] more of those errors
dgstrf info 9
dgstrf info 5
Invalid read of size 4
at 0x10C382: dgst01
by 0x10B4CA: main
Address 0x516bc5c is 4 bytes before a block of size 40 alloc'd
at 0x48407B4: malloc (vg_replace_malloc.c:381)
by 0x117DAA: superlu_malloc
by 0x117FAE: int32Malloc
by 0x1258B5: dLUMemInit
by 0x11D857: dgstrf
by 0x118D40: dgssv
by 0x10B422: main
Invalid read of size 4
at 0x10C3FA: dgst01
by 0x10B4CA: main
Address 0x516c8bc is 4 bytes before a block of size 648 alloc'd
at 0x48407B4: malloc (vg_replace_malloc.c:381)
by 0x117DAA: superlu_malloc
by 0x1264D8: dexpand
by 0x125A0D: dLUMemInit
by 0x11D857: dgstrf
by 0x118D40: dgssv
by 0x10B422: main
[...] more of those errors
dgstrf info 5
All tests for DGE driver passed the threshold ( 1158 tests run)
HEAP SUMMARY:
in use at exit: 319,748 bytes in 578 blocks
total heap usage: 23,849 allocs, 23,271 frees, 11,336,936 bytes allocated
2,688 (960 direct, 1,728 indirect) bytes in 24 blocks are definitely lost in loss record 15 of 23
at 0x48407B4: malloc (vg_replace_malloc.c:381)
by 0x117DAA: superlu_malloc
by 0x118265: sp_preorder
by 0x119862: dgssvx
by 0x10B767: main
74,880 (1,408 direct, 73,472 indirect) bytes in 44 blocks are definitely lost in loss record 21 of 23
at 0x48407B4: malloc (vg_replace_malloc.c:381)
by 0x117DAA: superlu_malloc
by 0x126E18: dCreate_CompCol_Matrix
by 0x11E487: dgstrf
by 0x1198E5: dgssvx
by 0x10B767: main
81,216 (2,464 direct, 78,752 indirect) bytes in 44 blocks are definitely lost in loss record 22 of 23
at 0x48407B4: malloc (vg_replace_malloc.c:381)
by 0x117DAA: superlu_malloc
by 0x127347: dCreate_SuperNode_Matrix
by 0x11E44C: dgstrf
by 0x1198E5: dgssvx
by 0x10B767: main
LEAK SUMMARY:
definitely lost: 4,832 bytes in 112 blocks
indirectly lost: 153,952 bytes in 444 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 160,964 bytes in 22 blocks
suppressed: 0 bytes in 0 blocks
Reachable blocks (those to which a pointer was found) are not shown.
To see them, rerun with: --leak-check=full --show-leak-kinds=all
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 3883 errors from 27 contexts (suppressed: 0 from 0)
I think I found the culprit: https://github.com/xiaoyeli/superlu/commit/cf93b7e131d379774a52b184e23548d84eb66e30 https://github.com/xiaoyeli/superlu/blob/90ee45dc836d8f4ff967cad4aa2821809b12fdc9/SRC/dpivotL.c#L134-L146
This was an external contribution merged two days ago. And it's a perfect demonstration of how important a functional CI test setup is. So I'll quote myself from https://github.com/xiaoyeli/superlu/pull/112:
I think it's important to have tests reflecting reality and I think that this should be merged rather sooner than later (even if the issue remains unresolved for now).
I rebased #112 and added the changes from my fix/github-workflow branch (now deleted).
I haven't addressed the above issue in dpivotL.c. I think it's better if you, @xiaoyeli, fix this in a single commit. Then the workflow shouldn't fail anymore.
EDIT: just for demonstration https://github.com/wo80/superlu/actions/runs/6149774863/job/16686331442
https://github.com/xiaoyeli/superlu/commit/f63265a50e6dec635c20f04f7b47e93b0a5c198b seems to fix the issue, the workflow tests are passing.
One question remaining, though:
Comparing dpivotL.c to the (c|s|z)pivotL.c variants, the code re-assigning perm_r[*pivrow] in the section labelled /* Test for singularity */ (see https://github.com/xiaoyeli/superlu/issues/108#issuecomment-1714265717 above) is disabled in the d variant, but not in the c|s|z variants. Which one is correct?
I think it would be best to open 3 separate issues:
- The Github workflow should fail when a test fails (shouldn't matter if an actual test condition fails or a segfault occurs)
- The CMake test setup needs to be fixed (addressed in PR https://github.com/xiaoyeli/superlu/pull/112 )
- The actual cause of the segfault needs to be investigated
- This is addressed in #131, some of your changes and some additions from myself.
- Fixed by your commits, merged as #114.
- Segfault is also fixed, the checks for #131 are passing.
I would like to extend this list:
4. We need a Windows runner. I created #132 for this so as not to extend this thread any longer.
5. Your last question should be answered. @xiaoyeli, do you know the answer to this question?
Comparing dpivotL.c to the (c|s|z)pivotL.c variants, the code re-assigning perm_r[*pivrow] in the section labelled /* Test for singularity */ (see https://github.com/xiaoyeli/superlu/issues/108#issuecomment-1714265717 above) is disabled in the d variant, but not in the c|s|z variants. Which one is correct?
I just pushed the fix to the following:
Comparing dpivotL.c to the (c|s|z)pivotL.c variants, the code re-assigning perm_r[*pivrow] in the section labelled /* Test for singularity */ (see https://github.com/xiaoyeli/superlu/issues/108#issuecomment-1714265717 above) is disabled in the d variant, but not in the c|s|z variants. Which one is correct?
@xiaoyeli This can be closed now. Thanks for the fix!