Segfault in pseudoWitnessSet calling PathTracker

Open mahrud opened this issue 5 years ago • 13 comments

This was in an example for isOnImage in NumericalImplicitization:

-- -*- M2-comint -*- hash: 927871285

i1 : R = CC[x_(1,1)..x_(3,5)]; I = ideal 0_R;

o2 : Ideal of R

i3 : F = (minors(3, genericMatrix(R, 3, 5)))_*;

i4 : W = pseudoWitnessSet(F, I, Repeats => 2, Verbose => false);
-- SIGSEGV
-* stack trace, pid: 11133
 0# stack_trace(std::ostream&, bool) at ../../Macaulay2/bin/main.cpp:124
 1# segv_handler at ../../Macaulay2/bin/main.cpp:240
 2# 0x00007F2A91E38FD0 in /lib/x86_64-linux-gnu/libc.so.6
 3# SLP<ComplexField>::concatenate(SLP<ComplexField> const*) at ../../Macaulay2/e/NAG.cpp:382
 4# PathTracker::make(Matrix const*) at ../../Macaulay2/e/NAG.cpp:1537
 5# interface2_rawPathTracker at /home/runner/work/M2/M2/M2/Macaulay2/d/interface2.d:358
 6# evaluate_evalraw at /home/runner/work/M2/M2/M2/Macaulay2/d/evaluate.d:1293

Full error: _is__On__Image.errors.txt

mahrud avatar Aug 05 '20 22:08 mahrud

This is probably extremely rare, but I thought I'd leave it here.

mahrud avatar Aug 05 '20 22:08 mahrud

This happened again, this time in the example for numericalImageDegree:

-- -*- M2-comint -*- hash: 782969290

i1 : R = CC[x_(1,1)..x_(3,5)]; I = ideal 0_R;

o2 : Ideal of R

i3 : F = (minors(3, genericMatrix(R, 3, 5)))_*;

i4 : numericalImageDegree(F, I, Repeats => 2, Verbose => false)
-- SIGSEGV
-* stack trace, pid: 13330
 0# stack_trace(std::ostream&, bool) at ../../Macaulay2/bin/main.cpp:124
 1# segv_handler at ../../Macaulay2/bin/main.cpp:240
 2# 0x00007F3560CCAFD0 in /lib/x86_64-linux-gnu/libc.so.6
 3# SLP<ComplexField>::concatenate(SLP<ComplexField> const*) at ../../Macaulay2/e/NAG.cpp:382
 4# PathTracker::make(Matrix const*) at ../../Macaulay2/e/NAG.cpp:1538
 5# interface2_rawPathTracker at /home/runner/work/M2/M2/M2/Macaulay2/d/interface2.d:359
 6# evaluate_evalraw at /home/runner/work/M2/M2/M2/Macaulay2/d/evaluate.d:1293

Full error: _numerical__Image__Degree.errors.txt

mahrud avatar Aug 21 '20 05:08 mahrud

@antonleykin have you seen this?

mahrud avatar Aug 21 '20 05:08 mahrud

I tried manually running the first example in a debug build and got a warning message. I then induced a stack trace by interrupting, and it is a little different from the one above:

Warning: please export TSAN_OPTIONS='ignore_noninstrumented_modules=1' to avoid false positive reports from the OpenMP runtime.!
^C^C
Exit (y=yes/n=no/a=abort/b=backtrace)? b
-* stack trace, pid: 1690390
 0# std::vector<boost::stacktrace::frame, std::allocator<boost::stacktrace::frame> >::size() const at /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/stl_vector.h:919
 1# interrupt_handler at /home/mahrud/Projects/M2/M2/M2/BUILD/build/../../Macaulay2/bin/main.cpp:?
 2# 0x00007FCED0810A90 in /lib64/libpthread.so.0
 3# 0x00007FCED080952D in /lib64/libpthread.so.0
 4# zgetrf_parallel in /lib64/libopenblasp.so.0
 5# zgetrf_parallel in /lib64/libopenblasp.so.0
 6# zgesv_ in /lib64/libopenblasp.so.0
 7# solve_via_lapack_without_transposition(int, complex*, int, complex*, complex*) at /home/mahrud/Projects/M2/M2/M2/BUILD/build/../../Macaulay2/e/NAG.cpp:1233
 8# PathTracker::track(Matrix const*) at /home/mahrud/Projects/M2/M2/M2/BUILD/build/../../Macaulay2/e/NAG.cpp:?
 9# interface2_rawLaunchPT at /home/mahrud/Projects/M2/M2/M2/Macaulay2/d/interface2.d:448
10# evaluate_evalraw at /home/mahrud/Projects/M2/M2/M2/Macaulay2/d/evaluate.d:?

Where is that warning coming from?

mahrud avatar Sep 19 '20 06:09 mahrud

Just got a similar segfault in an s390x PPA build on Ubuntu 21.10:

 -- capturing check(3, "NumericalImplicitization")                          -- SIGSEGV
-* stack trace, pid: 22806
 0# stack_trace(std::ostream&, bool) at ./M2/Macaulay2/d/main.cpp:127
 1# segv_handler at ./M2/Macaulay2/d/main.cpp:243
 2# 0x000003FF88574B9E
 3# memcpy in /lib/s390x-linux-gnu/libc.so.6
 4# SLP<ComplexField>::concatenate(SLP<ComplexField> const*) at ./M2/Macaulay2/e/NAG.cpp:393
 5# PathTracker::make(Matrix const*) at ./M2/Macaulay2/e/NAG.cpp:1545
 6# interface2_rawPathTracker at ./M2/Macaulay2/d/interface2.d:364
 7# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1297
 8# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1358
 9# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1357

d-torrance avatar Jun 23 '21 02:06 d-torrance

I couldn't reproduce this. @jchen419, would you know how to isolate this issue?

antonleykin avatar Jun 25 '21 18:06 antonleykin

Perhaps the changes in https://github.com/Macaulay2/M2/pull/2214 affect SVD (and therefore this issue)? @jchen419, is this easy to check?

antonleykin avatar Aug 30 '21 18:08 antonleykin

@antonleykin That's an interesting point, whether some of these crashes are fixed by #2214. Indeed, I was able to reproduce the error shown by Doug on 1.17:

i2 : check_3 NumericalImplicitization
 -- capturing check(3, "NumericalImplicitization")                          -- warning: experimental computation over inexact field begun
--          results not reliable (one warning given per session)
 -- 4.95503 seconds elapsed

i3 : check_3 NumericalImplicitization
 -- capturing check(3, "NumericalImplicitization")                          -- SIGSEGV
-* stack trace, pid: 7776
 0# 0x0000561057C20AF7 in /usr/bin/M2-binary
 1# 0x0000561057C20DA0 in /usr/bin/M2-binary
 2# 0x00007FC21CE8F040 in /usr/lib/x86_64-linux-gnu/libc.so.6
 3# 0x00007FC21CFD0FA6 in /usr/lib/x86_64-linux-gnu/libc.so.6

I then ran this with the changes for #2214, and indeed no errors appeared. However, I was not able to reproduce the crash again on 1.17 (even repeating the check 10 times). It may be that the conditions which cause the segfault are quite rare (and random).
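A minimal sketch of the repeated check described above, assuming the package loads cleanly and that test 3 is the one of interest. The loop count of 10 and the progress message are illustrative choices, not something taken from an actual session:

```macaulay2
-- Sketch only: rerun one package test several times in the same M2 session.
-- The test index (3) and the repeat count (10) mirror the description above.
needsPackage "NumericalImplicitization"
for i from 1 to 10 do (
    << "-- attempt " << i << endl;       -- progress marker (illustrative)
    check(3, NumericalImplicitization)   -- same call as check_3 NumericalImplicitization
    )
```

Since a SIGSEGV terminates the M2 process, the loop stops at the first crash, and the last attempt number printed shows how many runs it took.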

jchen419 avatar Aug 31 '21 13:08 jchen419

> It may be that the conditions which cause the segfault are quite rare (and random).

Yeah, I think this is true. I've only seen this happen one time in many, many builds of the Debian package.

d-torrance avatar Aug 31 '21 14:08 d-torrance

This showed up again in an s390x PPA build in Ubuntu 21.10. I think this is the first time I've seen it since #2214 was merged.

 -- running   check(41, "NumericalAlgebraicGeometry")                        -- 1.79995 seconds elapsed
NumericalAlgebraicGeometry.m2:541:1-544:1: error:
 -- Setup time: 0
 -- Computing time:0
 -- Setup time: 0
 -- Computing time:0
 -- Setup time: 0
 -- Computing time:0
 -- ...M.MM.M.MMMM..MM.MM.......MMM...MMM.M..M....M....MM....M.....M..MM.M..M.M.M.M..MMM.M.MM.M...M..MM.
 -- M.....MM..M..M...MM
 -- 
 -- ....-- SIGSEGV
 -- 
 -- -* stack trace, pid: 47529
 --  0# stack_trace(std::ostream&, bool) at ./M2/Macaulay2/d/main.cpp:127
 --  1# segv_handler at ./M2/Macaulay2/d/main.cpp:243
 --  2# 0x000003FF78279CBE
 --  3# memcpy in /lib/s390x-linux-gnu/libc.so.6
 --  4# SLP<ComplexField>::concatenate(SLP<ComplexField> const*) at ./M2/Macaulay2/e/NAG.cpp:393
 --  5# PathTracker::make(Matrix const*, Matrix const*, __mpfr_struct const*) at ./M2/Macaulay2/e/NAG.cpp:1480
 --  6# interface2_rawPathTrackerProjective at ./M2/Macaulay2/d/interface2.d:384
 --  7# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1303
 --  8# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1363
 --  9# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1363

d-torrance avatar Mar 24 '22 15:03 d-torrance

> This showed up again in an s390x PPA build in Ubuntu 21.10. I think this is the first time I've seen it since #2214 was merged.

@d-torrance, I'm still struggling to reproduce this (last attempt: on Ubuntu 20.04, under gdb). Do you have any recent sightings of this beast?

antonleykin avatar Apr 29 '22 15:04 antonleykin

Not since that one on March 24. It's very, very rare!

d-torrance avatar Apr 29 '22 15:04 d-torrance

Not sure if this is the same bug or not. From an armel Debian build of the 1.20 package:

 -- capturing check(1, "NumericalImplicitization")                          -- warning: experimental computation over inexact field begun
--          results not reliable (one warning given per session)
-- SIGSEGV
-* stack trace, pid: 10605
 0# stack_trace(std::ostream&, bool) at ./M2/Macaulay2/d/main.cpp:127
 1# segv_handler at ./M2/Macaulay2/d/main.cpp:243
 2# __default_sa_restorer in /lib/arm-linux-gnueabi/libc.so.6
-- end stack trace *-
make[4]: *** [Makefile:50: check-NumericalImplicitization] Error 1

d-torrance avatar May 17 '22 14:05 d-torrance

This popped up again on an arm64 PPA build for Ubuntu 22.04:

 -- running   check(34, "NumericalAlgebraicGeometry")                       
 cd /tmp/M2-46244-0/12-rundir/; GC_MAXIMUM_HEAP_SIZE=400M "/<<PKGBUILDDIR>>/M2/usr-dist/aarch64-Linux-Ubuntu-22.04/bin/M2-binary" -q --no-randomize --no-readline --silent --stop --print-width 77 -e 'needsPackage("NumericalAlgebraicGeometry",Reload=>true,FileName=>"/<<PKGBUILDDIR>>/M2/Macaulay2/packages/NumericalAlgebraicGeometry.m2")' <"/tmp/M2-46244-0/11.m2" >>"/tmp/M2-46244-0/11.tmp" 2>&1
/tmp/M2-46244-0/11.tmp:0:1: (output file) error: Macaulay2 exited with status code 1
/tmp/M2-46244-0/11.m2:0:1: (input file)
M2: *** Error 1
 -- 6.58524 seconds elapsed
 -- running   check(35, "NumericalAlgebraicGeometry")                        -- 6.37817 seconds elapsed
 -- capturing check(36, "NumericalAlgebraicGeometry")                        -- 0.185421 seconds elapsed
 -- capturing check(37, "NumericalAlgebraicGeometry")                        -- 0.554 seconds elapsed
 -- capturing check(38, "NumericalAlgebraicGeometry")                        -- 0.230066 seconds elapsed
 -- capturing check(39, "NumericalAlgebraicGeometry")                        -- 0.155352 seconds elapsed
 -- running   check(40, "NumericalAlgebraicGeometry")                        -- 5.74298 seconds elapsed
 -- running   check(41, "NumericalAlgebraicGeometry")                        -- 5.87371 seconds elapsed
NumericalAlgebraicGeometry.m2:541:1-544:1: error:
 -- Setup time: 0
 -- Computing time:0
 -- Setup time: 0
 -- Computing time:0
 -- ...M.MM.M.MMMM..MM.MM.......MMM...MMM.M..M....M....MM....M.....M..MM.M..M.M.M.M..MMM.M.MM.M...M..MM.
 -- M.....MM..M..M...MM
 -- 
 -- ....
 -- 
 -- ....-- SIGSEGV
 -- 
 -- -* stack trace, pid: 46342
 --  0# stack_trace(std::ostream&, bool) at ./M2/Macaulay2/d/main.cpp:127
 --  1# segv_handler at ./M2/Macaulay2/d/main.cpp:243
 --  2# 0x0000FFFFBD4545C0 in linux-vdso.so.1
 --  3# 0x0000FFFFBB067BDC in /lib/aarch64-linux-gnu/libc.so.6
 --  4# SLP<ComplexField>::concatenate(SLP<ComplexField> const*) at ./M2/Macaulay2/e/NAG.cpp:393
 --  5# PathTracker::make(Matrix const*, Matrix const*, __mpfr_struct const*) at ./M2/Macaulay2/e/NAG.cpp:1480
 --  6# rawPathTrackerProjective at interface/matrix.cpp:928
 --  7# interface2_rawPathTrackerProjective at ./M2/Macaulay2/d/interface2.d:384
 --  8# evaluate_evalraw at ./M2/Macaulay2/d/evaluate.d:1504

d-torrance avatar Nov 10 '22 02:11 d-torrance