Segmentation fault with input containing 24 variables and 9 equations
The input:
$ cat input
a_1,a_2,a_3,a_4,a_5,a_6,a_7,a_8,a_9,a_10,a_11,a_12,b_1,b_2,b_3,b_4,b_5,b_6,b_7,b_8,b_9,b_10,b_11,b_12
0
-a_2-2*b_3,
-2*a_3*b_2+2*a_2*b_3-a_6-3*b_7,
-3*a_7*b_2+2*a_6*b_3-2*a_3*b_6-3*a_2*b_7-a_11-4*b_12,
-4*a_12*b_2+2*a_11*b_3-3*a_7*b_6+3*a_6*b_7-2*a_3*b_11+4*a_2*b_12,
-4*a_12*b_6+3*a_11*b_7-3*a_7*b_11+4*a_6*b_12,
-4*a_12*b_11+4*a_11*b_12,
-2*a_1-b_2,
-4*a_3*b_1+4*a_1*b_3-2*a_5-2*b_6,
-6*a_7*b_1-a_6*b_2+4*a_5*b_3-4*a_3*b_5+a_2*b_6+6*a_1*b_7-2*a_10-3*b_11
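As a quick sanity check before running msolve, the file layout used above (a line of comma-separated variables, a line with the field characteristic, then the polynomials separated by commas) can be summarized with a few lines of Python. This is only a sketch: the function name `summarize_msolve_input` and the tiny sample system are hypothetical, and the parser assumes exactly the layout shown in this report.

```python
def summarize_msolve_input(text):
    """Return (variables, characteristic, polynomials) from msolve-style input text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # First line: comma-separated variable names.
    variables = [v.strip() for v in lines[0].split(",") if v.strip()]
    # Second line: field characteristic (0 means the rationals).
    characteristic = int(lines[1])
    # Remaining lines: polynomials separated by commas.
    body = "".join(lines[2:])
    polynomials = [p for p in body.split(",") if p]
    return variables, characteristic, polynomials


# Hypothetical two-variable system, written in the same layout as the input above.
sample = "a,b\n0\na-b,\na*b-1\n"
variables, characteristic, polynomials = summarize_msolve_input(sample)
print(len(variables), characteristic, len(polynomials))  # prints: 2 0 2
```

Applied to the input file of this report, the counts should agree with the `#variables 24` and `#equations 9` lines that msolve prints.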
Running msolve on this input ends with a segmentation fault:
$ msolve -f input -g 2 -v 2
--------------- INPUT DATA ---------------
#variables 24
#equations 9
#invalid equations 0
field characteristic 0
homogeneous input? 0
signature-based computation 0
monomial order DRL
basis hash table resetting OFF
linear algebra option 2
initial hash table size 131072 (2^17)
max pair selection ALL
reduce gb 1
#threads 1
info level 2
generate pbm files 0
------------------------------------------
Legend for f4 information
--------------------------------------------------------
deg current degree of pairs selected in this round
sel number of pairs selected in this round
pairs total number of pairs in pair list
mat matrix dimensions (# rows x # columns)
density density of the matrix
new data # new elements for basis in this round
# zero reductions during linear algebra
time(rd) time of the current f4 round in seconds given
for real and cpu time
--------------------------------------------------------
deg sel pairs mat density new data time(rd) in sec (real|cpu)
------------------------------------------------------------------------------------------------------
3 9 9 46 x 120 2.97% 9 new 0 zero 0.00 | 0.00
4 28 30 296 x 624 0.76% 8 new 20 zero 0.00 | 0.00
5 39 40 907 x 2285 0.29% 11 new 28 zero 0.00 | 0.00
6 62 66 3840 x 8331 0.08% 8 new 54 zero 0.01 | 0.01
7 56 59 9898 x 19603 0.04% 9 new 47 zero 0.01 | 0.01
8 68 68 30408 x 54067 0.02% 14 new 54 zero 0.05 | 0.05
9 100 103 102487 x 158680 0.01% 10 new 90 zero 0.19 | 0.19
10 80 80 164467 x 246105 0.00% 12 new 68 zero 0.34 | 0.34
11 89 91 393313 x 549256 0.00% 8 new 81 zero 0.92 | 0.91
12 58 60 459161 x 629834 0.00% 6 new 52 zero 1.03 | 1.03
13 44 51 747094 x 955833 0.00% 2 new 42 zero 1.78 | 1.78
14 16 22 231366 x 314556 0.00% 3 new 13 zero 0.48 | 0.48
15 22 31 1088698 x 1347614 0.00% 2 new 20 zero 2.82 | 2.82
16 17 24 1316137 x 1611550 0.00% 3 new 14 zero 3.41 | 3.41
17 24 28 3101371 x 3743654 0.00% 3 new 21 zero 9.45 | 9.45
18 27 28 4112873 x 4852959 0.00% 3 new 24 zero 13.49 | 13.49
19 22 22 3719714 x 4381914 0.00% 1 new 21 zero 11.49 | 11.49
20 7 7 2153194 x 2535385 0.00% 0 new 7 zero 6.20 | 6.20
------------------------------------------------------------------------------------------------------
reduce final basis 127 x 383158 0.82% 121 new 0 zero 0.26 | 0.26
------------------------------------------------------------------------------------------------------
---------------- TIMINGS ---------------
overall(elapsed) 51.92 sec
overall(cpu) 51.91 sec
select 0.54 sec 1.0%
symbolic prep. 36.30 sec 69.9%
update 0.00 sec 0.0%
convert 8.84 sec 17.0%
linear algebra 3.87 sec 7.5%
reduce gb 0.00 sec 0.0%
-----------------------------------------
---------- COMPUTATIONAL DATA -----------
size of basis 121
#terms in basis 399751
#pairs reduced 768
#GM criterion 6492
#redundant elements 0
#rows reduced 1657
#zero reductions 656
max. matrix data 4112873 x 4852959 (0.000%)
max. symbolic hash table size 2^23
max. basis hash table size 2^23
-----------------------------------------
Learning phase 0.00 Gops/sec
Erreur de segmentation (core dumped)
This is with the msolve currently in sagemath, that is, version 0.6.5. I was able to reproduce the same issue with msolve 0.9.1.
Many thanks. I just tried with v0.9.2 on my laptop (an Intel i7 running Ubuntu) and it worked perfectly well. Could you try with v0.9.2? Given the nature of the changes in v0.9.1, I don't think the problem is really solved in v0.9.2, but let us check that. If the problem persists, could you tell us more on your architecture?
I tested the example with v0.9.2 and v0.9.1 on ARM64, and the issue does not appear. msolve correctly computes the GB for me.
Could you try with v0.9.2?
I first tried to install the latest version (0.9.2), but I was not able to compile it. This is why I tried 0.9.1 instead, which is the one advertised on your website. I just created #239 to describe the compilation issue and keep that discussion out of this thread.
I have access to another machine (plafrim) on which I could install msolve-0.8.0 (with guix). I confirm that the issue does not appear on this other machine.
If the problem persists, could you tell us more on your architecture?
Here is the output of lscpu on the machine where the segmentation fault occurs. Sorry for the long output; I don't know which parts are of interest to you:
$ lscpu
Architecture : x86_64
Mode(s) opératoire(s) des processeurs : 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Boutisme : Little Endian
Processeur(s) : 8
Liste de processeur(s) en ligne : 0-7
Identifiant constructeur : GenuineIntel
Nom de modèle : Intel(R) Core(TM) i7-10610U CPU @ 1.80GHz
Famille de processeur : 6
Modèle : 142
Thread(s) par cœur : 2
Cœur(s) par socket : 4
Socket(s) : 1
Révision : 12
Vitesse maximale du processeur en MHz : 4900,0000
Vitesse minimale du processeur en MHz : 400,0000
BogoMIPS : 4599.93
Drapaux : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp
lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch
cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow
vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep
bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec
xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window
hwp_epp md_clear flush_l1d arch_capabilities
Virtualization features:
Virtualisation : VT-x
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 1 MiB (4 instances)
L3: 8 MiB (1 instance)
NUMA:
Nœud(s) NUMA : 1
Nœud NUMA 0 de processeur(s) : 0-7
Vulnerabilities:
Gather data sampling: Mitigation; Microcode
Indirect target selection: Mitigation; Aligned branch/return thunks
Itlb multihit: KVM: Mitigation: VMX disabled
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable
Reg file data sampling: Not affected
Retbleed: Mitigation; Enhanced IBRS
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; PBRSB-eIBRS SW
sequence; BHI SW loop, KVM SW loop
Srbds: Mitigation; Microcode
Tsx async abort: Mitigation; TSX disabled
Also, here is the output of htop:
0[|| 3.9%] 4[|| 3.9%]
1[|| 4.0%] 5[|| 3.3%]
2[|| 5.8%] 6[| 2.0%]
3[|| 3.3%] 7[||| 5.8%]
Mem[|||||||||||||||||||8.88G/15.3G] Tasks: 210, 1749 thr; 1 running
Swp[||||| 590M/4.00G] Load average: 0.21 0.34 0.41
Uptime: 10 days, 00:46:25
I will try again after a fresh reboot and report back. I have many tabs open in Firefox, and I don't know whether they are competing for memory.
After a fresh reboot, I retried and it fails in the same way. I watched the memory usage in the htop window.
In the first part of the computation, the memory usage went from ~1.5 G to ~3.5 G with no problem.
Then, during the "Learning phase", the memory usage grew to 4 G, then 5 G, then quickly 7 G, and then the program stopped with Erreur de segmentation (core dumped), that is, a segmentation fault.
Running top -o %MEM -c -d .5 during the execution shows the VIRT column reaching 24.4g just before the segmentation fault.
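To get a reproducible number instead of watching htop or top by hand, the peak memory of a run can be recorded from a small wrapper. The following is a minimal sketch, assuming Linux (where `ru_maxrss` is reported in KiB); the helper name `run_and_report_peak_rss` is hypothetical. It measures peak resident memory (RSS), not the VIRT column mentioned above.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd to completion; return (return code, peak RSS of waited children in KiB)."""
    proc = subprocess.run(cmd)
    # RUSAGE_CHILDREN covers child processes that have terminated and been
    # waited for; ru_maxrss is the largest resident set size among them.
    peak_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, peak_kib


# Stand-in child process for illustration; to measure the real run, replace it
# with ["msolve", "-f", "input", "-g", "2", "-v", "2"].
demo = [sys.executable, "-c", "buf = bytearray(b'x' * 50_000_000)"]
code, peak = run_and_report_peak_rss(demo)
print(f"return code {code}, peak RSS {peak / 1024:.1f} MiB")
```

On POSIX systems `subprocess` reports a child killed by a signal as a negative return code, so a run that ends in a segmentation fault would show up as -11 (SIGSEGV) on Linux.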
Many thanks. We had already identified that, on such computations over the rationals, we use more memory than needed. This is at the top of my to-do list, but it is unlikely that I can fix it in the next 10 days. Hopefully it will be fixed within a month. Meanwhile, if you could share what gdb returns, that would help.
Here is what I obtain with msolve 0.6.5 installed with sagemath. I needed to type "continue" once during the first phase of the computation, after gdb printed order.c: Aucun fichier ou dossier de ce nom ("No such file or directory"). The segmentation fault happens after memmove-vec-unaligned-erms.S: Aucun fichier ou dossier de ce nom.
Full output below.
$ msolve -f input -o output -g 2 -v
$ sudo gdb -p 262724
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 262724
Reading symbols from /home/slabbe/GitBox/sage/local/bin/msolve...
Reading symbols from /home/slabbe/GitBox/sage/local/lib/libneogb-0.6.5.so...
Reading symbols from /home/slabbe/GitBox/sage/local/lib/libflint.so.19...
Reading symbols from /lib/x86_64-linux-gnu/libgmp.so.10...
(No debugging symbols found in /lib/x86_64-linux-gnu/libgmp.so.10)
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...
Reading symbols from /usr/lib/debug/.build-id/a3/ad9bb40b4907e509e4404cb972645c19675ca3.debug...
Reading symbols from /lib/x86_64-linux-gnu/libgomp.so.1...
(No debugging symbols found in /lib/x86_64-linux-gnu/libgomp.so.1)
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/d5/197096f709801829b118af1b7cf6631efa2dcd.debug...
Reading symbols from /lib/x86_64-linux-gnu/libmpfr.so.6...
(No debugging symbols found in /lib/x86_64-linux-gnu/libmpfr.so.6)
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /usr/lib/debug/.build-id/9c/b53985768bb99f138f48655f7b8bf7e420d13d.debug...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f708377a2a2 in monomial_cmp_pivots_drl (ht=0x561aacc4dfa0, b=307210, a=307209) at /home/slabbe/GitBox/sage/local/var/tmp/sage/build/msolve-0.6.5/src/src/neogb/order.c:469
469 /home/slabbe/GitBox/sage/local/var/tmp/sage/build/msolve-0.6.5/src/src/neogb/order.c: Aucun fichier ou dossier de ce nom.
(gdb) continue
Continuing.
Program received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:429
429 ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Aucun fichier ou dossier de ce nom.
(gdb) continue
Continuing.
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb)
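The session above stops in __memmove_avx_unaligned_erms but does not show which msolve function called it. A backtrace taken at the moment of the SIGSEGV would give that information. As a sketch, the helper below (the name `gdb_backtrace_command` is hypothetical) builds a non-interactive gdb invocation from standard gdb options (`--batch`, `-ex`, `--args`) that runs the program and prints the call stack once it stops:

```python
import shlex


def gdb_backtrace_command(program_argv):
    """Build a gdb command line that runs the program and prints a backtrace when it stops."""
    return [
        "gdb", "--batch",         # exit once the listed commands have been executed
        "-ex", "run",             # start the program under gdb
        "-ex", "bt",              # after it stops (e.g. on SIGSEGV), print the call stack
        "--args", *program_argv,  # everything after --args is the program and its arguments
    ]


cmd = gdb_backtrace_command(["msolve", "-f", "input", "-g", "2", "-v", "2"])
print(shlex.join(cmd))
# prints: gdb --batch -ex run -ex bt --args msolve -f input -g 2 -v 2
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` directly. Starting msolve under gdb this way also avoids having to attach to a running process with `gdb -p`.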