SIGFPE in L-BFGS geometry optimizer

Open • foxtran opened this issue 1 year ago • 1 comment

Describe the bug

Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x1537ca6f6b4f in ???
#1  0xf3e494 in __xtb_david2_MOD_solver_sdavidson
	at xtb/src/david2.f90:340
#2  0xb6193f in __xtb_relaxation_engine_MOD_lbfgs_relax
	at xtb/src/relaxation_engine.f90:1136
#3  0xb70fd8 in __xtb_relaxation_engine_MOD_l_ancopt
	at xtb/src/relaxation_engine.f90:734
#4  0xa14e61 in __xtb_geoopt_MOD_geometry_optimization
	at xtb/src/geoopt_driver.f90:155
#5  0x426ad4 in __xtb_prog_main_MOD_xtbmain
	at xtb/src/prog/main.F90:842
#6  0x42d7c2 in xtb_prog_primary
	at xtb/src/prog/primary.f90:57
#7  0x42d863 in main
	at xtb/src/prog/primary.f90:20

To Reproduce

This happens with a locally modified version based on the current master branch, using the L-BFGS geometry optimizer. The modification uses smaller geometry optimization thresholds (tighter than extreme); these thresholds were not reached before the crash. Some atoms are fixed (~15% of all of them).

Expected behaviour

No SIGFPE.
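
To make "no SIGFPE" concrete: one generic way an optimizer can fail cleanly is to validate and cap each displacement before it is applied. The Python sketch below is only an illustration of that idea under my own assumptions. It is not xtb's code; `cap_step` and the `max_norm` value are hypothetical names and numbers, not taken from the xtb sources.

```python
import math


def cap_step(displacement, max_norm=0.3):
    """Return displacement rescaled so its Euclidean norm is at most max_norm.

    max_norm is a hypothetical cap in a0, not a value taken from xtb.
    Raises ValueError if any component is NaN or infinite, so the caller
    can stop with an error message instead of a floating-point exception.
    """
    if not all(math.isfinite(x) for x in displacement):
        raise ValueError("non-finite displacement")
    norm = math.sqrt(sum(x * x for x in displacement))
    if norm <= max_norm:
        # Step is already short enough; return an unchanged copy.
        return list(displacement)
    scale = max_norm / norm
    return [x * scale for x in displacement]
```

With such a check, a step like the 500 a0 one reported under "Additional context" would be shortened to the cap rather than applied as is.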

Additional context

The last few geometry optimization steps produce very large step lengths:

   gradient norm :     0.0006433 Eh/a0  converged?    E=F G=F D=F
   step length   :     0.0457340 a0
--
   gradient norm :     0.0004382 Eh/a0  converged?    E=F G=F D=F
   step length   :     0.0294820 a0
--
   gradient norm :     0.0004506 Eh/a0  converged?    E=F G=F D=F
   step length   :     0.0680663 a0
--
   gradient norm :     0.0005634 Eh/a0  converged?    E=F G=F D=F
   step length   :     0.0574646 a0
--
   gradient norm :     0.0005648 Eh/a0  converged?    E=F G=F D=F
   step length   :     0.0208301 a0

!!!!! Updated Hessian !!!!!
Using Lindh-Hessian (1995)
 Shifting diagonal of input Hessian by    1.0156336463987828E-002
 Lowest  eigenvalues of input Hessian
    0.010000    0.010156    0.010156    0.010156    0.010156    0.010156
    0.010156    0.010156    0.010156    0.010156    0.010156    0.010156
    0.010156    0.010156    0.010156    0.010156    0.010156    0.010156
 Highest eigenvalues
    2.349039    2.356476    2.365074    2.401702    2.465249    2.496497
!!!!

--
   gradient norm :     0.0138380 Eh/a0  converged?    E=F G=F D=F
   step length   :     0.0910316 a0
--
   gradient norm :     0.1517329 Eh/a0  converged?    E=F G=F D=F
   step length   :     1.6871844 a0
--
   gradient norm :     0.2867047 Eh/a0  converged?    E=F G=F D=F
   step length   :   500.2718670 a0
--
   gradient norm :     0.4277285 Eh/a0  converged?    E=F G=F D=F
   step length   :  2769.1668924 a0
--
   gradient norm :     0.5817691 Eh/a0  converged?    E=F G=F D=F
   step length   : 14982.8583466 a0
--
   gradient norm :     0.7541627 Eh/a0  converged?    E=F G=F D=F
   step length   : 76943.0922821 a0
--
   gradient norm :     0.9474460 Eh/a0  converged?    E=F G=F D=F
   step length   :372790.8534898 a0
--
   gradient norm :     1.1623631 Eh/a0  converged?    E=F G=F D=F
   step length   :************** a0
--
   gradient norm :     1.4004933 Eh/a0  converged?    E=F G=F D=F
   step length   :************** a0
--
   gradient norm :     1.6608298 Eh/a0  converged?    E=F G=F D=F
   step length   :************** a0
--
   gradient norm :     1.9448164 Eh/a0  converged?    E=F G=F D=F
   step length   :************** a0
--
   gradient norm :     2.2571612 Eh/a0  converged?    E=F G=F D=F
   step length   :************** a0

The Lindh Hessian was recomputed 5 times, but only the last recomputation produced this behaviour.
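
For anyone who wants to locate the onset of the divergence in a long output file, here is a small Python sketch that extracts the step lengths from lines in the format shown above. The function names and the 1.0 a0 cutoff are my own assumptions for illustration; fields that overflowed the Fortran format (printed as asterisks) are treated as infinite.

```python
import re

# Matches lines such as "   step length   :     0.0457340 a0".
# Overflowed Fortran fields ("**************") are captured as well.
STEP_RE = re.compile(r"step length\s*:\s*(\S+)\s*a0")


def parse_step_lengths(log_text):
    """Return step lengths in a0; overflowed fields become float('inf')."""
    steps = []
    for match in STEP_RE.finditer(log_text):
        field = match.group(1)
        steps.append(float("inf") if set(field) == {"*"} else float(field))
    return steps


def first_runaway_step(steps, max_step=1.0):
    """Index of the first step longer than max_step, or None.

    max_step is a hypothetical cutoff in a0 chosen for this sketch.
    """
    for index, step in enumerate(steps):
        if step > max_step:
            return index
    return None
```

Applied to the steps printed after the last Hessian update above, a 1.0 a0 cutoff flags the second step (1.6871844 a0), one step before the jump to about 500 a0.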

Mar 25 '25 00:03 foxtran

Problematic line:

https://github.com/grimme-lab/xtb/blob/41059b44d84a07dfd99a051e4409ef654faca939/src/david2.f90#L340

Mar 25 '25 00:03 foxtran