
Larger relative errors for some of the tests when compiling with LLVM-based Intel oneAPI compilers.

Open hongyi-zhao opened this issue 6 months ago • 4 comments

On Ubuntu 22.04.4 LTS, I compiled the wannier90 v3.1.0 release with Intel oneAPI 2023.2.0 and found that most of the parallel benchmark tests fail (only 26 out of 62 pass), as shown below:

werner@x13dai-t:~/wannier90-3.1.0$ make test-parallel 
(cd ./src/obj && make -f ../Makefile.2 serialobjs)
make[1]: Entering directory '/home/werner/wannier90-3.1.0/src/obj'
make[1]: Nothing to be done for 'serialobjs'.
make[1]: Leaving directory '/home/werner/wannier90-3.1.0/src/obj'
(cd ./src/obj && make -f ../Makefile.2 w90chk2chk)
make[1]: Entering directory '/home/werner/wannier90-3.1.0/src/obj'
make[1]: Nothing to be done for 'w90chk2chk'.
make[1]: Leaving directory '/home/werner/wannier90-3.1.0/src/obj'
(cd ./src/obj && make -f ../Makefile.2 wannier)
make[1]: Entering directory '/home/werner/wannier90-3.1.0/src/obj'
make[1]: Nothing to be done for 'wannier'.
make[1]: Leaving directory '/home/werner/wannier90-3.1.0/src/obj'
(cd ./src/objp && make -f ../Makefile.2 post)
make[1]: Entering directory '/home/werner/wannier90-3.1.0/src/objp'
make[1]: Nothing to be done for 'post'.
make[1]: Leaving directory '/home/werner/wannier90-3.1.0/src/objp'
(cd ./test-suite && ./run_tests --category=default --numprocs=4 )
Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........: 
MPID_Init(1548)..............: 
MPIDI_OFI_mpi_init_hook(1632): 
create_vni_context(2208).....: OFI endpoint open failed (ofi_init.c:2208:create_vni_context:Invalid argument)
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=1615247
:
system msg for write_line failure : Bad file descriptor
Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........: 
MPID_Init(1548)..............: 
MPIDI_OFI_mpi_init_hook(1632): 
create_vni_context(2208).....: OFI endpoint open failed (ofi_init.c:2208:create_vni_context:Invalid argument)
[unset]: write_line error; fd=-1 buf=:cmd=abort exitcode=1615247
:
system msg for write_line failure : Bad file descriptor
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
libc.so.6          000014CA3CC42520  Unknown               Unknown  Unknown
libmpi.so.12.0.0   000014CA3D412BE1  MPIR_Err_return_c     Unknown  Unknown
libmpi.so.12.0.0   000014CA3D5B8ED0  MPI_Init              Unknown  Unknown
libmpifort.so.12.  000014CA3ECE348B  PMPI_INIT             Unknown  Unknown
w90chk2chk.x       000000000054E22C  Unknown               Unknown  Unknown
w90chk2chk.x       0000000000405467  Unknown               Unknown  Unknown
w90chk2chk.x       000000000040540D  Unknown               Unknown  Unknown
libc.so.6          000014CA3CC29D90  Unknown               Unknown  Unknown
libc.so.6          000014CA3CC29E40  __libc_start_main     Unknown  Unknown
w90chk2chk.x       0000000000405325  Unknown               Unknown  Unknown
make[1]: *** [Makefile:6: silicon.chk] Error 174
[...]
*** WARNING!! make failed... continuing anyway, but your tests might fail
Using executable: /home/werner/wannier90-3.1.0/test-suite/tests/../../postw90.x.
Using executable: /home/werner/wannier90-3.1.0/test-suite/tests/../../wannier90.x.
Test id: 03082024.
Benchmark: default.

tests/testpostw90_boltzwann - silicon.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_example04_dos - copper.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_example04_pdos - copper.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_ahc - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_ahc_adaptandfermi - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kpathcurv - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kpathmorbcurv - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kslicecurv - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kslicemorb - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kubo_Axy - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kubo_Szz - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_kubo_jdos - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_morb - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_morbandahc - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_fe_spin - Fe.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_gaas_sc_xyz - gaas.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_gaas_shc - GaAs.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_pt_kpathbandsshc - Pt.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_pt_kpathshc - Pt.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_pt_ksliceshc - Pt.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_pt_shc - Pt.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_si_geninterp - silicon.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_si_geninterp_wsdistance - silicon.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic_C - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic_D0 - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic_Dw - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic_K - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic_NOA - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testpostw90_te_gyrotropic_dos - Te.win: **FAILED**.
Error running job.  Return code: 1

tests/testw90_basic1 - wannier.win: Passed.

tests/testw90_basic2 - wannier.win: Passed.

tests/testw90_benzene_gamma_val - benzene.win: **FAILED**.
Error running job.  Return code: 174

tests/testw90_benzene_gamma_val_hexcell - benzene.win: **FAILED**.
Error running job.  Return code: 174

tests/testw90_benzene_gamma_valcond - benzene.win: **FAILED**.
Error running job.  Return code: 174

tests/testw90_bvec - lead.win: Passed.

tests/testw90_cube_format - gaas.win: Passed.

tests/testw90_disentanglement_sawfs - H3S.win: Passed.

tests/testw90_example01 - gaas.win: Passed.

tests/testw90_example02 - lead.win: Passed.

tests/testw90_example02_restart - lead.win: **FAILED**.
Error running job.  Return code: 1

tests/testw90_example03 - silicon.win: Passed.

tests/testw90_example03_labelinfo - silicon.win: Passed.

tests/testw90_example03_optmem - silicon.win: Passed.

tests/testw90_example04 - copper.win: Passed.

tests/testw90_example05 - diamond.win: Passed.

tests/testw90_example07 - silane.win: **FAILED**.
Error running job.  Return code: 174

tests/testw90_example11_1 - silicon.win: Passed.

tests/testw90_example11_2 - silicon.win: Passed.

tests/testw90_example21_As_sp - GaAs.win: Passed.

tests/testw90_example26 - gaas.win: Passed.

tests/testw90_gaas_disentanglement_issue192 - gaas.win: Passed.

tests/testw90_lavo3_dissphere - LaVO3.win: Passed.

tests/testw90_na_chain_gamma - Na_chain.win: **FAILED**.
Error running job.  Return code: 174

tests/testw90_nnkpt1 - wannier.win (arg(s): -pp): Passed.

tests/testw90_nnkpt2 - wannier.win (arg(s): -pp): Passed.

tests/testw90_nnkpt3 - wannier.win (arg(s): -pp): Passed.

tests/testw90_nnkpt4 - wannier.win (arg(s): -pp): Passed.

tests/testw90_nnkpt5 - wannier.win: Passed.

tests/testw90_precond_1 - gaas1.win: Passed.

tests/testw90_precond_2 - gaas2.win: Passed.

tests/testw90_write_u_matrices - gaas.win: Passed.

All done. ERROR: only 26 out of 62 tests passed.
Failed tests in:
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_boltzwann/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_example04_dos/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_example04_pdos/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_ahc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_ahc_adaptandfermi/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kpathcurv/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kpathmorbcurv/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kslicecurv/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kslicemorb/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kubo_Axy/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kubo_Szz/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_kubo_jdos/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_morb/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_morbandahc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_fe_spin/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_gaas_sc_xyz/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_gaas_shc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_pt_kpathbandsshc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_pt_kpathshc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_pt_ksliceshc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_pt_shc/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_si_geninterp/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_si_geninterp_wsdistance/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic_C/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic_D0/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic_Dw/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic_K/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic_NOA/
	/home/werner/wannier90-3.1.0/test-suite/tests/testpostw90_te_gyrotropic_dos/
	/home/werner/wannier90-3.1.0/test-suite/tests/testw90_benzene_gamma_val/
	/home/werner/wannier90-3.1.0/test-suite/tests/testw90_benzene_gamma_val_hexcell/
	/home/werner/wannier90-3.1.0/test-suite/tests/testw90_benzene_gamma_valcond/
	/home/werner/wannier90-3.1.0/test-suite/tests/testw90_example02_restart
	/home/werner/wannier90-3.1.0/test-suite/tests/testw90_example07/
	/home/werner/wannier90-3.1.0/test-suite/tests/testw90_na_chain_gamma/
make: *** [Makefile:189: test-parallel] Error 1
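
To make the failure pattern easier to read, the `run_tests` output above can be grouped by return code. The following is only a minimal sketch, assuming the output has been saved as plain text; the function name `group_failures` and the two regular expressions are hypothetical helpers written against the line format visible in this log, and are not part of the wannier90 test suite.

```python
import re
from collections import defaultdict

# Result lines look like "tests/<name> - <input>: Passed." or
# "tests/<name> - <input>: **FAILED**." in the log above.
RESULT_RE = re.compile(r"^tests/(\S+) - .*: (Passed|\*\*FAILED\*\*)\.$")
# A failed test is followed by "Error running job.  Return code: <n>".
CODE_RE = re.compile(r"^Error running job\.\s+Return code: (\d+)")


def group_failures(log_text):
    """Map each return code to the list of failed test names (hypothetical helper)."""
    failures = defaultdict(list)
    last_failed = None
    for line in log_text.splitlines():
        line = line.strip()
        m = RESULT_RE.match(line)
        if m:
            # Remember the test name only if it failed.
            last_failed = m.group(1) if m.group(2) != "Passed" else None
            continue
        m = CODE_RE.match(line)
        if m and last_failed is not None:
            failures[int(m.group(1))].append(last_failed)
            last_failed = None
    return dict(failures)


# Small excerpt in the same format as the log above.
sample = """\
tests/testpostw90_boltzwann - silicon.win: **FAILED**.
Error running job.  Return code: 1

tests/testw90_basic1 - wannier.win: Passed.

tests/testw90_example07 - silane.win: **FAILED**.
Error running job.  Return code: 174
"""

for code, names in sorted(group_failures(sample).items()):
    print(code, len(names), names)
```

Applied to the full log, this kind of grouping separates the wannier90.x tests that exit with return code 174 (the same code as the `forrtl: severe (174)` segmentation fault in the trace) from the tests that exit with return code 1.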

Any hints or comments on this issue would be appreciated.

See https://github.com/wannier-developers/wannier90/issues/512 for the related discussion.

Regards, Zhao

hongyi-zhao, Aug 03 '24 13:08