Bug: Make Build Fails for PW-only Configuration Due to LCAO Dependencies

Open • jieli-matrix opened this issue 8 months ago • 1 comment

Describe the bug

We encountered issues when attempting to build a PW-only (Plane-Wave) version of ABACUS using the Make build system (Makefile, Makefile.vars, Makefile.Objects), specifically targeting a CUDA platform. While the equivalent CMake command (cmake -B build -DUSE_CUDA=ON -DENABLE_LCAO=OFF ...) works correctly, the Make build fails due to dependency issues related to LCAO components.

Could the developers please review the Make build configuration?

  1. The object list OBJS_ABACUS_PW in Makefile.Objects needs to be corrected to include all necessary non-LCAO modules (like container/OBJS_TENSOR, dftu/OBJS_DFTU, deltaspin/OBJS_DELTASPIN) required even for PW calculations.
  2. Code within files compiled for PW (like dos_nao.cpp) should be reviewed to ensure calls to LCAO-specific functions (like write_dos_lcao, nscf_fermi_surface) are properly guarded with #ifdef __LCAO preprocessor directives, so that they are compiled only when LCAO support is enabled and do not cause linking issues in PW-only builds.
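
To illustrate point 2, here is a minimal sketch of a call-site guard. The names write_dos_lcao_stub and out_dos_stub are hypothetical stand-ins, not the real ABACUS functions (whose signatures are much longer); the only thing taken from the issue is the __LCAO macro. The idea is that shared code references an LCAO-only symbol only when the build defines __LCAO.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; the real ABACUS functions
// (e.g. ModuleIO::write_dos_lcao) take many more arguments.
#ifdef __LCAO
// Exists only when the build defines __LCAO (LCAO support enabled).
std::string write_dos_lcao_stub() { return "LCAO DOS written"; }
#endif

// Shared code path that is compiled in both PW-only and LCAO builds.
std::string out_dos_stub()
{
#ifdef __LCAO
    // The LCAO-only symbol is referenced only when it is actually built.
    return write_dos_lcao_stub();
#else
    // PW-only build: no reference is emitted, so the linker has nothing
    // to resolve for the LCAO function.
    return "LCAO DOS skipped (PW-only build)";
#endif
}
```

Compiled without -D__LCAO, out_dos_stub() takes the second branch and the object file contains no reference to the LCAO-only function, which is the property the PW-only link step needs.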

Thank you!

Expected behavior

Configuration:

  • Makefile.vars was configured for CUDA (GPU=CUDA), MPI (CXX=mpicxx), required libraries (FFTW, OpenBLAS, ScaLAPACK based on system paths), and OpenMP.
  • In source/Makefile, the -D__LCAO and -D__ELPA flags within the HONG variable definition were commented out to disable LCAO features.
  • A conditional statement was added to source/Makefile to select OBJS_ABACUS_PW (defined in Makefile.Objects) as the source object list when -D__LCAO is not present in HONG.
  • The build was initiated using make (targeting the default abacus rule).

With this configuration, the Make build should complete successfully.

To Reproduce

  1. Initial Build Attempt: Failed during the linking stage with numerous undefined reference errors pointing to symbols within:

    • container::* (Tensor library components)
    • ModuleDFTU::* (DFT+U components)
    • spinconstrain::* (Spin Constrain components)
    • vdw::* (Van der Waals components)
    • Crucially, ModuleIO::write_dos_lcao and ModuleIO::nscf_fermi_surface, originating from the linking of obj/dos_nao.o.
  2. Attempt 1 (Adding Core Modules): We modified source/Makefile.Objects to add ${OBJS_TENSOR}, ${OBJS_DFTU}, ${OBJS_DELTASPIN}, and ${OBJS_VDW} to the OBJS_ABACUS_PW list. This resolved the undefined references for container, dftu, deltaspin, and vdw. However, the linking errors related to dos_nao.o (requiring write_dos_lcao and nscf_fermi_surface) persisted.

  3. Attempt 2 (Adding LCAO IO): To resolve the remaining linking errors from dos_nao.o, we added ${OBJS_IO_LCAO} to the OBJS_ABACUS_PW list.

    • Result: This successfully resolved the linking errors. However, it introduced new compilation errors. Files within OBJS_IO_LCAO (e.g., cal_r_overlap_R.cpp, write_dos_lcao.cpp, unk_overlap_lcao.cpp) failed to compile because they reference LCAO-specific members (like ucell.infoNL) or classes (like hamilt::HamiltLCAO) which are naturally unavailable or invalid in a PW-only build context (i.e., when __LCAO is not defined).
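
The compilation failure in Attempt 2 can be pictured with a small sketch, under the assumption that the LCAO-only members are themselves hidden behind __LCAO. UnitCellStub, nproj_lcao, count_lcao_projectors, and build_flavor are hypothetical names invented for this illustration. A definition that touches such a member compiles only if it sits inside the same guard, so simply adding the LCAO sources to a PW-only object list cannot work unless those sources are guarded as well.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of a class whose LCAO-only member disappears
// when __LCAO is not defined (the issue names ucell.infoNL as one such member).
struct UnitCellStub
{
    int ntype = 2;
#ifdef __LCAO
    int nproj_lcao = 4; // hypothetical LCAO-only member
#endif
};

#ifdef __LCAO
// The whole LCAO-only definition is guarded. Without this guard, a PW-only
// build would fail to compile because nproj_lcao does not exist.
int count_lcao_projectors(const UnitCellStub& ucell) { return ucell.nproj_lcao; }
#endif

// Reports which flavor this translation unit was compiled as.
std::string build_flavor()
{
#ifdef __LCAO
    return "LCAO+PW";
#else
    return "PW-only";
#endif
}
```

In a PW-only build the guarded function is simply absent, so any caller has to be guarded in the same way, or the corresponding source files have to stay out of OBJS_ABACUS_PW.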

Environment

  • OS: Ubuntu 22.04
  • Compiler: g++, nvcc
  • Dependencies: FFTW, OPENBLAS, SCALAPACK

Additional Context

  1. make log in Attempt 1:
 b/x86_64-linux-gnu -o  ../bin/ABACUS.serial
/usr/bin/ld: obj/H_TDDFT_pw.o: warning: relocation against `_ZN12module_tddft11Evolve_elec17td_vext_dire_caseE' in read-only section `.text'
/usr/bin/ld: obj/H_TDDFT_pw.o: in function `elecstate::H_TDDFT_pw::cal_fixed_v(double*)':
H_TDDFT_pw.cpp:(.text+0x147a): undefined reference to `module_tddft::Evolve_elec::td_vext'
/usr/bin/ld: H_TDDFT_pw.cpp:(.text+0x15dd): undefined reference to `module_tddft::Evolve_elec::td_vext_dire_case'
/usr/bin/ld: H_TDDFT_pw.cpp:(.text+0x15e4): undefined reference to `module_tddft::Evolve_elec::td_vext_dire_case'
/usr/bin/ld: H_TDDFT_pw.cpp:(.text+0x1716): undefined reference to `module_tddft::Evolve_elec::out_efield'
/usr/bin/ld: obj/H_TDDFT_pw.o: in function `elecstate::H_TDDFT_pw::update_At()':
H_TDDFT_pw.cpp:(.text+0x2138): undefined reference to `module_tddft::Evolve_elec::td_vext'
/usr/bin/ld: H_TDDFT_pw.cpp:(.text+0x216d): undefined reference to `module_tddft::Evolve_elec::td_vext_dire_case'
/usr/bin/ld: H_TDDFT_pw.cpp:(.text+0x2174): undefined reference to `module_tddft::Evolve_elec::td_vext_dire_case'
/usr/bin/ld: H_TDDFT_pw.cpp:(.text+0x23a0): undefined reference to `module_tddft::Evolve_elec::out_efield'
/usr/bin/ld: obj/esolver_sdft_pw.o: in function `ModuleESolver::ESolver_SDFT_PW<std::complex<double>, base_device::DEVICE_CPU>::after_all_runners(UnitCell&)':
esolver_sdft_pw.cpp:(.text+0x1b0): undefined reference to `ModuleIO::write_istate_info(ModuleBase::matrix const&, ModuleBase::matrix const&, K_Vectors const&, Parallel_Kpoints const*)'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::~toQO()':
to_qo_kernel.cpp:(.text+0x189b): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x18b7): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::calculate_ovlpR(int)':
to_qo_kernel.cpp:(.text+0x22d2): undefined reference to `TwoCenterIntegrator::calculate(int, int, int, int, int, int, int, int, ModuleBase::Vector3<double> const&, double*, double*) const'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::build_nao(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, int)':
to_qo_kernel.cpp:(.text+0x4535): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x4719): undefined reference to `RadialCollection::build(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, char)'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x4755): undefined reference to `RadialCollection::set_transformer(ModuleBase::SphericalBesselTransformer, int)'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::build_hydrogen(int, double const*, bool, int const*, double, int)':
to_qo_kernel.cpp:(.text+0x4de7): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x4e1c): undefined reference to `RadialCollection::build(int, double const*, bool, int const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, double, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, int const&)'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x4e56): undefined reference to `RadialCollection::set_transformer(ModuleBase::SphericalBesselTransformer, int)'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::build_pswfc(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, double const*, double, int)':
to_qo_kernel.cpp:(.text+0x5115): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x52ff): undefined reference to `RadialCollection::build(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const*, double const*, double, int const&)'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x5340): undefined reference to `RadialCollection::set_transformer(ModuleBase::SphericalBesselTransformer, int)'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::build_szv()':
to_qo_kernel.cpp:(.text+0x5714): undefined reference to `RadialCollection::RadialCollection(RadialCollection const&)'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x572f): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x5775): undefined reference to `RadialCollection::set_transformer(ModuleBase::SphericalBesselTransformer, int)'
/usr/bin/ld: obj/to_qo_kernel.o: in function `toQO::initialize(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, UnitCell const*, std::vector<ModuleBase::Vector3<double>, std::allocator<ModuleBase::Vector3<double> > > const&, std::basic_ofstream<char, std::char_traits<char> >&, int const&, int const&)':
to_qo_kernel.cpp:(.text+0x6a4a): undefined reference to `TwoCenterIntegrator::TwoCenterIntegrator()'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x6c72): undefined reference to `RadialCollection::set_uniform_grid(bool, int, double, char, bool)'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x6c98): undefined reference to `RadialCollection::set_uniform_grid(bool, int, double, char, bool)'
/usr/bin/ld: to_qo_kernel.cpp:(.text+0x6cc2): undefined reference to `TwoCenterIntegrator::tabulate(RadialCollection const&, RadialCollection const&, char, int, double)'
/usr/bin/ld: obj/to_qo_kernel.o: in function `std::unique_ptr<RadialCollection, std::default_delete<RadialCollection> >::~unique_ptr()':
to_qo_kernel.cpp:(.text._ZNSt10unique_ptrI16RadialCollectionSt14default_deleteIS0_EED2Ev[_ZNSt10unique_ptrI16RadialCollectionSt14default_deleteIS0_EED5Ev]+0x11): undefined reference to `RadialCollection::~RadialCollection()'
/usr/bin/ld: obj/output_mat_sparse.o: in function `void ModuleIO::output_mat_sparse<std::complex<double> >(bool const&, bool const&, bool const&, bool const&, int const&, ModuleBase::matrix const&, Parallel_Orbitals const&, Gint_k&, TwoCenterBundle const&, LCAO_Orbitals const&, UnitCell&, Grid_Driver const&, K_Vectors const&, hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*)':
output_mat_sparse.cpp:(.text+0x479): undefined reference to `cal_r_overlap_R::cal_r_overlap_R()'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x495): undefined reference to `cal_r_overlap_R::init(UnitCell const&, Parallel_Orbitals const&, LCAO_Orbitals const&)'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x4b3): undefined reference to `cal_r_overlap_R::out_rR(UnitCell const&, Grid_Driver const&, int const&)'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x4bf): undefined reference to `cal_r_overlap_R::~cal_r_overlap_R()'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x505): undefined reference to `cal_r_overlap_R::out_rR_other(UnitCell const&, int const&, std::set<Abfs::Vector3_Order<int>, std::less<Abfs::Vector3_Order<int> >, std::allocator<Abfs::Vector3_Order<int> > > const&)'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x568): undefined reference to `ModuleIO::output_dHR(int const&, ModuleBase::matrix const&, Gint_k&, UnitCell const&, Parallel_Orbitals const&, LCAO_HS_Arrays&, Grid_Driver const&, TwoCenterBundle const&, LCAO_Orbitals const&, K_Vectors const&, bool const&, double const&)'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x64d): undefined reference to `ModuleIO::output_TR(int, UnitCell const&, Parallel_Orbitals const&, LCAO_HS_Arrays&, Grid_Driver const&, TwoCenterBundle const&, LCAO_Orbitals const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool const&, double const&)'
/usr/bin/ld: output_mat_sparse.cpp:(.text+0x87f): undefined reference to `ModuleIO::output_HSR(UnitCell const&, int const&, ModuleBase::matrix const&, Parallel_Orbitals const&, LCAO_HS_Arrays&, Grid_Driver const&, K_Vectors const&, hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool const&, double const&)'
/usr/bin/ld: obj/output_mat_sparse.o: in function `void ModuleIO::output_mat_sparse<std::complex<double> >(bool const&, bool const&, bool const&, bool const&, int const&, ModuleBase::matrix const&, Parallel_Orbitals const&, Gint_k&, TwoCenterBundle const&, LCAO_Orbitals const&, UnitCell&, Grid_Driver const&, K_Vectors const&, hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*) [clone .cold]':
output_mat_sparse.cpp:(.text.unlikely+0x2e): undefined reference to `cal_r_overlap_R::~cal_r_overlap_R()'
/usr/bin/ld: obj/esolver_ks_pw.o: in function `ModuleESolver::ESolver_KS_PW<std::complex<float>, base_device::DEVICE_CPU>::after_all_runners(UnitCell&)':
esolver_ks_pw.cpp:(.text._ZN13ModuleESolver13ESolver_KS_PWISt7complexIfEN11base_device10DEVICE_CPUEE17after_all_runnersER8UnitCell[_ZN13ModuleESolver13ESolver_KS_PWISt7complexIfEN11base_device10DEVICE_CPUEE17after_all_runnersER8UnitCell]+0x213): undefined reference to `ModuleIO::write_istate_info(ModuleBase::matrix const&, ModuleBase::matrix const&, K_Vectors const&, Parallel_Kpoints const*)'
/usr/bin/ld: obj/esolver_ks_pw.o: in function `ModuleESolver::ESolver_KS_PW<std::complex<double>, base_device::DEVICE_CPU>::after_all_runners(UnitCell&)':
esolver_ks_pw.cpp:(.text._ZN13ModuleESolver13ESolver_KS_PWISt7complexIdEN11base_device10DEVICE_CPUEE17after_all_runnersER8UnitCell[_ZN13ModuleESolver13ESolver_KS_PWISt7complexIdEN11base_device10DEVICE_CPUEE17after_all_runnersER8UnitCell]+0x213): undefined reference to `ModuleIO::write_istate_info(ModuleBase::matrix const&, ModuleBase::matrix const&, K_Vectors const&, Parallel_Kpoints const*)'
/usr/bin/ld: obj/dos_nao.o: in function `void ModuleIO::out_dos_nao<double>(psi::Psi<double, base_device::DEVICE_CPU> const*, Parallel_Orbitals const&, ModuleBase::matrix const&, ModuleBase::matrix const&, double const&, double const&, double const&, K_Vectors const&, Parallel_Kpoints const&, UnitCell const&, elecstate::efermi const&, int, hamilt::Hamilt<double, base_device::DEVICE_CPU>*)':
dos_nao.cpp:(.text._ZN8ModuleIO11out_dos_naoIdEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESF_RKdSH_SH_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS3_S5_EE[_ZN8ModuleIO11out_dos_naoIdEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESF_RKdSH_SH_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS3_S5_EE]+0x1be): undefined reference to `void ModuleIO::write_dos_lcao<double>(UnitCell const&, psi::Psi<double, base_device::DEVICE_CPU> const*, Parallel_Orbitals const&, ModuleBase::matrix const&, ModuleBase::matrix const&, double const&, double const&, double const&, K_Vectors const&, hamilt::Hamilt<double, base_device::DEVICE_CPU>*)'
/usr/bin/ld: dos_nao.cpp:(.text._ZN8ModuleIO11out_dos_naoIdEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESF_RKdSH_SH_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS3_S5_EE[_ZN8ModuleIO11out_dos_naoIdEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESF_RKdSH_SH_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS3_S5_EE]+0x51c): undefined reference to `ModuleIO::nscf_fermi_surface(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int const&, double const&, K_Vectors const&, Parallel_Kpoints const&, UnitCell const&, ModuleBase::matrix const&)'
/usr/bin/ld: obj/dos_nao.o: in function `void ModuleIO::out_dos_nao<std::complex<double> >(psi::Psi<std::complex<double>, base_device::DEVICE_CPU> const*, Parallel_Orbitals const&, ModuleBase::matrix const&, ModuleBase::matrix const&, double const&, double const&, double const&, K_Vectors const&, Parallel_Kpoints const&, UnitCell const&, elecstate::efermi const&, int, hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*)':
dos_nao.cpp:(.text._ZN8ModuleIO11out_dos_naoISt7complexIdEEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESH_RKdSJ_SJ_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS5_S7_EE[_ZN8ModuleIO11out_dos_naoISt7complexIdEEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESH_RKdSJ_SJ_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS5_S7_EE]+0x1be): undefined reference to `void ModuleIO::write_dos_lcao<std::complex<double> >(UnitCell const&, psi::Psi<std::complex<double>, base_device::DEVICE_CPU> const*, Parallel_Orbitals const&, ModuleBase::matrix const&, ModuleBase::matrix const&, double const&, double const&, double const&, K_Vectors const&, hamilt::Hamilt<std::complex<double>, base_device::DEVICE_CPU>*)'
/usr/bin/ld: dos_nao.cpp:(.text._ZN8ModuleIO11out_dos_naoISt7complexIdEEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESH_RKdSJ_SJ_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS5_S7_EE[_ZN8ModuleIO11out_dos_naoISt7complexIdEEEvPKN3psi3PsiIT_N11base_device10DEVICE_CPUEEERK17Parallel_OrbitalsRKN10ModuleBase6matrixESH_RKdSJ_SJ_RK9K_VectorsRK16Parallel_KpointsRK8UnitCellRKN9elecstate6efermiEiPN6hamilt6HamiltIS5_S7_EE]+0x51c): undefined reference to `ModuleIO::nscf_fermi_surface(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int const&, double const&, K_Vectors const&, Parallel_Kpoints const&, UnitCell const&, ModuleBase::matrix const&)'
/usr/bin/ld: obj/onsite_projector.o: in function `projectors::OnsiteProjector<double, base_device::DEVICE_CPU>::init_proj(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, std::vector<double, std::allocator<double> > const&)':
onsite_projector.cpp:(.text._ZN10projectors15OnsiteProjectorIdN11base_device10DEVICE_CPUEE9init_projERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt6vectorIS9_SaIS9_EERKSC_IiSaIiEESK_SK_RKSC_IdSaIdEE[_ZN10projectors15OnsiteProjectorIdN11base_device10DEVICE_CPUEE9init_projERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt6vectorIS9_SaIS9_EERKSC_IiSaIiEESK_SK_RKSC_IdSaIdEE]+0x942): undefined reference to `smoothgen(int, double const*, double const*, double, std::vector<double, std::allocator<double> >&)'
/usr/bin/ld: warning: creating DT_TEXTREL in a PIE
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:234: ../bin/ABACUS.serial] Error 1
make[1]: Leaving directory '/root/abacus-develop/source'
make: *** [Makefile:222: abacus] Error 2
  2. make log in Attempt 2: (attached as an image)

Task list for Issue attackers (only for developers)

  • [ ] Verify the issue is not a duplicate.
  • [ ] Describe the bug.
  • [ ] Steps to reproduce.
  • [ ] Expected behavior.
  • [ ] Error message.
  • [ ] Environment details.
  • [ ] Additional context.
  • [ ] Assign a priority level (low, medium, high, urgent).
  • [ ] Assign the issue to a team member.
  • [ ] Label the issue with relevant tags.
  • [ ] Identify possible related issues.
  • [ ] Create a unit test or automated test to reproduce the bug (if applicable).
  • [ ] Fix the bug.
  • [ ] Test the fix.
  • [ ] Update documentation (if necessary).
  • [ ] Close the issue and inform the reporter (if applicable).

jieli-matrix avatar Apr 28 '25 12:04 jieli-matrix

Thanks! It is strange that we don't compile this version in our CI/CD.

mohanchen avatar Apr 28 '25 13:04 mohanchen