Bug: Please enable floatfftw in the cmake to use float fft when precision is set to single using GPU

Open OutisLi opened this issue 8 months ago • 0 comments

Describe the bug

I used the following INPUT file to test ABACUS built from source:

INPUT_PARAMETERS
#Parameters	(General)
calculation             scf
pseudo_dir		.
orbital_dir 		.
ntype 1
#Parameters (Accuracy)
symmetry                1
basis_type		pw	# PW; LCAO in pw; LCAO
ecutwfc			100	# energy cutoff for wave functions
scf_nmax		100
scf_thr 1e-6 #1e-5
kspacing 0.05
cal_force 1
cal_stress 1
smearing_method  gauss
smearing_sigma 0.015
mixing_type broyden
mixing_beta 0.3 

ks_solver bpcg

device gpu
precision single

The calculation does not run; it aborts with the error: Please enable floatfftw in the cmake to use float fft

Expected behavior

No response

To Reproduce

I built ABACUS from source using the toolchain. The dependency install script, toolchain_gnu.sh:

#!/bin/bash
#SBATCH -J install
#SBATCH -N 1
#SBATCH -n 16
#SBATCH -o compile.log
#SBATCH -e compile.err

# JamesMisaka in 2023-09-16
# install abacus dependency by gnu-toolchain
# one can use mpich or openmpi.
# openmpi will be faster, but not compatible in some cases.
# libtorch and libnpy are for deepks support, which can be =no
# if you want to run EXX calculation, you should set --with-libri=install
# mpich (and intel toolchain) is recommended for EXX support
# gpu-lcao supporting modify: CUDA_PATH and --enable-cuda
# export CUDA_PATH=/usr/local/cuda

./install_abacus_toolchain.sh \
--with-gcc=system \
--with-intel=no \
--with-openblas=install \
--with-openmpi=install \
--with-cmake=install \
--with-scalapack=install \
--with-libxc=install \
--with-fftw=install \
--with-elpa=install \
--with-cereal=install \
--with-rapidjson=install \
--with-libtorch=install \
--with-libnpy=install \
--with-libri=install \
--with-libcomm=install \
--with-4th-openmpi=no \
--enable-cuda \
--gpu-ver=120 \
| tee compile.log
# to use openmpi-version4: set --with-4th-openmpi=yes
# to enable gpu-lcao, add the following lines:
# --enable-cuda \
# --gpu-ver=75 \ 
# one should check your gpu compute capability number 
# and use it in --gpu-ver

and the build script, build_abacus_gnu.sh:

#!/bin/bash
#SBATCH -J build
#SBATCH -N 1
#SBATCH -n 16
#SBATCH -o install.log
#SBATCH -e install.err
# JamesMisaka in 2025.03.09

# Build ABACUS by gnu-toolchain

# module load openmpi

ABACUS_DIR=..
TOOL=$(pwd)
INSTALL_DIR=$TOOL/install
source $INSTALL_DIR/setup
cd $ABACUS_DIR
ABACUS_DIR=$(pwd)

BUILD_DIR=build_abacus_gnu
rm -rf $BUILD_DIR

PREFIX=$ABACUS_DIR
LAPACK=$INSTALL_DIR/openblas-0.3.28/lib
SCALAPACK=$INSTALL_DIR/scalapack-2.2.1/lib
ELPA=$INSTALL_DIR/elpa-2025.01.001/nvidia # for gpu-lcao
FFTW3=$INSTALL_DIR/fftw-3.3.10
CEREAL=$INSTALL_DIR/cereal-1.3.2/include/cereal
LIBXC=$INSTALL_DIR/libxc-7.0.0
RAPIDJSON=$INSTALL_DIR/rapidjson-1.1.0/
LIBRI=$INSTALL_DIR/LibRI-0.2.1.0
LIBCOMM=$INSTALL_DIR/LibComm-0.1.1
LIBTORCH=$INSTALL_DIR/libtorch-2.1.2/share/cmake/Torch
LIBNPY=$INSTALL_DIR/libnpy-1.0.1/include
# DEEPMD=$HOME/apps/anaconda3/envs/deepmd # v3.0 might have problem

cmake -B $BUILD_DIR -DCMAKE_INSTALL_PREFIX=$PREFIX \
        -DCMAKE_CXX_COMPILER=g++ \
        -DMPI_CXX_COMPILER=mpicxx \
        -DLAPACK_DIR=$LAPACK \
        -DSCALAPACK_DIR=$SCALAPACK \
        -DELPA_DIR=$ELPA \
        -DFFTW3_DIR=$FFTW3 \
        -DCEREAL_INCLUDE_DIR=$CEREAL \
        -DLibxc_DIR=$LIBXC \
        -DENABLE_LCAO=ON \
        -DENABLE_LIBXC=ON \
        -DUSE_OPENMP=ON \
        -DUSE_ELPA=ON \
        -DENABLE_RAPIDJSON=ON \
        -DRapidJSON_DIR=$RAPIDJSON \
        -DUSE_CUDA=ON \
        -DENABLE_DEEPKS=1 \
        -DTorch_DIR=$LIBTORCH \
        -Dlibnpy_INCLUDE_DIR=$LIBNPY \
        -DENABLE_LIBRI=ON \
        -DLIBRI_DIR=$LIBRI \
        -DLIBCOMM_DIR=$LIBCOMM \
# 	      -DDeePMD_DIR=$DEEPMD \
        #-DENABLE_CUSOLVERMP=ON \
        #-D CAL_CUSOLVERMP_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/2x.xx/math_libs/1x.x/targets/x86_64-linux/lib

# # add mkl env for libtorch to link
# if one want to install libtorch, mkl should be load in build process
# for -lmkl when load libtorch
# module load mkl

# if one want's to include deepmd, your system gcc version should be >= 11.3.0 for glibc requirements

cmake --build $BUILD_DIR -j `nproc` 
cmake --install $BUILD_DIR 2>/dev/null

# generate abacus_env.sh
cat << EOF > "${TOOL}/abacus_env.sh"
#!/bin/bash
source $INSTALL_DIR/setup
export PATH="${PREFIX}/bin":\${PATH}
EOF

# generate information
cat << EOF
========================== usage =========================
Done!
To use the installed ABACUS version
You need to source ${TOOL}/abacus_env.sh first !
"""
EOF

Environment

OS: Ubuntu 24.04.2 LTS on Windows 10 x86_64
CPU: AMD Ryzen 9 7950X3D (32) @ 4.192GHz
GPU: 2950:00:00.0 Microsoft Corporation Basic Render Driver (5090D on Windows)

Additional Context

If precision is set to double (the default), the program runs normally. Additionally, when I install ABACUS from conda-forge, single precision does run, but around the 10th or 11th loop the RAM usage rises suddenly (normally about 4 GB, then above the RAM limit within a few seconds) and the process is killed by the system.
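
A quick check that may help narrow this down (a sketch only, not verified against this build): the error suggests the binary was compiled without single-precision FFTW support. The helper below, has_float_fftw, is a hypothetical name and not part of the ABACUS toolchain; it only tests whether a libfftw3f library exists under the FFTW prefix used in build_abacus_gnu.sh. FFTW builds libfftw3f only when it is configured with --enable-float. Whether the toolchain does that, and whether the matching ABACUS CMake switch is -DENABLE_FLOAT_FFTW=ON, are assumptions here and should be confirmed against the ABACUS documentation.

```shell
#!/bin/bash
# Hypothetical helper (not part of the ABACUS toolchain): returns 0 if a
# single-precision FFTW library (libfftw3f.*) exists under the given prefix.
has_float_fftw() {
    prefix="$1"
    for dir in "$prefix/lib" "$prefix/lib64"; do
        for f in "$dir"/libfftw3f.*; do
            # If the glob matched nothing, $f is the literal pattern and -e fails
            if [ -e "$f" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Default mirrors FFTW3=$INSTALL_DIR/fftw-3.3.10 from build_abacus_gnu.sh,
# assuming the script is run from the toolchain directory.
FFTW3="${FFTW3:-$(pwd)/install/fftw-3.3.10}"

if has_float_fftw "$FFTW3"; then
    echo "libfftw3f found under $FFTW3"
    # Assumption: the ABACUS CMake option is ENABLE_FLOAT_FFTW; if so, adding
    # -DENABLE_FLOAT_FFTW=ON to the cmake command would be the next thing to try.
else
    echo "libfftw3f not found under $FFTW3"
    # FFTW would need to be rebuilt with --enable-float to provide it.
fi
```

If the library is missing, the float FFT path cannot be linked regardless of the CMake options, so this check separates a dependency problem from a configuration problem.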

Task list for Issue attackers (only for developers)

[ ] Verify the issue is not a duplicate.
[ ] Describe the bug.
[ ] Steps to reproduce.
[ ] Expected behavior.
[ ] Error message.
[ ] Environment details.
[ ] Additional context.
[ ] Assign a priority level (low, medium, high, urgent).
[ ] Assign the issue to a team member.
[ ] Label the issue with relevant tags.
[ ] Identify possible related issues.
[ ] Create a unit test or automated test to reproduce the bug (if applicable).
[ ] Fix the bug.
[ ] Test the fix.
[ ] Update documentation (if necessary).
[ ] Close the issue and inform the reporter (if applicable).

Apr 04 '25 04:04 OutisLi