Cannot run Fortran test program while cross compiling

Open hurricane642 opened this issue 1 year ago • 8 comments

Describe the bug

Dear developers, I am trying to use the NVIDIA compilers to build a Docker image for two platforms, amd64 and arm64. The build is done on Apple Silicon (arm64):

# Use nvidia/nvhpc as base image
FROM nvcr.io/nvidia/nvhpc:23.7-devel-cuda12.2-ubuntu22.04

# Install of utilities and libraries
RUN apt-get update && apt-get install -y \
    build-essential \
    wget \
    unzip \
    vim \
    curl

# Install of Git, OpenSSH, python, pip and make
RUN apt-get update && apt-get install -y git openssh-client make  python3-pip

#install hdf5
WORKDIR /opt
RUN wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.14/hdf5-1.14.2/src/hdf5-1.14.2.tar.gz && \
    tar -xzf hdf5-1.14.2.tar.gz && \
    cd hdf5-1.14.2 && \
    CC=nvc CXX=nvc++ FC=nvfortran F90=nvfortran ./configure \
    --prefix=/opt/hdf5 --enable-fortran --enable-shared \
    CFLAGS="-O3 -fPIC" FFLAGS="-O3 -fPIC" CXXFLAGS="-O3 -fPIC" FCFLAGS="-O3 -fPIC" --host=x86_64-linux-gnu && \
    make -j 8 && \
    make install && \
    rm ../hdf5-1.14.2.tar.gz

As a result, I get the following error at the configure step:

checking maximum decimal precision for C... configure: error: in `/opt/hdf5-1.14.2':
configure: error: cannot run test program while cross compiling
See `config.log' for more details

The error occurs only for the amd64 platform; everything works fine for arm64. As far as I know, work is currently being done to make HDF5 cross-compilation easier (for example #3104), so maybe you have some useful information regarding my question? Any help would be appreciated!
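
The configure message above points to config.log for details. A minimal sketch for pulling the failing check out of that file, assuming configure records its fatal errors as lines of the form `configure:<line>: error: ...`; `show_configure_errors` is a hypothetical helper, not part of HDF5 or Autoconf:

```shell
# Print each fatal error recorded by configure, with the lines leading up to it.
# $1: path to config.log (default: ./config.log)
# $2: number of context lines to show before each error (default: 15)
show_configure_errors() {
    log="${1:-config.log}"
    ctx="${2:-15}"
    # Fail early with a distinct status if the log is missing or unreadable.
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # -n prefixes line numbers; -B prints the preceding context lines.
    grep -n -B "$ctx" -E '^configure:[0-9]+: error:' "$log"
}
```

Run from the build directory (here /opt/hdf5-1.14.2) as `show_configure_errors config.log 30`, this should show the test program and compiler output that preceded the error, which is usually less to read than the whole file.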

Platform (please complete the following information)

  • HDF5 version (if building from a maintenance branch, please include the commit hash): hdf-1.14.2
  • OS and version: ubuntu22.04
  • Compiler and version: nvc, nvfortran, nvc++, version 23.7
  • Docker command: docker buildx build -f Dockerfile_nvfortran --platform linux/arm64,linux/amd64 --no-cache -t usr/pkg:nvfortran --push .

hurricane642 avatar Aug 23 '23 21:08 hurricane642

@hurricane642 , thank you for trying the latest NVIDIA compilers!

I tried them today and made them work for Ubuntu GitHub Actions using this workflow. Please pay attention to the configure line (e.g., there is no CFLAGS=-fPIC).

I hope this helps you to debug the cross-compilation issue.

hyoklee avatar Aug 26 '23 00:08 hyoklee

Dear @hyoklee , Thanks a lot for your suggestion!

Unfortunately, it doesn't quite fit my case. The problem is that I need to use the nvc compiler (I need to compile code that will run on the CPU). As far as I understand from https://docs.nvidia.com/hpc-sdk//index.html nvcc is designed for code that runs on the GPU. Even so, I tried to run the build with nvcc, but I got an error:

2124.8 checking for Fortran name-mangling scheme... lower case, underscore, no extra underscore
2131.7 checking if Fortran compiler supports intrinsic SIZEOF... yes
2133.3 checking if Fortran compiler supports intrinsic C_SIZEOF... yes
2134.9 checking if Fortran compiler supports intrinsic STORAGE_SIZE... yes
2136.5 checking if Fortran compiler supports intrinsic module ISO_FORTRAN_ENV... yes
2140.7 Error
2140.7 configure: error: Failed to run Fortran program to determine available KINDs

I haven't fully understood what is causing this error, but my guess is that during the build it tries to detect CUDA on the device it is building on and, since I am running on Apple Silicon, it does not find it.

Could you please advise what can be done in this case?

hurricane642 avatar Aug 28 '23 23:08 hurricane642

@hurricane642 , thank you for your link to NVIDIA website and for mentioning nvc! I made nvc work for both Fortran and Parallel. Please see my PR #3509. I hope my PR can help you to build & test HDF5 for arm64 using NVHPC.

hyoklee avatar Sep 06 '23 03:09 hyoklee

The cross-compile changes we've made to the library are mainly for the C library. Fortran still runs programs at configure time. I'll see if we can engineer around that in time for 1.14.3.

In the meantime, can you check if the C library builds for you without issues? Any cross-compile data points we can get are very useful.

derobins avatar Oct 14 '23 21:10 derobins

Hey, I apologize for not responding for so long. I'll be ready to test version 1.14.3 once it is available at https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.14/ and, in the meantime, I checked the build without the --enable-fortran option. The result:

  20 |     WORKDIR /opt
  21 | >>> RUN wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.14/hdf5-1.14.2/src/hdf5-1.14.2.tar.gz && \
  22 | >>>     tar -xzf hdf5-1.14.2.tar.gz && \
  23 | >>>     cd hdf5-1.14.2 && \
  24 | >>>     ./autogen.sh && \
  25 | >>>     CC=nvc CXX=nvc++ FC=nvfortran ./configure \
  26 | >>>     --prefix=/opt/hdf5 --enable-shared \
  27 | >>>     CFLAGS=-O3 FFLAGS=-O3 CXXFLAGS=-O3 FCFLAGS="-O3 -fPIC" && \
  28 | >>>     cat config.log && \
  29 | >>>     make -j && \
  30 | >>>     make install && \
  31 | >>>     rm ../hdf5-1.14.2.tar.gz
107.2 checking for config ./config/site-specific/host-buildkitsandbox... no
107.2 checking for clang sanitizer checks... checking build mode... production
107.2 checking for gcc... nvc
108.8 checking whether the C compiler works... yes
110.3 checking for C compiler default output file name... a.out
110.3 checking for suffix of executables... 
111.7 checking whether we are cross compiling... configure: error: in `/opt/hdf5-1.14.2':
113.4 configure: error: cannot run C compiled programs.
113.4 If you meant to cross compile, use `--host'.
113.4 See `config.log' for more details
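
The "cannot run C compiled programs" message comes from configure compiling a small program and trying to execute it. A minimal sketch that repeats that check outside configure, so the compiler can be tested on its own inside the failing container; `probe_compiler` is a hypothetical helper, and the return codes are this sketch's own convention:

```shell
# Compile a trivial C program with the given compiler and try to run it.
# $1: compiler command; remaining arguments: extra compiler flags.
# Returns 0 if the program compiles and runs, 1 if compilation fails,
# 2 if the compiled binary does not run cleanly, 3 if no temp dir is available.
probe_compiler() {
    cc="$1"; shift
    dir=$(mktemp -d) || return 3
    printf 'int main(void) { return 0; }\n' > "$dir/probe.c"
    if ! "$cc" "$@" -o "$dir/probe" "$dir/probe.c" > "$dir/compile.log" 2>&1; then
        # Show the compiler's own diagnostics before cleaning up.
        cat "$dir/compile.log" >&2
        rm -rf "$dir"
        return 1
    fi
    status=0
    "$dir/probe" || status=$?
    rm -rf "$dir"
    if [ "$status" -ne 0 ]; then
        echo "probe exited with status $status" >&2
        return 2
    fi
}
```

For example, `probe_compiler nvc -O3` and `probe_compiler gcc -O3` in the linux/amd64 build could help tell whether only nvc-built binaries fail to run there. That comparison is a suggestion, not something tested in this thread.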

hurricane642 avatar Oct 16 '23 18:10 hurricane642

Can you try again specifying --host= and let us know if that works? There should be cross-compile defaults for the Autotools.

https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Hosts-and-Cross_002dCompilation.html
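
Since --host names the system the built binaries will run on, a single hard-coded value cannot be right for both platforms of a multi-platform build. A minimal sketch for choosing it per platform, assuming the RUN step executes inside a container of the target platform (so `uname -m` reports that platform's architecture) and assuming Debian/Ubuntu-style triplet spellings; `host_triplet` is a hypothetical helper, not part of HDF5 or Autoconf:

```shell
# Map a machine name (as printed by `uname -m`, or a Docker-style
# architecture name) to a GNU host triplet.
# The triplet spellings are assumptions for Debian/Ubuntu images.
host_triplet() {
    case "$1" in
        x86_64|amd64)  echo "x86_64-linux-gnu" ;;
        aarch64|arm64) echo "aarch64-linux-gnu" ;;
        *)             echo "unknown"; return 1 ;;
    esac
}

# Example use inside the RUN step (not executed here):
#   ./configure --host="$(host_triplet "$(uname -m)")" --prefix=/opt/hdf5 ...
```

Whether configure then treats the build as a cross compile still depends on whether its test programs can run, so this only removes the mismatch between --host and the platform being built.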

derobins avatar Oct 17 '23 16:10 derobins

Moving this to 1.14.4. Making the Fortran wrappers work with cross-compiling is going to take some work. There are a lot of cross-compile improvements we should make between now and spring 2024, especially in CMake.

derobins avatar Oct 17 '23 19:10 derobins

Hey, I've tried defining it this way: --host=aarch64-unknown-linux-gnu (maybe the host is defined wrong, I'm not sure), but as a result I get this:

 => [linux/amd64 5/5] RUN wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.14/hdf5-1.14.2/src/hdf5-1.14.2.tar.gz &&     tar -xzf hdf5-1.14.2.tar  872.1s 
 => => #   CC       H5VLnative_blob.lo                                                                                                                             
 => => #   CC       H5Z.lo                                                                                                                                         
 => => #   CC       H5Zszip.lo                                                                                                                                     
 => => #   CC       H5Ztrans.lo                                                                                                                                    
 => => #   CC       H5Zdeflate.lo                                                                                                                                  
 => => #   CC       H5VLnative_object.lo                                                                                                                           
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF      

I've rerun it several times and the result is the same. Without this option the error is different: it is the same as it was initially.
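
The run above ends with a lost connection to the builder, not a compiler error. One possible cause, an assumption and not something confirmed in this thread, is the emulated build exhausting the builder's memory: `make -j` with no number, as used in an earlier attempt, places no limit on parallel jobs. A minimal sketch for capping the job count; `capped_jobs` is a hypothetical helper and the default cap of 4 is arbitrary:

```shell
# Print a make job count: the number of available CPUs, capped at a maximum.
# $1: maximum number of jobs (default: 4)
capped_jobs() {
    max="${1:-4}"
    # Fall back to a single job if nproc is not available.
    n=$(nproc 2>/dev/null || echo 1)
    if [ "$n" -gt "$max" ]; then
        n="$max"
    fi
    echo "$n"
}

# Example use in the RUN step (not executed here):
#   make -j "$(capped_jobs 4)" && make install
```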

hurricane642 avatar Oct 18 '23 22:10 hurricane642

Hi, @hurricane642 , I think it's a Docker + macOS arm64 issue, not an HDF5 one.

I tested the latest HDF5 release with nvhpc 24.7. Your configure option worked fine:

https://github.com/hyoklee/actions/actions/runs/10529181148/workflow

May I close the ticket?

hyoklee avatar Aug 23 '24 16:08 hyoklee

Hm, ok, I'll try to dig into it a little more. Yes, we can close for now, thank you very much! If I run into any related problem, I'll open a new issue!

hurricane642 avatar Aug 23 '24 17:08 hurricane642