pennylane-lightning-gpu
docker build doesn't work out of the box
Issue description
I did a vanilla clone of the repo and ran docker build . -f ./docker/Dockerfile -t "lightning-gpu-wheels", but it failed with the following error:
Source code and tracebacks
#0 15.72 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
#0 15.96 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
#0 15.96 -- Looking for pthread_create in pthreads
#0 16.17 -- Looking for pthread_create in pthreads - not found
#0 16.17 -- Looking for pthread_create in pthread
#0 16.41 -- Looking for pthread_create in pthread - found
#0 16.41 -- Found Threads: TRUE
#0 16.42 -- Found CUDA: /usr/local/cuda (found version "12.2")
#0 16.43 -- Found CUDAToolkit: /usr/local/cuda/include (found version "12.2.128")
#0 17.05 -- Could NOT find Python (missing: Python_INCLUDE_DIRS Python_LIBRARIES Development Development.Module Development.Embed) (found version "2.7.5")
#0 17.05 CMake Error at CMakeLists.txt:176 (message):
#0 17.05
#0 17.05
#0 17.05 Unable to find cuQuantum SDK installation. Please ensure it is correctly
#0 17.05 installed and available on path.
#0 17.05
#0 17.05
#0 17.05 -- Configuring incomplete, errors occurred!
#0 17.07 /pennylane-lightning-gpu/pyenv3.8/lib/python3.8/site-packages/setuptools/dist.py:463: UserWarning: Normalizing '0.32.0-dev' to '0.32.0.dev0'
#0 17.07 warnings.warn(tmpl.format(**locals()))
#0 17.07 Traceback (most recent call last):
#0 17.07 File "setup.py", line 143, in <module>
#0 17.07 setup(classifiers=classifiers, **(info))
#0 17.07 File "/pennylane-lightning-gpu/pyenv3.8/lib/python3.8/site-packages/setuptools/__init__.py", line 153, in setup
#0 17.07 return distutils.core.setup(**attrs)
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/distutils/core.py", line 148, in setup
#0 17.07 dist.run_commands()
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/distutils/dist.py", line 966, in run_commands
#0 17.07 self.run_command(cmd)
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/distutils/dist.py", line 985, in run_command
#0 17.07 cmd_obj.run()
#0 17.07 File "/pennylane-lightning-gpu/pyenv3.8/lib/python3.8/site-packages/setuptools/command/build_ext.py", line 79, in run
#0 17.07 _build_ext.run(self)
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/distutils/command/build_ext.py", line 340, in run
#0 17.07 self.build_extensions()
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/distutils/command/build_ext.py", line 449, in build_extensions
#0 17.07 self._build_extensions_serial()
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/distutils/command/build_ext.py", line 474, in _build_extensions_serial
#0 17.07 self.build_extension(ext)
#0 17.07 File "setup.py", line 85, in build_extension
#0 17.07 subprocess.check_call(
#0 17.07 File "/opt/_internal/cpython-3.8.17/lib/python3.8/subprocess.py", line 364, in check_call
#0 17.07 raise CalledProcessError(retcode, cmd)
#0 17.07 subprocess.CalledProcessError: Command '['cmake', '/pennylane-lightning-gpu', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/pennylane-lightning-gpu/build/lib.linux-x86_64-3.8/pennylane_lightning_gpu', '-DPYTHON_EXECUTABLE=/pennylane-lightning-gpu/pyenv3.8/bin/python3', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-GNinja', '-DCMAKE_MAKE_PROGRAM=/pennylane-lightning-gpu/pyenv3.8/bin/ninja', '-DENABLE_OPENMP=OFF', '-DENABLE_CLANG_TIDY=0']' returned non-zero exit status 1.
Thanks for the note, @rht ! We'll check it out and get back to you soon.
I fixed the "cuQuantum not found" error by removing --no-deps and instead doing pip install cuquantum in https://github.com/PennyLaneAI/pennylane-lightning-gpu/blob/1e129b2e7dbc7d885b16da61b6c5b1a02e45970d/docker/Dockerfile#L18.
However, I subsequently encountered many compile errors, for example:
#0 101.4 /pennylane-lightning-gpu/pennylane_lightning_gpu/src/algorithms/AdjointDiffGPU.hpp:370:76: error: could not convert ‘{<expression error>, <expression error>, <expression error>, <expression error>, <expression error>}’ from ‘<brace-enclosed initializer list>’ to ‘Pennylane::Pennylane::Algorithms::OpsData<double>’
#0 101.4 370 | return {ops_name, ops_params, ops_wires, ops_inverses, ops_matrices};
#0 101.4 | ^
#0 101.4 | |
#0 101.4 | <brace-enclosed initializer list>
#0 101.4 /pennylane-lightning-gpu/pennylane_lightning_gpu/src/algorithms/AdjointDiffGPU.hpp: In instantiation of ‘void Pennylane::Pennylane::Algorithms::AdjointJacobianGPU<T>::batchAdjointJacobian(const CFP_t*, int, int) [with T = double; Pennylane::Pennylane::Algorithms::AdjointJacobianGPU<T>::CFP_t = double2]’:
#0 101.4 /pennylane-lightning-gpu/pennylane_lightning_gpu/src/algorithms/AdjointDiffGPU.cpp:5:39: required from here
#0 101.4 /pennylane-lightning-gpu/pennylane_lightning_gpu/src/algorithms/AdjointDiffGPU.hpp:441:66: error: ‘jac_local’ was not declared in this scope; did you mean ‘dt_local’?
I additionally had to specify Python_SITELIB to point to the virtualenv site-packages path.
Hi @rht
Thanks for posting. We haven't been using the Docker build process for some time, as we now run our own custom AMIs through GitHub Actions (https://github.com/PennyLaneAI/pennylane-lightning-gpu/blob/main/.github/workflows/build_wheel_manylinux2014.yml). We will need some time to investigate what changes are needed to get this process working again, but I suspect the issue is a combination of compiler versions, changing dependencies, and updated C++ language features.
Yeah, I managed to make it work by consulting the GH Actions yml file. One difference is that manylinux2014 uses Red Hat Toolset 10, which works with the Docker image, but this is different from the GH Actions file, which uses g++-11 and gcc-11.
My changes in CMakeLists.txt:
- used find_package (Python3 COMPONENTS Interpreter Development.Module) (i.e. Python3 instead of Python, Development.Module instead of Development)
- added set(Python_SITELIB /pennylane-lightning-gpu/pyenv3.8/lib/python3.8/site-packages) (hardcoded to 3.8 because I wanted a quick result)
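For anyone who would rather not hardcode the 3.8 path, the site-packages directory can be queried from the virtualenv's interpreter instead. The following is only a minimal sketch, not something tested against this Dockerfile: the helper names site_packages_dir and has_package_dir are hypothetical, and the check assumes that the pip-installed cuQuantum files land in a top-level cuquantum directory under site-packages.

```python
import sysconfig
from pathlib import Path


def site_packages_dir() -> Path:
    """Return the pure-Python site-packages directory of the running interpreter."""
    return Path(sysconfig.get_paths()["purelib"])


def has_package_dir(name: str, root: Path) -> bool:
    """Check whether a top-level directory called `name` exists under `root`."""
    return (root / name).is_dir()


if __name__ == "__main__":
    root = site_packages_dir()
    # Candidate value for Python_SITELIB (run this with the virtualenv's python).
    print(f"site-packages: {root}")
    # Assumption: the pip wheels install into <site-packages>/cuquantum.
    print(f"cuquantum directory present: {has_package_dir('cuquantum', root)}")
```

Run with the interpreter inside the virtualenv (for example /pennylane-lightning-gpu/pyenv3.8/bin/python3), the printed path should correspond to the value set by hand above.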
In the Dockerfile, I replaced yum -y install cuda with yum -y install cuda-11-5.
With those changes, everything should work.
Hi @rht, I'm glad you managed to make it work! Thank you for sharing your solution here. Please let us know if you encounter any further issues.