qiskit-aer
Problem with Qiskit Aer parallelization using GPUs
Versions:
- qiskit-aer: 0.11.0
- qiskit-terra: 0.21.0
- mpirun (Open MPI): 4.0.3
- python: 3.8.10
Description:
Hi, I'm trying to replicate the code example in the Qiskit Aer documentation (distributing the Quantum Volume algorithm using MPI and GPUs) as seen here: Running-with-multiple-gpus-andor-multiple-nodes
Code:
This is the code I'm running:
import qiskit
from qiskit import IBMQ
from qiskit.providers.aer import AerSimulator
from qiskit import transpile
from qiskit import execute, QuantumCircuit
from qiskit.circuit.library import QuantumVolume

qubit=24
sim = AerSimulator(method='statevector', device='GPU')
circ = transpile(QuantumVolume(qubit, 10, seed = 0))
circ.measure_all()
result = execute(circ, sim, shots=100, blocking_enable=True, blocking_qubits=23).result()
print(result)
Error
This is the error I get:
Read -1, expected 67108864, errno = 14 *** Process received signal *** Signal: Segmentation fault (11) Signal code: Invalid permissions (2) Failing at address: 0x7f7988000000 Read -1, expected 67108864, errno = 14 *** Process received signal *** Signal: Segmentation fault (11) Signal code: Invalid permissions (2) Failing at address: 0x7fadb4000000 [ 0] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7f79f420d090] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x43090)[0x7fae1d7c2090] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x18b8f5)[0x7f79f43558f5] [ 2] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_vader.so(+0x31c4)[0x7f79e26531c4] [ 3] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_ob1.so(mca_pml_ob1_send_request_schedule_once+0x1c6)[0x7f79e2635926] [ 4] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_ack+0x1a9)[0x7f79e262e429] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0x18b8f5)[0x7fae1d90a8f5] [ 2] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_vader.so(+0x31c4)[0x7fae0c0481c4] [ 3] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_ob1.so(mca_pml_ob1_send_request_schedule_once+0x1c6)[0x7fae0c02a926] [ 4] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_pml_ob1.so(mca_pml_ob1_recv_frag_callback_ack+0x1a9)[0x7fae0c023429] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x95)[0x7f79e2654ed5] [ 6] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_vader.so(+0x53a3)[0x7f79e26553a3] [ 5] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_vader.so(mca_btl_vader_poll_handle_frag+0x95)[0x7fae0c049ed5] [ 6] /usr/lib/x86_64-linux-gnu/openmpi/lib/openmpi3/mca_btl_vader.so(+0x53a3)[0x7fae0c04a3a3] [ 7] /lib/x86_64-linux-gnu/libopen-pal.so.40(opal_progress+0x34)[0x7fae0ef98854] [ 8] /lib/x86_64-linux-gnu/libopen-pal.so.40(ompi_sync_wait_mt+0xb5)[0x7fae0ef9f315] [ 9] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_request_default_wait+0x228)[0x7fae0f42f9f8] [10] [ 7] 
/lib/x86_64-linux-gnu/libopen-pal.so.40(opal_progress+0x34)[0x7f79e59e3854] [ 8] /lib/x86_64-linux-gnu/libmpi.so.40(PMPI_Wait+0x58)[0x7fae0f472a88] [11] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x271e89)[0x7fae11905e89] [12] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x27019d)[0x7fae1190419d] [13] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0xe4bb9)[0x7fae11778bb9] [14] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x4343dc)[0x7fae11ac83dc] [15] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x4362c4)[0x7fae11aca2c4] /lib/x86_64-linux-gnu/libopen-pal.so.40(ompi_sync_wait_mt+0xb5)[0x7f79e59ea315] [ 9] [16] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x436ce9)[0x7fae11acace9] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_request_default_wait+0x228)[0x7f79e5e7a9f8] [10] /lib/x86_64-linux-gnu/libmpi.so.40(PMPI_Wait+0x58)[0x7f79e5ebda88] [11] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x271e89)[0x7f79e8350e89] [12] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x27019d)[0x7f79e834f19d] [13] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0xe4bb9)[0x7f79e81c3bb9] [14] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x4343dc)[0x7f79e85133dc] 
[15] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x4362c4)[0x7f79e85152c4] [16] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x436ce9)[0x7f79e8515ce9] [17] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0xe5cc1)[0x7f79e81c4cc1] [18] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x43dbce)[0x7f79e851cbce] [17] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0xe5cc1)[0x7fae11779cc1] [18] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x43dbce)[0x7fae11ad1bce] [19] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x43f420)[0x7fae11ad3420] [20] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x43f6f2)[0x7fae11ad36f2] [21] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x1a280d)[0x7fae1183680d] [22] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x1a2f44)[0x7fae11836f44] [23] python(PyCFunction_Call+0x59)[0x5f3989] [24] [19] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x43f420)[0x7f79e851e420] [20] python(_PyObject_MakeTpCall+0x29e)[0x5f3e1e] [25] python[0x50b158] [26] python(PyObject_Call+0x1f7)[0x5f3547] [27] python[0x59d13c] [28] 
python(_PyObject_MakeTpCall+0x29e)[0x5f3e1e] [29] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x43f6f2)[0x7f79e851e6f2] [21] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x1a280d)[0x7f79e828180d] [22] /home/ubuntu/.local/lib/python3.8/site-packages/qiskit/providers/aer/backends/controller_wrappers.cpython-38-x86_64-linux-gnu.so(+0x1a2f44)[0x7f79e8281f44] [23] python(PyCFunction_Call+0x59)[0x5f3989] [24] python(_PyObject_MakeTpCall+0x29e)[0x5f3e1e] [25] python[0x50b158] python(_PyEval_EvalFrameDefault+0x58e6)[0x570266] *** End of error message *** [26] python(PyObject_Call+0x1f7)[0x5f3547] [27] python[0x59d13c] [28] python(_PyObject_MakeTpCall+0x29e)[0x5f3e1e] [29] python(_PyEval_EvalFrameDefault+0x58e6)[0x570266] *** End of error message ***
Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.
mpirun noticed that process rank 1 with PID 0 on node ip-XX exited on signal 11 (Segmentation fault).
Things I've tried:
I tried creating a simple circuit to test the parallelization. The segmentation fault seems to occur whenever a gate is applied to the last qubit. For example, on a 24-qubit circuit, applying a gate such as a Hadamard to the last qubit (qc.h(23)) triggers the segmentation fault. The other qubits seem unaffected: I can apply arbitrary gates to them and the simulation works.
Thanks a lot!!
@jakelishman @doichanj I've tried different configurations with MPI and CUDA, and it seems to me the problem is in Qiskit Aer. Any algorithm fails to distribute with GPUs if the circuit's last qubit performs any operation (any gate on the last qubit seems to make the simulation fail). Can you fix this issue or maybe point out something I'm doing wrong? Thanks a lot!!
I could not reproduce this issue. Please provide more info (number of processes, number of GPUs, GPU and CPU memory size, etc.)
Could you test with a smaller blocking_qubits value? I think blocking_qubits=23 is too large for a 24-qubit circuit, i.e. if you use 4 processes, blocking_qubits should be less than or equal to 22. (If you set 23 for 4 processes, Qiskit Aer will abort with a message like ERROR: [Experiment 0] cache blocking : blocking_qubits is to large to parallelize with 4 processes)
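The limit mentioned above can be sketched as a small calculation. This is only an assumption generalized from the single example in the comment (24 qubits, 4 processes, limit of 22), namely that each doubling of the process count lowers the largest usable blocking_qubits by one. The helper max_blocking_qubits is hypothetical and is not part of Qiskit Aer.

```python
def max_blocking_qubits(num_qubits: int, num_processes: int) -> int:
    """Largest blocking_qubits under the ASSUMED rule n - log2(processes).

    Hypothetical helper, inferred from the one example in the thread;
    it does not call or mirror any Qiskit Aer API.
    """
    # Only handle process counts that are exact powers of two.
    if num_processes < 1 or num_processes & (num_processes - 1):
        raise ValueError("num_processes must be a power of two")
    # For an exact power of two, bit_length() - 1 equals log2.
    return num_qubits - (num_processes.bit_length() - 1)


# 24 qubits on 4 processes: reproduces the limit of 22 quoted in the comment.
print(max_blocking_qubits(24, 4))
```

Under this assumed rule, blocking_qubits=23 on a 24-qubit circuit would only be acceptable with at most 2 processes, which is why the number of processes matters for diagnosing the report.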
Hi! I'm using g5.xlarge instances on AWS:
- 4 vCPUs (AMD EPYC 7R32)
- NVIDIA A10G Tensor Core (24 GB)
- 16 GB RAM
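For scale, the memory needed by the full statevector can be estimated with a short calculation. This sketch assumes double-precision complex amplitudes (16 bytes each), and it ignores any extra buffers the simulator or MPI may allocate; statevector_bytes is a hypothetical helper name.

```python
def statevector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to hold 2**num_qubits amplitudes.

    ASSUMPTION: 16 bytes per amplitude (double-precision complex).
    Simulator work buffers and MPI exchange buffers are not counted.
    """
    return (2 ** num_qubits) * bytes_per_amplitude


total = statevector_bytes(24)
# Report the footprint in MiB for comparison with the 24 GB GPU and 16 GB RAM.
print(total // 2**20, "MiB")
```

Under these assumptions, a 24-qubit statevector is far smaller than either the GPU memory or the host RAM listed above, so the raw statevector size alone would not exhaust memory on this instance type.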
Just in case there is some error in the way I'm building Qiskit Aer:
I've installed CUDA 11.7 following these instructions and built Qiskit Aer using
python ./setup.py bdist_wheel -- -DAER_MPI=True -DAER_THRUST_BACKEND=CUDA
I've tried lowering blocking_qubits, but it doesn't seem to make any difference; I get the same segmentation fault as in the other comment.
Thanks a lot!!
In case it helps, these are all the steps I follow on a new AWS machine to install Qiskit Aer with GPU support:
NVIDIA toolkit installation
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu2004-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2004-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2004-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
Qiskit Aer compilation
sudo apt -y install build-essential libopenblas-dev git openmpi-bin python3-pip python-is-python3
git clone https://github.com/Qiskit/qiskit-aer
cd qiskit-aer
export PATH="/home/ubuntu/.local/bin:$PATH"
pip install -r requirements-dev.txt
source ~/.bashrc
export CUDACXX=/usr/local/cuda-11.7/bin/nvcc
python ./setup.py bdist_wheel -- -DAER_MPI=True -DAER_THRUST_BACKEND=CUDA
pip install -U dist/qiskit_aer*.whl
Here is additional information on the GPU:

Thanks a lot!! Tell me if you need more information :)
I think this issue is the same as issue #1583. I could not reproduce this one.
Let me close this issue because there has been no response in more than two weeks. Please create a new issue if this still needs to be fixed in your environment.