Best practices for using CvxpyLayer with GPU-based models
Hello cvxpylayers developers, first, thank you very much for developing and maintaining this powerful library! I have a key question about GPU usage: I have read your original NeurIPS 2019 paper ("Differentiable Convex Optimization Layers"), which mentions that the implementation at the time ran on the CPU. Since the software has likely evolved significantly since then, I'd like to understand the current best practices. My entire model and training data will be on the GPU. If I insert a CvxpyLayer into this model:

1. Can the current version of cvxpylayers seamlessly handle input tensors located on the GPU and offload the core solver computation to a GPU-accelerated solver? Or do I still need to manually move the data from the GPU to the CPU (.to('cpu')) before calling the CvxpyLayer, and then manually move the result back to the GPU?
2. Regarding this GPU-CPU data interaction, are there any known performance bottlenecks or recommended practices to be aware of?
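In case it's useful context, here is the manual round-trip pattern I am currently using. This is only a minimal sketch of a README-style layer, and it assumes the layer expects CPU tensors, which is exactly the point I'd like to confirm:

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# README-style layer: non-negative least squares with parameters A, b.
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), [x >= 0])
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])

device = torch.device("cuda")
A_t = torch.randn(m, n, device=device, requires_grad=True)
b_t = torch.randn(m, device=device, requires_grad=True)

# Manual round-trip: move inputs to the CPU for the solve, then move the
# solution back to the GPU. The .cpu()/.to() transfers are differentiable,
# so gradients still flow back to A_t and b_t on the GPU.
(x_star,) = layer(A_t.cpu(), b_t.cpu())
x_gpu = x_star.to(device)
x_gpu.sum().backward()  # A_t.grad and b_t.grad are populated on the GPU
```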
Hey Autumn! We're in the middle of a huge refactor to address exactly this issue. I'll hopefully have a big announcement soon!
Glad to hear that! And may I ask about the expected timeline?
The goal is to announce it on December 7th, at 8:45am PDT in San Diego, CA at cvxgrp.org/scaleopt
Glad to hear this news! Looking forward to the announcement on December 7th.
I'm very glad this day has arrived. May I ask whether there is detailed official documentation for the new cvxpylayers, and whether the citation has changed compared to before? Looking forward to your reply!
We only managed to release the initial beta yesterday; docs are incoming!
Glad to hear
Thanks a lot for the great work! I installed Julia, CuClarabel, diffqcp[gpu], and cupy-cuda12x, and installed cvxpylayers from source. When I ran the example with solver=cp.CUCLARABEL enabled, I got quite a few warnings followed by an exception. The package versions, a sketch of the script, and the full output are below. Really appreciate any guidance you can provide!
My environment is Debian 12 + Python 3.12 + CUDA 12.
uv pip show cvxpylayers lineax diffqcp
Name: cvxpylayers
Version: 1.0.0a0
Location: /.venv/lib/python3.12/site-packages
Requires: cvxpy, diffcp, numpy, scipy
Required-by:
Name: diffqcp
Version: 0.4.4
Location: /.venv/lib/python3.12/site-packages
Requires: equinox, jax, jaxtyping, lineax, numpy, scipy
Required-by:
Name: lineax
Version: 0.0.8
Location: /.venv/lib/python3.12/site-packages
Requires: equinox, jax, jaxtyping, typing-extensions
Required-by: diffqcp
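For reference, this is roughly the script I ran. It is a sketch reconstructed from the demo; the only deviation from the CPU example should be the solver=cp.CUCLARABEL argument, and I am assuming the beta constructor accepts it this way:

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# Same small problem as the README example.
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.pnorm(A @ x - b, p=1)), [x >= 0])

# Request the GPU-accelerated CuClarabel backend.
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x],
                   solver=cp.CUCLARABEL)

A_t = torch.randn(m, n, device="cuda", requires_grad=True)
b_t = torch.randn(m, device="cuda", requires_grad=True)

(solution,) = layer(A_t, b_t)  # the forward solve completes
solution.sum().backward()      # raises the ValueError shown below
```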
┌ Warning: CUDA runtime library `cynvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/.venv/lib/python3.12/site-packages/cuda/bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
Traceback (most recent call last):
File "/my.py", line 24, in <module>
solution.sum().backward()
File "/.venv/lib/python3.12/site-packages/torch/_tensor.py", line 625, in backward
torch.autograd.backward(
File "/.venv/lib/python3.12/site-packages/torch/autograd/__init__.py", line 354, in backward
_engine_run_backward(
File "/.venv/lib/python3.12/site-packages/torch/autograd/graph.py", line 841, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/torch/autograd/function.py", line 315, in apply
return user_fn(self, *args)
^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/torch/autograd/function.py", line 608, in wrapper
outputs = fn(ctx, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/torch/cvxpylayer.py", line 229, in backward
) = ctx.data.torch_derivative(primal, dual, ctx.backwards)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/interfaces/cuclarabel_if.py", line 474, in torch_derivative
dP_batch, dq_batch, dA_batch = _compute_gradients(
^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/interfaces/cuclarabel_if.py", line 381, in _compute_gradients
dP, dA, dq, db = _compute_vjp(vjps[i], dprimal[i], ddual[i], dslack)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_jit.py", line 209, in __call__
return _call(self, False, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_jit.py", line 263, in _call
marker, _, _ = out = jit_wrapper._cached(
^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/interfaces/cuclarabel_if.py", line 377, in _compute_vjp
return qcp_module_instance.vjp(dx, dy, ds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_prebuilt.py", line 33, in __call__
return self.__func__(self.__self__, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/diffqcp/qcp.py", line 658, in vjp
return self._vjp_common(dx=dx, dy=dy, ds=ds, produce_output=partial_d_data_Q_adjoint_gpu, solve_method=solve_method)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_prebuilt.py", line 33, in __call__
return self.__func__(self.__self__, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/diffqcp/qcp.py", line 171, in _vjp_common
pi_z, F, dproj_kstar_v = self._form_atoms()
^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_prebuilt.py", line 33, in __call__
return self.__func__(self.__self__, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/diffqcp/qcp.py", line 73, in _form_atoms
F = (DzQ_pi_z @ dpi_z) - dpi_z + IdentityLinearOperator(eval_shape(lambda: pi_z))
~~~~~~~~~^~~~~~~
File "/.venv/lib/python3.12/site-packages/lineax/_operator.py", line 218, in __matmul__
return ComposedLinearOperator(self, other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_module.py", line 492, in __call__
check_init(self)
File "/.venv/lib/python3.12/site-packages/lineax/_operator.py", line 1128, in __check_init__
raise ValueError("Incompatible linear operator structures")
ValueError: Incompatible linear operator structures
--------------------
For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these.
@healeyq3 might have better insights
I'll attempt to reproduce today
Please forgive one more question. Previously, cvxpylayers has had a practical limit on the number of parameters: when the parameter count gets too large, constructing the layer becomes slow and problems can arise during training. Does the current GPU version improve on this? Will performing the large-scale matrix operations on the GPU allow a higher tolerance for the number of parameters?
Yes! We are about to merge a new backend on the CVXPY end that will make handling huge numbers of parameters much faster.
Hey @healeyq3 , I know you must be very busy merging new features. If you don't have time to dig into this, would you mind sharing the package installation steps needed to run the "PyTorch on GPU with CuClarabel" demo?
I installed cvxpylayers via git clone XX cvxpylayers and pip install ./cvxpylayers (version 1.0.0b0), and still ran into the same exception.
Hello @alexa7b5 (and cc @autumnwt ),
Sorry for the delay in debugging the above error and creating a proper installation setup guide. I put a few hours toward this last weekend, but was running into some issues with Julia myself.
Starting this Saturday through January 5th I won't have work, so will spend some calories on CVXPYlayers (GPU) development.
Thank you for your interest in the project!
Hello developers, I have encountered a similar problem. I am trying to run the official example ("PyTorch on GPU with CuClarabel") with the CUCLARABEL interface in a PyTorch 2.x environment with CUDA 12. The forward pass works, but the backward pass fails with the error below:
https://github.com/cvxpy/cvxpylayers/issues/173
/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cvxpylayers/torch/cvxpylayer.py:304: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at /pytorch/aten/src/ATen/SparseCsrTensorImpl.cpp:53.)
torch_csr = torch.sparse_csr_tensor(
┌ Warning: CUDA runtime library `libcublasLt.so.12` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cutensor/lib/../../nvidia/cublas/lib/libcublasLt.so.12`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
┌ Warning: CUDA runtime library `nvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cuda/bindings/nvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
┌ Warning: CUDA runtime library `cynvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cuda/bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
┌ Warning: CUDA runtime library `cynvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cuda/bindings/_bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
Traceback (most recent call last):
File "/opt/data/private/E2e-OPF-GPU/test_environ/test_torch_cuclarabel_0.py", line 38, in <module>
solution.sum().backward()
File "/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/torch/_tensor.py", line 626, in backward
torch.autograd.backward(
File "/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/torch/autograd/__init__.py", line 347, in backward
_engine_run_backward(
File "/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/torch/autograd/graph.py", line 823, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: function _CvxpyLayerBackward returned a gradient different than None at position 1, but the corresponding forward input was not a Variable
Environment:
- Python 3.12, PyTorch 2.6.1
- CUDA 12
- JAX 0.8.2
- Julia 1.10.10
- CVXPYLayers installed from the latest release
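For context on the error itself: as far as I understand, PyTorch raises this message whenever a custom torch.autograd.Function's backward returns a non-None gradient for a forward input that autograd did not track as differentiable. Here is a standalone illustration, unrelated to cvxpylayers (Doubler is a made-up toy Function):

```python
import torch

class Doubler(torch.autograd.Function):
    @staticmethod
    def forward(ctx, flag, x):
        # `flag` (forward input at position 1) is a plain bool,
        # not a differentiable tensor.
        return x * 2

    @staticmethod
    def backward(ctx, grad_out):
        # Returning a gradient for `flag` instead of None triggers the same
        # "returned a gradient different than None at position 1, but the
        # corresponding forward input was not a Variable" RuntimeError.
        return torch.zeros(1), grad_out * 2

x = torch.ones(3, requires_grad=True)
Doubler.apply(True, x).sum().backward()  # raises the RuntimeError
```

So my guess is that the layer's backward is returning a gradient for one of its non-tensor forward arguments in my setup, but I may be misreading it.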
I successfully computed some gradients via the CuClarabel backend and started adding a setup guide to the docs, but unfortunately when I tried the setup steps on a different machine I once again ran into the issues I saw ~ 1.5 weeks ago on a third machine. I opened this issue in CuClarabel with steps to reproduce my success and failure. I'm inclined to believe our various failures all stem from setup issues.