Best practices for using CvxpyLayer with GPU-based models
Hello cvxpylayers developers, first, thank you very much for developing and maintaining this powerful library! I have a key question about GPU usage: I have read your original NeurIPS 2019 paper ("Differentiable Convex Optimization Layers"), which mentions that the implementation at the time ran on the CPU. Since the software has likely evolved significantly since then, I'd like to understand the current best practices. My entire model and training data will be on the GPU. If I insert a CvxpyLayer into this model:

1. Can the current version of cvxpylayers seamlessly handle input tensors located on the GPU and offload the core solver computation to a GPU-accelerated solver? Or do I still need to manually move the data from the GPU to the CPU (.to('cpu')) before calling the CvxpyLayer, and then manually move the result back to the GPU?
2. Regarding this GPU-CPU data interaction, are there any known performance bottlenecks or recommended practices to be aware of?
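In case it's useful context, here is the manual round-trip pattern I am currently using. This is only a minimal sketch of a README-style layer, and it assumes the layer expects CPU tensors, which is exactly the point I'd like to confirm:

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# README-style layer: non-negative least squares with parameters A, b.
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), [x >= 0])
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])

device = torch.device("cuda")
A_t = torch.randn(m, n, device=device, requires_grad=True)
b_t = torch.randn(m, device=device, requires_grad=True)

# Manual round-trip: move inputs to the CPU for the solve, then move the
# solution back to the GPU. The .cpu()/.to() transfers are differentiable,
# so gradients still flow back to A_t and b_t on the GPU.
(x_star,) = layer(A_t.cpu(), b_t.cpu())
x_gpu = x_star.to(device)
x_gpu.sum().backward()  # A_t.grad and b_t.grad are populated on the GPU
```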
Hey Autumn! We're in the middle of a huge refactor to address exactly this issue. I'll hopefully have a big announcement soon!
Glad to hear that! And may I ask about the expected timeline?
The goal is to announce it on December 7th, at 8:45am PDT in San Diego, CA at cvxgrp.org/scaleopt
Glad to hear this news! Looking forward to the announcement on December 7th.
I'm very glad this day has arrived. May I ask whether there is detailed official documentation for the new cvxpylayers, and whether the citation has changed compared to before? Looking forward to your reply!
We only managed to release the initial beta yesterday; docs are incoming!
Glad to hear
Thanks a lot for the great work! I installed Julia, CuClarabel, diffqcp[gpu], and cupy-cuda12x, and installed cvxpylayers from source. When I ran the example with solver=cp.CUCLARABEL enabled, I got quite a few warnings followed by an exception. The package versions, a sketch of the script, and the full output are below. Really appreciate any guidance you can provide!
My environment is Debian 12 + Python 3.12 + CUDA 12.
uv pip show cvxpylayers lineax diffqcp
Name: cvxpylayers
Version: 1.0.0a0
Location: /.venv/lib/python3.12/site-packages
Requires: cvxpy, diffcp, numpy, scipy
Required-by:
Name: diffqcp
Version: 0.4.4
Location: /.venv/lib/python3.12/site-packages
Requires: equinox, jax, jaxtyping, lineax, numpy, scipy
Required-by:
Name: lineax
Version: 0.0.8
Location: /.venv/lib/python3.12/site-packages
Requires: equinox, jax, jaxtyping, typing-extensions
Required-by: diffqcp
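For reference, this is roughly the script I ran. It is a sketch reconstructed from the demo; the only deviation from the CPU example should be the solver=cp.CUCLARABEL argument, and I am assuming the beta constructor accepts it this way:

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# Same small problem as the README example.
n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.pnorm(A @ x - b, p=1)), [x >= 0])

# Request the GPU-accelerated CuClarabel backend.
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x],
                   solver=cp.CUCLARABEL)

A_t = torch.randn(m, n, device="cuda", requires_grad=True)
b_t = torch.randn(m, device="cuda", requires_grad=True)

(solution,) = layer(A_t, b_t)  # the forward solve completes
solution.sum().backward()      # raises the ValueError shown below
```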
┌ Warning: CUDA runtime library `cynvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/.venv/lib/python3.12/site-packages/cuda/bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
Traceback (most recent call last):
File "/my.py", line 24, in <module>
solution.sum().backward()
File "/.venv/lib/python3.12/site-packages/torch/_tensor.py", line 625, in backward
torch.autograd.backward(
File "/.venv/lib/python3.12/site-packages/torch/autograd/__init__.py", line 354, in backward
_engine_run_backward(
File "/.venv/lib/python3.12/site-packages/torch/autograd/graph.py", line 841, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/torch/autograd/function.py", line 315, in apply
return user_fn(self, *args)
^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/torch/autograd/function.py", line 608, in wrapper
outputs = fn(ctx, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/torch/cvxpylayer.py", line 229, in backward
) = ctx.data.torch_derivative(primal, dual, ctx.backwards)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/interfaces/cuclarabel_if.py", line 474, in torch_derivative
dP_batch, dq_batch, dA_batch = _compute_gradients(
^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/interfaces/cuclarabel_if.py", line 381, in _compute_gradients
dP, dA, dq, db = _compute_vjp(vjps[i], dprimal[i], ddual[i], dslack)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_jit.py", line 209, in __call__
return _call(self, False, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_jit.py", line 263, in _call
marker, _, _ = out = jit_wrapper._cached(
^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/cvxpylayers/interfaces/cuclarabel_if.py", line 377, in _compute_vjp
return qcp_module_instance.vjp(dx, dy, ds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_prebuilt.py", line 33, in __call__
return self.__func__(self.__self__, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/diffqcp/qcp.py", line 658, in vjp
return self._vjp_common(dx=dx, dy=dy, ds=ds, produce_output=partial_d_data_Q_adjoint_gpu, solve_method=solve_method)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_prebuilt.py", line 33, in __call__
return self.__func__(self.__self__, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/diffqcp/qcp.py", line 171, in _vjp_common
pi_z, F, dproj_kstar_v = self._form_atoms()
^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_prebuilt.py", line 33, in __call__
return self.__func__(self.__self__, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/diffqcp/qcp.py", line 73, in _form_atoms
F = (DzQ_pi_z @ dpi_z) - dpi_z + IdentityLinearOperator(eval_shape(lambda: pi_z))
~~~~~~~~~^~~~~~~
File "/.venv/lib/python3.12/site-packages/lineax/_operator.py", line 218, in __matmul__
return ComposedLinearOperator(self, other)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/.venv/lib/python3.12/site-packages/equinox/_module/_module.py", line 492, in __call__
check_init(self)
File "/.venv/lib/python3.12/site-packages/lineax/_operator.py", line 1128, in __check_init__
raise ValueError("Incompatible linear operator structures")
ValueError: Incompatible linear operator structures
--------------------
For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these.
@healeyq3 might have better insights
I'll attempt to reproduce today
Please forgive one more question. Previously, cvxpylayers has had a practical limit on the number of parameters: when the parameter count gets too large, constructing the layer becomes slow and problems can arise during training. Does the current GPU version improve on this? Will performing the large-scale matrix operations on the GPU allow a higher tolerance for the number of parameters?
Yes! We are about to merge a new backend on the CVXPY end that will make handling huge numbers of parameters much faster.
Hey @healeyq3 , I know you must be very busy merging new features. If you don't have time to dig into this, would you mind sharing the package installation steps needed to run the "PyTorch on GPU with CuClarabel" demo?
I installed cvxpylayers via git clone XX cvxpylayers and pip install ./cvxpylayers (version 1.0.0b0), and still ran into the same exception.
Hello @alexa7b5 (and cc @autumnwt ),
Sorry for the delay in debugging the above error and creating a proper installation setup guide. I put a few hours toward this last weekend, but was running into some issues with Julia myself.
Starting this Saturday through January 5th I won't have work, so will spend some calories on CVXPYlayers (GPU) development.
Thank you for your interest in the project!
Hello developers, I have encountered a similar problem. I am trying to run the official example ("PyTorch on GPU with CuClarabel") with the CUCLARABEL interface in a PyTorch 2.x environment with CUDA 12. The forward pass works, but the backward pass fails with the error below:
https://github.com/cvxpy/cvxpylayers/issues/173
/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cvxpylayers/torch/cvxpylayer.py:304: UserWarning: Sparse CSR tensor support is in beta state. If you miss a functionality in the sparse tensor support, please submit a feature request to https://github.com/pytorch/pytorch/issues. (Triggered internally at /pytorch/aten/src/ATen/SparseCsrTensorImpl.cpp:53.)
torch_csr = torch.sparse_csr_tensor(
┌ Warning: CUDA runtime library `libcublasLt.so.12` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cutensor/lib/../../nvidia/cublas/lib/libcublasLt.so.12`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
┌ Warning: CUDA runtime library `nvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cuda/bindings/nvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
┌ Warning: CUDA runtime library `cynvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cuda/bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
┌ Warning: CUDA runtime library `cynvrtc.cpython-312-x86_64-linux-gnu.so` was loaded from a system path, `/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/cuda/bindings/_bindings/cynvrtc.cpython-312-x86_64-linux-gnu.so`.
│ This may cause errors.
│
│ If you're running under a profiler, this situation is expected. Otherwise,
│ ensure that your library path environment variable (e.g., `PATH` on Windows
│ or `LD_LIBRARY_PATH` on Linux) does not include CUDA library paths.
│
│ In any other case, please file an issue.
└ @ CUDA ~/.julia/packages/CUDA/x8d2s/src/initialization.jl:216
Traceback (most recent call last):
File "/opt/data/private/E2e-OPF-GPU/test_environ/test_torch_cuclarabel_0.py", line 38, in <module>
solution.sum().backward()
File "/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/torch/_tensor.py", line 626, in backward
torch.autograd.backward(
File "/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/torch/autograd/__init__.py", line 347, in backward
_engine_run_backward(
File "/root/miniconda3/envs/opf-gpu/lib/python3.12/site-packages/torch/autograd/graph.py", line 823, in _engine_run_backward
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: function _CvxpyLayerBackward returned a gradient different than None at position 1, but the corresponding forward input was not a Variable
Environment:
- Python 3.12, PyTorch 2.6.1
- CUDA 12
- JAX 0.8.2
- Julia 1.10.10
- CVXPYLayers installed from the latest release
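For context on the error itself: as far as I understand, PyTorch raises this message whenever a custom torch.autograd.Function's backward returns a non-None gradient for a forward input that autograd did not track as differentiable. Here is a standalone illustration, unrelated to cvxpylayers (Doubler is a made-up toy Function):

```python
import torch

class Doubler(torch.autograd.Function):
    @staticmethod
    def forward(ctx, flag, x):
        # `flag` (forward input at position 1) is a plain bool,
        # not a differentiable tensor.
        return x * 2

    @staticmethod
    def backward(ctx, grad_out):
        # Returning a gradient for `flag` instead of None triggers the same
        # "returned a gradient different than None at position 1, but the
        # corresponding forward input was not a Variable" RuntimeError.
        return torch.zeros(1), grad_out * 2

x = torch.ones(3, requires_grad=True)
Doubler.apply(True, x).sum().backward()  # raises the RuntimeError
```

So my guess is that the layer's backward is returning a gradient for one of its non-tensor forward arguments in my setup, but I may be misreading it.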
I successfully computed some gradients via the CuClarabel backend and started adding a setup guide to the docs, but unfortunately when I tried the setup steps on a different machine I once again ran into the issues I saw ~ 1.5 weeks ago on a third machine. I opened this issue in CuClarabel with steps to reproduce my success and failure. I'm inclined to believe our various failures all stem from setup issues.