
GPU Solver Enabled Via Env Variable

SoundsSerious opened this issue Jul 25 '24

Created a PyNite `Solvers` module that handles importing the numpy/scipy solvers, or creating torch-based GPU solvers with the same names.

The torch GPU solver is only used when the env-var `PYNITE_GPU` is set to "True". Otherwise the user is alerted with the message "GPU not available: PYNITE_GPU environment variable not set to True".

If torch isn't installed, or otherwise has a configuration error, the user will get a corresponding error message beginning with "GPU not available: ".

The only other thing I would add to this is a `gpu` entry in the `extras_require` section of `setup.py` for torch.

The test suite passes in ~40 s vs. ~60 s on an NVIDIA 4070 on my machine.
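A minimal sketch of the selection logic described above (names and structure are illustrative, not the PR's actual code):

```python
import os
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

def get_solver():
    """Return a solve(A, b) callable, preferring the GPU when enabled.

    Falls back to scipy's sparse solver when PYNITE_GPU is unset or
    torch/CUDA is unavailable.
    """
    if os.environ.get("PYNITE_GPU") != "True":
        print("GPU not available: PYNITE_GPU environment variable not set to True")
        return spsolve
    try:
        import torch
        if not torch.cuda.is_available():
            raise RuntimeError("no CUDA device detected")

        def gpu_solve(A, b):
            # torch.linalg.solve expects dense tensors, so densify first
            A_t = torch.as_tensor(A.toarray(), device="cuda")
            b_t = torch.as_tensor(b, device="cuda")
            return torch.linalg.solve(A_t, b_t).cpu().numpy()

        return gpu_solve
    except Exception as e:
        print(f"GPU not available: {e}")
        return spsolve

solve = get_solver()
A = csc_matrix(np.array([[4.0, 1.0], [1.0, 3.0]]))
b = np.array([1.0, 2.0])
x = solve(A, b)
```

Either path returns a callable with the same signature, so downstream code never needs to know which backend was chosen.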

SoundsSerious avatar Jul 25 '24 19:07 SoundsSerious

This is an interesting thought. From what I've gathered, GPU calculations are much faster for dense matrices, but don't handle sparse matrices. I see your code converts the sparse matrices back to dense matrices prior to the GPU solution. That is a bit misleading to the user if they request a sparse solution and get a dense GPU solution instead.

Is GPU solution of a dense matrix faster than a CPU solution of a sparse matrix? I'm not sure. Most stiffness matrices are sparse (lots of zero terms except around the diagonal of the matrix). I think your code is going to be faster for small models, and possibly slower for large models.

It seems that your code is making the GPU solution the default. I think it would be better to leave the existing solvers as the default, with the GPU solver as a third option. If we play with it and feel it is faster all-around we could later change it to be the default.

I don't want to complain about code you shared freely, but it'd be nice to have more comments in the code. I'm not familiar with pytorch, so I had to use ChatGPT to help me dissect what was going on here. I appreciate you sharing this idea.

JWock82 avatar Sep 07 '24 19:09 JWock82

Thanks for offering up this contribution. At first impression, I thought GPU calcs would be a huge speed increase for model performance, and I was quite excited to have a simple way to run PyNite through the GPU. But after profiling, the matrix math wasn't the bottleneck, which is why I think you're only seeing a 33% improvement over the whole test suite. I think we should hold off on making this part of the library until we eliminate most of the other bottlenecks, and then circle back to something like this.

@JWock82 do you want to close this PR for now?

JBloss1517 avatar Apr 29 '25 19:04 JBloss1517

@JWock82 @JBloss1517

Apologies for not getting back to this sooner. I do think there are a lot of optimization caveats that go along with GPU computing, which is why I was somewhat hesitant to submit this; however, I thought it would be interesting to get a conversation going about it.

The torch docs provide some tips on optimization, but I think the long and short of it is that you need to build the matrix on the GPU inside PyNite to be efficient, which is a major change to low-level code. https://pytorch.org/docs/stable/sparse.html

Barring a complete rework of matrix assembly, the only post-construction option would be to convert the matrix after creation, along the lines of the examples in those docs.
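Conversion-after-assembly is cheap to sketch: scipy's CSR format already exposes the three arrays a torch sparse CSR tensor is built from (the torch call shown in the comment is from the docs linked above, not tested here):

```python
import numpy as np
from scipy.sparse import csr_matrix

# A small stiffness-like matrix assembled the usual way in scipy
A = csr_matrix(np.array([[4.0, 0.0, 1.0],
                         [0.0, 3.0, 0.0],
                         [1.0, 0.0, 2.0]]))

# scipy's CSR triplet maps directly onto torch's sparse CSR constructor:
#   torch.sparse_csr_tensor(crow, col, vals, size=A.shape, device="cuda")
crow, col, vals = A.indptr, A.indices, A.data
```

The copy itself is O(nnz), but it still happens once per solve, which is why building on the GPU in the first place would be the efficient route.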

Since every solve is some version of Ax - b = 0, I think maybe the solution is to redefine the solver process. Capytaine has accomplished a nice user-customizable framework for its linear solver: https://capytaine.org/stable/user_manual/resolution.html#engine

Capytaine's implementation looks pretty similar to this PR, but without the sparsity (it's a fully dense matrix for a BEM problem): https://github.com/capytaine/capytaine/pull/669/files

I think allowing user customization will increase adoption and let folks work on new types of analysis, without the pain of PR approval :)
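A hypothetical registry illustrating the idea (PyNite does not expose a hook like this today; all names here are made up):

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

# Hypothetical engine registry -- not part of PyNite's current API
_ENGINES = {}

def register_engine(name):
    """Decorator that registers a solve(A, b) callable under a name."""
    def wrap(fn):
        _ENGINES[name] = fn
        return fn
    return wrap

@register_engine("scipy_sparse")
def _scipy_sparse(A, b):
    return spsolve(A, b)

@register_engine("numpy_dense")
def _numpy_dense(A, b):
    return np.linalg.solve(A.toarray(), b)

def solve(A, b, engine="scipy_sparse"):
    """Dispatch Ax = b to the user-selected engine."""
    return _ENGINES[engine](A, b)

A = csc_matrix(np.array([[4.0, 1.0], [1.0, 3.0]]))
b = np.array([1.0, 2.0])
x = solve(A, b)
```

A user wanting a torch/GPU backend would just register their own engine, without any change to the library's assembly code.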

SoundsSerious avatar Apr 29 '25 22:04 SoundsSerious

Thanks for following up on this. The idea of being able to swap out matrix engines is an interesting one, but it would likely require a decent amount of work to make it easily swappable. Personally, I would rather tackle speeding up other areas of the project than build in the ability to swap out the matrix library, at least right now. Because scipy's sparse matrix isn't a slow implementation for matrix math, I don't think this is the right thing to add to the project at this point. Thanks again for the PR and ideas!

JBloss1517 avatar May 03 '25 16:05 JBloss1517

@JBloss1517 @JWock82 Sure thing, your call on this.

I agree scipy is a pretty good choice in general for a solver; however, performance is a relative metric. The speed-up offered by GPUs can help solve larger structural problems, or help with finding eigenvalues for vibrational problems, etc. My work with aerospace structures often involves a large number of thin panels, where stability is a critical issue.
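For context, the eigenvalue workloads I mean are the kind scipy's shift-invert Lanczos handles today; a sketch with a 1-D stiffness-like matrix (illustrative sizes, not a real panel model):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import eigsh

n = 500
# Tridiagonal "stiffness-like" SPD matrix
K = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")

# The lowest eigenvalues govern the fundamental vibration/buckling modes;
# shift-invert about sigma=0 targets that end of the spectrum.
vals, vecs = eigsh(K, k=3, sigma=0.0)
```

For this matrix the smallest eigenvalue is known analytically, 4·sin²(π/(2(n+1))), which makes it a convenient sanity check; it's this class of problem where a GPU eigensolver could pay off at larger n.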

Here is a benchmark I put together showing a similar GPU speed-up: https://github.com/capytaine/capytaine/pull/446#issuecomment-1875883490

SoundsSerious avatar May 05 '25 15:05 SoundsSerious