GPU Solver Enabled Via Env Variable
Created a PyNite Solvers module that handles importing the numpy/scipy solvers, or creating torch-based GPU solvers with the same names
The torch GPU solver is only used when the env-var PYNITE_GPU is "True". If the variable isn't set, the user is alerted with "GPU not available: PYNITE_GPU environmental variable not set to True".
If torch isn't installed or otherwise has a configuration error, the user will get a corresponding error message beginning with "GPU not available:"
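A minimal sketch of the selection logic described above (function names and fallback details here are illustrative, not the actual module code):

```python
import os


def get_solver():
    """Return a solve(A, b) callable: torch-backed if requested, scipy otherwise."""
    if os.environ.get("PYNITE_GPU") == "True":
        try:
            import torch  # optional GPU dependency

            def gpu_solve(A, b):
                # Densify if needed, move to the GPU, solve, and copy back
                A_d = A.toarray() if hasattr(A, "toarray") else A
                A_t = torch.as_tensor(A_d, dtype=torch.float64, device="cuda")
                b_t = torch.as_tensor(b, dtype=torch.float64, device="cuda")
                return torch.linalg.solve(A_t, b_t).cpu().numpy()

            return gpu_solve
        except Exception as err:  # torch missing or misconfigured
            print(f"GPU not available: {err}")
    else:
        print("GPU not available: PYNITE_GPU environmental variable not set to True")
    from scipy.sparse.linalg import spsolve  # CPU fallback
    return spsolve
```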
The only other thing I would add to this is a GPU entry in the extras_require section of setup.py for torch.
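A hypothetical sketch of that setup.py change, exposing torch as an optional extra so users can opt in with something like `pip install PyNite[GPU]` (package name and argument layout are illustrative):

```python
from setuptools import setup

setup(
    name="PyNiteFEA",
    # ... existing arguments ...
    extras_require={
        "GPU": ["torch"],  # installed only when the [GPU] extra is requested
    },
)
```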
The testing suite passes in ~40 s vs. ~60 s on an NVIDIA 4070 on my machine.
This is an interesting thought. From what I've gathered, GPU calculations are much faster for dense matrices, but don't handle sparse matrices. I see your code converts the sparse matrices back to dense matrices prior to the GPU solution. That is a bit misleading to the user if they request a sparse solution and get a dense GPU solution instead.
Is GPU solution of a dense matrix faster than a CPU solution of a sparse matrix? I'm not sure. Most stiffness matrices are sparse (lots of zero terms except around the diagonal of the matrix). I think your code is going to be faster for small models, and possibly slower for large models.
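The sparse-vs-dense question above can be poked at directly on the CPU side. This is only a rough sketch: the matrix is a banded, diagonally dominant stand-in for a stiffness matrix, and the dense `np.linalg.solve` call here is a proxy for what a dense GPU path would have to factor after the sparse-to-dense conversion. The crossover point is machine-dependent.

```python
import time
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 2000
# Symmetric banded matrix: nonzeros clustered around the diagonal,
# like a typical stiffness matrix
K = sp.diags(
    [np.full(n, 4.0), np.full(n - 1, -1.0), np.full(n - 1, -1.0)],
    [0, 1, -1], format="csc",
)
b = np.ones(n)

t0 = time.perf_counter()
x_sparse = spla.spsolve(K, b)          # sparse CPU path
t_sparse = time.perf_counter() - t0

t0 = time.perf_counter()
x_dense = np.linalg.solve(K.toarray(), b)  # densified path
t_dense = time.perf_counter() - t0

print(f"sparse: {t_sparse:.4f}s  dense: {t_dense:.4f}s")
```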
It seems that your code is making the GPU solution the default. I think it would be better to leave the existing solvers as the default, with the GPU solver as a third option. If we play with it and feel it is faster all-around we could later change it to be the default.
I don't want to complain about code you shared freely, but it'd be nice to have more comments in the code. I'm not familiar with pytorch, so I had to use ChatGPT to help me dissect what was going on here. I appreciate you sharing this idea.
Thanks for offering up this contribution. At first impression, I thought GPU calcs would be a huge speed increase for models, and I was quite excited to have a simple way to run PyNite through the GPU. But after profiling, the matrix math wasn't the bottleneck, which is why I think you're only seeing a 33% improvement over the whole of the test suite. I think we should hold off on making this part of the library until we eliminate most of the other bottlenecks, and then circle back to something like this.
@JWock82 do you want to close this PR for now?
@JWock82 @JBloss1517
Apologies for not getting back to this sooner. I do think there are a lot of optimization caveats that go along with GPU computing, which is why I was somewhat hesitant to submit this; however, I thought it would be interesting to get a conversation going about it.
The torch docs provide some tips on optimization, but the long and short of it is that PyNite would need to build the matrix on the GPU to be efficient, which is a major change to low-level code. https://pytorch.org/docs/stable/sparse.html
Barring a complete rework of matrix assembly, the only option post-construction would be to convert the matrix after creation, as per this example.
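For illustration, here is the triplet (COO) assembly pattern that on-device sparse construction would use; `torch.sparse_coo_tensor` takes the same index/value layout, so the scipy version below is a CPU stand-in for the idea. The indices and values are made up, not PyNite's actual assembly code.

```python
import numpy as np
import scipy.sparse as sp

rows, cols, vals = [], [], []
# e.g. accumulate each element stiffness term k at global DOFs (i, j);
# duplicate (i, j) entries are summed on conversion, which suits assembly
for i, j, k in [(0, 0, 4.0), (0, 1, -1.0), (1, 0, -1.0), (1, 1, 4.0)]:
    rows.append(i)
    cols.append(j)
    vals.append(k)

K = sp.coo_matrix((vals, (rows, cols)), shape=(2, 2)).tocsc()
# The torch analogue (untested here) would be roughly:
# K_t = torch.sparse_coo_tensor([rows, cols], vals, (2, 2), device="cuda")
print(K.toarray())
```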
Since every solve is some version of Ax - b = 0, I think maybe the solution is to redefine the solver process. This library has built a nice user-customizable framework for the linear solver: https://capytaine.org/stable/user_manual/resolution.html#engine
The implementation for that library looks pretty similar to this PR, but without the sparsity (it's a fully dense matrix for a BEM problem): https://github.com/capytaine/capytaine/pull/669/files
Allowing user customization, I think, will increase adoption and let folks work on new types of analysis without the pain of PR approval :)
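A minimal sketch of what a pluggable linear-solver hook could look like, in the spirit of capytaine's engine option. `Model` and its `solver` attribute are hypothetical here, not PyNite's current API:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve


class Model:
    """Hypothetical model with a swappable solve(A, b) -> x callable."""

    def __init__(self, solver=spsolve):
        # Any callable with this signature can be dropped in: scipy's
        # spsolve (default), a torch-backed GPU solver, PETSc, etc.
        self.solver = solver

    def solve_displacements(self, K, F):
        return self.solver(K, F)


# Default scipy path:
K = sp.csc_matrix([[4.0, -1.0], [-1.0, 4.0]])
F = np.array([1.0, 1.0])
d = Model().solve_displacements(K, F)
```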
Thanks for following up on this. The idea of being able to swap out matrix engines is an interesting one, but would also likely require a decent amount of work to make it easily swapable. Personally, I would rather tackle speeding up other areas of the project than building in an ability to swap out the matrix library--at least right now. Because scipy's sparse matrix isn't a slow implementation for matrix math, I don't think this would be the right thing to add to the project at this point. Thanks again for the PR and ideas!
@JBloss1517 @JWock82 Sure thing, your call on this.
I agree scipy is a pretty good choice in general for a solver; however, performance is a relative metric. The speed-up offered by GPUs can help solve larger structural problems, or help with finding eigenvalues for vibration problems, etc. My work with aerospace structures often involves a large number of thin panels, where stability is a critical issue.
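To make the eigenvalue use case concrete, here is an illustrative sparse modal-style solve with scipy's shift-invert eigensolver; a dense GPU eigensolver (e.g. `torch.linalg.eigh`) would be the alternative under discussion. The matrices are a toy banded stand-in, not a real panel model.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 200
# Toy banded "stiffness" matrix and identity "mass" matrix
K = sp.diags(
    [np.full(n, 4.0), np.full(n - 1, -1.0), np.full(n - 1, -1.0)],
    [0, 1, -1], format="csc",
)
M = sp.identity(n, format="csc")

# Lowest four modes of K v = w M v via shift-invert about sigma=0
w, v = eigsh(K, k=4, M=M, sigma=0.0, which="LM")
print(np.sort(w))
```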
Here is a benchmark I put together for a similar GPU speed-up: https://github.com/capytaine/capytaine/pull/446#issuecomment-1875883490