SpQR
Add Support for Efficient Inference
This PR adds support for the following:
- Efficient SpQR CUDA-based matvec kernel implementation for a subset of parameters
- Integration of said kernel for end-to-end inference
- Kernel benchmarks
- End-to-end inference demo and benchmarks
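
For reviewers unfamiliar with the format, here is a rough pure-Python reference of what a SpQR-style matvec computes: a group-wise dequantized dense product plus a sparse correction from outlier weights. This is a simplified sketch, not the CUDA kernel in this PR. The function name `spqr_like_matvec`, the argument layout, and the choice to add outliers on top of the dense term are assumptions made for illustration, and the second-level quantization of scales and zeros is omitted.

```python
from typing import List, Tuple


def spqr_like_matvec(
    q: List[List[int]],            # low-bit integer codes, shape [rows][cols]
    scales: List[List[float]],     # per-row, per-group scale, shape [rows][cols // group_size]
    zeros: List[List[float]],      # per-row, per-group zero point, same shape as scales
    group_size: int,               # number of consecutive columns sharing one scale/zero
    outliers: List[Tuple[int, int, float]],  # sparse (row, col, value) entries kept in full precision
    x: List[float],                # input vector, length cols
) -> List[float]:
    """Hypothetical reference: y = dequant(q) @ x + outliers @ x."""
    y = []
    for r, row in enumerate(q):
        acc = 0.0
        for c, code in enumerate(row):
            g = c // group_size  # index of the quantization group for this column
            acc += scales[r][g] * (code - zeros[r][g]) * x[c]
        y.append(acc)
    # Sparse correction: each outlier contributes value * x[col] to its row.
    for r, c, v in outliers:
        y[r] += v * x[c]
    return y


# Tiny illustrative inputs (made up, not taken from the PR's benchmarks).
q = [[0, 1, 2, 3], [3, 2, 1, 0]]
scales = [[0.5, 1.0], [1.0, 0.5]]
zeros = [[1, 2], [2, 0]]
outliers = [(0, 3, 4.0)]
x = [1.0, 2.0, 3.0, 4.0]
print(spqr_like_matvec(q, scales, zeros, 2, outliers, x))  # [19.5, 2.5]
```

A reference like this is mainly useful as a correctness oracle when comparing kernel output on small inputs.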