cudarc
More cudnn ops
Thanks for this amazing crate; it's been instrumental to candle. We've recently added a feature to use the cudnn conv2d, which sped things up a lot compared to our handcrafted kernel, and we would like to have cudnn support for more ops. We're mostly interested in:
- Conv2d backprop.
- Conv1d forward + backward.
- Maybe flash-attention/softmax/...

Are there any plans to add these to the cudnn safe API? If not, would you be ok with people making PRs to add them?