FasterTransformer
Sparsity support
Branch/Tag/Commit
main
Docker Image Version
pytorch
GPU name
A100
CUDA Driver
main
Reproduced Steps
In the support matrix, only BERT and the encoder support sparsity (on Ampere and later GPUs).
Do you plan to support sparsity for other models?
If not, what is the main reason?
Is it that there is no performance improvement, or that accuracy drops?
Thanks.