FasterTransformer

Sparsity support

Open zhang662817 opened this issue 2 years ago • 0 comments

Branch/Tag/Commit

main

Docker Image Version

pytorch

GPU name

A100

CUDA Driver

main

Reproduced Steps

In the support matrix, only BERT and the encoder support sparsity (Ampere and later).
Do you plan to support other models?

If not, what is the main reason?
No performance improvement, or an accuracy drop?

Thanks.

zhang662817, Oct 29 '23 08:10