Kimiko

Results 3 issues of Kimiko

https://github.com/NVIDIA/FasterTransformer Implementing FasterTransformer may increase speed and reduce memory usage for many models.

enhancement

@merrymercy Please check this GitHub issue.

# Description

https://arxiv.org/pdf/2306.06101.pdf
https://arxiv.org/abs/2305.14342

## Motivation and Context

## How has this been tested?

No

## Screenshots (if appropriate)

## Types of changes

## Social Handles (Optional)