fastertransformer_backend
Enable LLaMA model in FT backend
The existing FT backend throws an error for the LLaMA model. Will this ever work? I didn't see LLaMA defined under: https://github.com/NVIDIA/FasterTransformer/tree/main/src/fastertransformer/triton_backend
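
For context, a config.pbtxt along the lines below is roughly what I would expect to point at a converted LLaMA checkpoint (the paths, batch size, and parallelism values are placeholders, not a working setup); as far as I can tell there is no "llama" value the backend's `model_type` parameter accepts, since no such Triton model is defined under that directory:

```
# Hypothetical Triton model config for serving LLaMA through the FT backend.
# All values are illustrative placeholders; the "llama" model_type below is
# exactly what the backend currently rejects.
name: "fastertransformer"
backend: "fastertransformer"
max_batch_size: 8

# Input/output tensor definitions omitted for brevity.

parameters {
  key: "model_type"
  value: {
    string_value: "llama"   # unsupported: not among the types defined in triton_backend/
  }
}
parameters {
  key: "model_checkpoint_path"
  value: {
    string_value: "/models/llama/1/1-gpu"   # placeholder path to a converted checkpoint
  }
}
parameters {
  key: "tensor_para_size"
  value: {
    string_value: "1"
  }
}
parameters {
  key: "pipeline_para_size"
  value: {
    string_value: "1"
  }
}
```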