A-ML-ER

Results 6 comments of A-ML-ER

How do I convert the Llama structure into the FasterTransformer structure? It seems to have 32 layers with LlamaRotaryEmbedding.

Any update on multi-node deployment?

I0404 14:43:41.957637 63955 server.cc:594]
+-------------------+---------+-----------------------------------------------------------------------------------------------------+
| Model             | Version | Status                                                                                              |
+-------------------+---------+-----------------------------------------------------------------------------------------------------+
| fastertransformer | 1       | UNAVAILABLE: Not found: unable to load shared library: /opt/tritonserver/backends/fastertransformer |
|                   |...
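For anyone hitting the same "unable to load shared library" status, here is a minimal sketch that checks whether the backend library is present on disk. It assumes Triton's usual convention of looking for `libtriton_<backend>.so` inside `<backends_dir>/<backend>/`; the helper names `expected_backend_library` and `diagnose_backend` are hypothetical and not part of Triton. It only checks for the file, so a library that exists but fails to load because of a missing dependency would still need a separate look.

```python
from pathlib import Path

# Assumption: the default backend directory inside the Triton container.
DEFAULT_BACKENDS_DIR = "/opt/tritonserver/backends"


def expected_backend_library(backend_name: str,
                             backends_dir: str = DEFAULT_BACKENDS_DIR) -> Path:
    """Return the path where Triton is assumed to look for a backend library."""
    return Path(backends_dir) / backend_name / f"libtriton_{backend_name}.so"


def diagnose_backend(backend_name: str,
                     backends_dir: str = DEFAULT_BACKENDS_DIR) -> str:
    """Report whether the backend directory and its shared library exist."""
    lib = expected_backend_library(backend_name, backends_dir)
    if not lib.parent.is_dir():
        return f"missing backend directory: {lib.parent}"
    if not lib.is_file():
        return f"missing shared library: {lib}"
    return f"found: {lib}"


if __name__ == "__main__":
    print(diagnose_backend("fastertransformer"))
```

If the directory or the library is reported missing, the backend was probably never built or copied into the image, which would match the UNAVAILABLE status in the log above.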

/ft_workspace/fastertransformer_backend is the source, git-cloned from https://github.com/triton-inference-server/fastertransformer_backend.git

https://github.com/triton-inference-server/fastertransformer_backend.git, main branch, the latest commit.