A-ML-ER
How do I convert the Llama model structure into the FasterTransformer structure? It seems to have 32 layers with LlamaRotaryEmbedding.
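For context on what a conversion involves: FasterTransformer loads weights from raw binary files, one file per tensor per tensor-parallel rank, rather than from a Hugging Face checkpoint. Below is a minimal, illustrative sketch of that layout only; the tensor name, hidden size, and output directory are assumptions for illustration, not the backend's actual converter (the official repo ships conversion scripts for GPT-style models that you would adapt for Llama's 32 layers).

```python
import os
import numpy as np

def save_ft_tensor(out_dir: str, name: str, tensor: np.ndarray, tp: int = 1) -> None:
    """Save one weight tensor in a raw-binary layout like FasterTransformer's,
    column-split across `tp` tensor-parallel ranks (one .bin file per rank)."""
    os.makedirs(out_dir, exist_ok=True)
    for rank, shard in enumerate(np.split(tensor, tp, axis=-1)):
        # FT-style converters store weights as flat fp16 binaries on disk
        shard.astype(np.float16).tofile(os.path.join(out_dir, f"{name}.{rank}.bin"))

# Illustrative example: split one layer's fused QKV weight for 2-way tensor parallelism
# (4096 is a hypothetical hidden size; real Llama sizes depend on the variant)
w = np.ones((4096, 3 * 4096), dtype=np.float32)
save_ft_tensor("ft_model/1-gpu", "model.layers.0.attention.query_key_value.weight", w, tp=2)
```

Each of a model's layers would get the same treatment, which is why the converter has to know the layer count and naming scheme of the source checkpoint.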
Should we build the project inside the container, or build it on ECS?
Any update on multi-node deployment?
I0404 14:43:41.957637 63955 server.cc:594]
+-------------------+---------+-----------------------------------------------------------------------------------------------------+
| Model             | Version | Status                                                                                              |
+-------------------+---------+-----------------------------------------------------------------------------------------------------+
| fastertransformer | 1       | UNAVAILABLE: Not found: unable to load shared library: /opt/tritonserver/backends/fastertransformer |
| ...
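A quick diagnostic sketch for this error: Triton loads each backend from a `libtriton_<backend>.so` under the backends directory, so the first thing to check is whether the built library is actually present at the path in the log (the library filename follows Triton's naming convention; adjust if your build places it elsewhere):

```shell
# Check whether the FasterTransformer backend library exists where Triton looks for it
BACKEND_DIR=/opt/tritonserver/backends/fastertransformer
if [ -f "$BACKEND_DIR/libtriton_fastertransformer.so" ]; then
    echo "backend library found"
else
    echo "backend library missing: rebuild the backend and copy libtriton_fastertransformer.so into $BACKEND_DIR"
fi
```

If the file is missing, the backend build step did not run or its install step copied the library to a different prefix.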
The source in /ft_workspace/fastertransformer_backend was git cloned from https://github.com/triton-inference-server/fastertransformer_backend.git
I'm using the main branch of https://github.com/triton-inference-server/fastertransformer_backend.git, the latest one.