
Results: 73 fastertransformer_backend issues, sorted by recently updated

I ran a performance test of a 3.5B BLOOM 1-GPU model using perf_analyzer; the results are |batch size | avg latency | | ----------- | ----------- | | 1 |...
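For anyone reproducing this, a minimal perf_analyzer invocation that sweeps batch sizes against a deployed FasterTransformer model might look like the sketch below; the model name `fastertransformer`, the endpoint, and the input-data file are assumptions, not details from the issue.

```shell
# Hypothetical sketch: sweep batch sizes against a running Triton server.
# Model name, URL, and input JSON are placeholders -- adjust to your deployment.
for BATCH in 1 2 4 8; do
  perf_analyzer -m fastertransformer \
      -u localhost:8000 \
      -b ${BATCH} \
      --input-data input_data.json \
      --concurrency-range 1
done
```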

How can we run Triton with the FasterTransformer backend for flan-ul2-alpaca-lora? Please share the steps.
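No full steps have been posted in this thread; a minimal launch sketch, assuming the checkpoint has already been converted to FasterTransformer format and placed under the model repository with a matching config.pbtxt (the command pattern mirrors the GPT-J example further down this list):

```shell
# Hypothetical layout: converted weights under all_models/flan-ul2/fastertransformer/1/
# with a config.pbtxt alongside them. Paths and GPU selection are placeholders.
CUDA_VISIBLE_DEVICES=0 mpirun -n 1 --allow-run-as-root \
    /opt/tritonserver/bin/tritonserver \
    --model-repository=${WORKSPACE}/all_models/flan-ul2/
```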

Can anyone share a working config file for flan-ul2-alpaca-lora for Triton?

Can anyone share a working config file for flan-ul2 for Triton?
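No verified config has been shared in these threads. Below is a hedged sketch based on the T5 example config in this repo's docs, on the assumption that flan-ul2 is served as a T5-family model (`model_type` "T5"). The checkpoint path, parallelism sizes, and data type are placeholders, and the required input/output tensor sections from the T5 example are omitted for brevity.

```
# Hypothetical config.pbtxt sketch for a T5-family model such as flan-ul2.
name: "fastertransformer"
backend: "fastertransformer"
default_model_filename: "t5"
max_batch_size: 128
parameters {
  key: "model_type"
  value: { string_value: "T5" }
}
parameters {
  key: "model_checkpoint_path"
  value: { string_value: "/ft_workspace/all_models/flan-ul2/fastertransformer/1/1-gpu" }
}
parameters {
  key: "tensor_para_size"
  value: { string_value: "1" }
}
parameters {
  key: "pipeline_para_size"
  value: { string_value: "1" }
}
parameters {
  key: "data_type"
  value: { string_value: "fp16" }
}
```

For the LoRA variant, the adapter weights would presumably need to be merged into the base model before running the FasterTransformer checkpoint converter, since the backend loads a single converted checkpoint.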

Is it possible to integrate converter scripts for the GPTBigCodeForCausalLM architecture from the transformers library? This would enable integration of models like StarCoder / SantaCoder. With this, community projects like...

### Description ```shell main, A100 ``` ### Reproduced Steps ```shell Hi, I'm experimenting with GPT models using Triton + fastertransformer_backend. I installed it according to docs/gpt_guide.md in the docs...

bug

### Description ```shell I start the Triton server with '--model-control-mode poll'. A segmentation fault occurs when modifying the model directory. ``` ### Reproduced Steps ```shell 1. CUDA_VISIBLE_DEVICES=3,4,5,6 /opt/tritonserver/bin/tritonserver --model-repository=/ft_workspace/all_models/t5/ --http-port 8008 --model-control-mode poll...

bug

### Description ```shell UNAVAILABLE: Not found: unable to load shared library: /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so: undefined symbol: _ZN22ParallelGptTritonModelI6__halfE8toStringB5c I didn't change any ParallelGptTritonModel-related code, but when starting the Triton server it always fails. ```...
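A hedged way to triage an undefined-symbol failure like this is to check whether the backend library exports or merely requires the symbol, and what it demangles to:

```shell
# Inspect dynamic symbols in the backend library from the error message.
# 'U' marks symbols the library requires but does not define.
nm -D /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so \
  | grep toString | c++filt
```

The truncated suffix in the mangled name looks like a C++11 ABI tag, so a `_GLIBCXX_USE_CXX11_ABI` mismatch between the backend build and its dependencies is one plausible cause; that is an inference from the symbol name, not something confirmed in the issue.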

bug

### Description ```shell Branch: main Docker Version: 20.10.21 GPU Type: A100 40GB Triton Docker Image: triton_with_ft:22.12 ``` ### Reproduced Steps I'm following the instructions by @byshiue to test Flan-T5 with...
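For anyone following along, launching the container for this kind of test typically looks like the sketch below; only the image tag comes from the report, while the mount path and shared-memory size are assumptions.

```shell
# Hypothetical container launch for the triton_with_ft:22.12 image from the report.
docker run -it --rm --gpus=all --shm-size=4g \
    -v $(pwd)/ft_workspace:/ft_workspace \
    triton_with_ft:22.12 bash
```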

bug

### Description Hi, I'm trying to run triton:22.03 / FasterTransformer within a Kubernetes pod. Running ``` CUDA_VISIBLE_DEVICES=0 mpirun -n 1 --allow-run-as-root /opt/tritonserver/bin/tritonserver --model-repository=${WORKSPACE}/all_models/gptj/ ``` gives me this error: ``` what():...

bug