fastertransformer_backend
### Description

System: H100 and L40, driver 530.30.02, CUDA 12.1. The build is failing with the following error no matter which branch is used (main, v1.4, fix/multi_instance, etc.):

```shell
[ 54%] Built...
```
### Description

Branch: `v1.4`, Docker version: `22.12`. `huggingface_bert_convert.py` can't convert some keys:

```shell
python3 FasterTransformer/examples/pytorch/bert/utils/huggingface_bert_convert.py \
    -in_file bert-base-uncased/ \
    -saved_dir ${WORKSPACE}/all_models/bert/fastertransformer/1/ \
    -infer_tensor_para_size 1
```

Response:

```
=============== Argument ===============...
```
### Description

On the main branch as of 02/13/2023, the build crashes at 57% with no additional information. I was able to successfully build using 22.09 today to validate that nothing on...
Looks like the answer is no, and it will fail if we load a DeBERTa model: https://github.com/triton-inference-server/fastertransformer_backend/blob/main/src/libfastertransformer.cc#L333. Is there a future plan to support DeBERTa?
### Description

The Docker image built fine using the older version mentioned in the README (22.12), but building with the latest Docker image (23.05) fails. See this log...
I have 4 GPUs and 3 models, called small, medium, and large. I want to deploy the small model on GPU 0, the medium model on GPU 1, and the large model on...
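One common way to pin a model to a specific GPU in Triton is the `instance_group` setting in each model's `config.pbtxt`. A minimal sketch for the `small` model is below; the model name, batch size, and backend are assumptions taken from the question, not a verified configuration:

```protobuf
# config.pbtxt for the "small" model (hypothetical layout)
name: "small"
backend: "fastertransformer"
max_batch_size: 8
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [ 0 ]   # pin this model's instances to GPU 0
  }
]
```

The `medium` and `large` models would carry the same block with `gpus: [ 1 ]` and `gpus: [ 2 ]` respectively, leaving the remaining GPU free.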
fix typo in docker run script https://github.com/triton-inference-server/fastertransformer_backend#rebuilding-fastertransformer-backend-optional
Poll failed for model directory 'ensemble': output 'OUTPUT_0' for ensemble 'ensemble' is not written
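This error usually indicates that no step in the ensemble's `ensemble_scheduling` block routes a tensor to the ensemble-level output `OUTPUT_0` via an `output_map`. A hedged sketch of the relevant config is below; every model name, tensor name, and data type here is an assumption for illustration:

```protobuf
# ensemble config.pbtxt sketch (all names are hypothetical)
name: "ensemble"
platform: "ensemble"
max_batch_size: 8
input [ { name: "INPUT_0" data_type: TYPE_STRING dims: [ 1 ] } ]
output [ { name: "OUTPUT_0" data_type: TYPE_STRING dims: [ 1 ] } ]
ensemble_scheduling {
  step [
    {
      model_name: "fastertransformer"
      model_version: -1
      input_map { key: "input_ids" value: "INPUT_0" }
      # "OUTPUT_0" must appear as a value in some step's output_map;
      # otherwise Triton reports that the ensemble output is not written.
      output_map { key: "output_ids" value: "OUTPUT_0" }
    }
  ]
}
```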
Hi, when I ensemble a fastertransformer_backend GPT model, loading the ensemble model fails with the following error when starting the server. Could you please give some advice? Thanks.

```
CUDA_VISIBLE_DEVICES="0,1"...
```
Hi there, I'm new to the FasterTransformer backend, and I'm curious why we need to set max_batch_size to 1 when interactive mode is enabled. The documentation says that...