fastertransformer_backend
### Description ``` Branch: main GPU: NVIDIA V100S Docker version: 20.10.16 ``` When re-loading a model that has previously been loaded and unloaded one time, the backend crashes with the...
I followed the tutorial provided [here](https://github.com/triton-inference-server/fastertransformer_backend/blob/22dba92dc1cbd367d119520013ec365b313a63ba/docs/gptj_guide.md). I am able to run GPTJ-B on a single node. However, when I try the multi-node inference example with the following command on two...
### Description ```shell Branch: main Docker version: 22.03 GPU type: 2x NVIDIA RTX A6000 ``` ### Reproduced Steps 1. Load a model with the fastertransformer backend. 2. Make a query...