tensorrtllm_backend
The Triton TensorRT-LLM Backend
### System Info

- CPU: x86
- OS: Ubuntu
- GPU: A100-SXM x 8
- Driver Version: 470.161.03
- CUDA Version: 12.2
- trtllm: 0.6.1
- triton: 2.1.0

### Who can help?

_No response_

### Information

- [X] The official...
```
{
  name: "OUT_CUM_LOG_PROBS"
  data_type: TYPE_FP32
  dims: [ -1 ]
},
{
  name: "OUT_OUTPUT_LOG_PROBS"
  data_type: TYPE_FP32
  dims: [ -1, -1 ]
}
```

I get the output this way and...
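For context on how these two outputs relate: assuming `OUT_OUTPUT_LOG_PROBS` holds per-token log probabilities for each beam and `OUT_CUM_LOG_PROBS` is the per-beam cumulative log probability, the cumulative value is (up to floating-point rounding) the sum of the per-token values. A minimal numpy sketch with made-up numbers:

```python
import numpy as np

# Hypothetical values: rows are beams, columns are generated tokens.
# This mirrors the declared dims: [-1, -1] for per-token log probs,
# [-1] for the per-beam cumulative log probs.
output_log_probs = np.array([[-0.1, -0.5, -0.2],
                             [-0.3, -0.4, -0.9]], dtype=np.float32)

# The cumulative log probability of each beam is the sum over its tokens.
cum_log_probs = output_log_probs.sum(axis=1)
print(cum_log_probs)  # one scalar per beam
```

This is only an illustration of the shapes and the sum relationship, not the backend's internal computation.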
### System Info

- CPU architecture: x86_64
- CPU/Host memory size: 64GB
- GPU properties
  - GPU name: 1x NVIDIA V100
  - GPU memory size: 32GB
- Libraries
  - TensorRT-LLM...
I installed tensorrtllm_backend in the following way:

1. `docker pull nvcr.io/nvidia/tritonserver:23.12-trtllm-python-py3`
2. `docker run -v /data2/share/:/data/ -v /mnt/sdb/benchmark/xiangrui:/root -it -d --cap-add=SYS_PTRACE --cap-add=SYS_ADMIN --security-opt seccomp=unconfined --gpus=all --shm-size=16g --privileged --ulimit memlock=-1 --name=develop...
Hi, I'm trying to use TensorRT-LLM with Triton server, but it cannot find my model. Any idea why? It looks like my file is invalid: `/tensorrtllm_backend/triton_model_repo/tensorrt_llm_bls/config.pbtxt`. Here is the config.pbtxt...
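For anyone debugging a similar "model not found" error: Triton generally refuses to load a model whose `config.pbtxt` is malformed or whose `name` does not match its directory in the model repository. Below is a hypothetical minimal sketch of the required structure; the tensor names, types, and shapes here are placeholders, not the actual BLS model's interface:

```
# Hypothetical minimal config.pbtxt sketch -- adapt to your deployment.
name: "tensorrt_llm_bls"   # must match the model's directory name
backend: "python"          # the BLS model is a Python-backend model
max_batch_size: 8

input [
  {
    name: "text_input"     # placeholder tensor name
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
output [
  {
    name: "text_output"    # placeholder tensor name
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
```

If any of these blocks has a syntax error (an unclosed brace, a stray comma), Triton's startup log usually reports a parse failure for that model rather than listing it as READY.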
### System Info

- CPU: x86_64
- GPU: NVIDIA L20
- TensorRT branch: v0.8.0
- CUDA: NVIDIA-SMI 535.154.05, Driver Version: 535.154.05, CUDA Version: 12.3

### Who can help?

@byshiue

### Information

- [X] The...
### System Info

- Environment: 2x NVIDIA A100 with NVLink
- TensorRT-LLM Backend version: v0.8.0
- LLAMA2 engine built with paged_kv_cache, tp_size 2, world size 2
- x86_64 arch

### Who can help?...
The server seems to be OK with the following log.

```
I1212 03:29:51.067415 37860 server.cc:674]
+----------------+---------+--------+
| Model          | Version | Status |
+----------------+---------+--------+
| ensemble       | 1       | READY...
```
Environment

- CPU architecture: x86_64
- CPU/Host memory size (if known): 167GB
- GPU properties
  - GPU name: A100
  - GPU memory size: 80GB
- Libraries
  - TensorRT-LLM backend branch or tag: main
  - TensorRT-LLM backend commit (if...
### System Info

* CPU Architecture - x86_64
* CPU/Host memory size - 330GB from `/proc/meminfo`
* GPU name - NVIDIA H100 80GB HBM3
* GPU memory size - 81559MiB...