sdecoder

Results: 6 issues by sdecoder

Greetings, everyone. 1. Our hardware configuration is one GPU server with 4x A30 (24 GB), Ubuntu Server OS, a general-purpose server CPU, and 512 GB+ of memory. 2. We are now attempting...

Greetings, everyone. 1. Server configuration: a modern multi-core CPU, large memory, and a relatively weak GPU with insufficient VRAM (16 GB); 2. In such a case, it is impossible to use the GPU to...

Greetings, everyone. 0. I am trying to use TensorRT-LLM [branch v0.12.0-jetson] to deploy the microsoft--Phi-3-medium-128k-instruct LLM. The guidance can be found here: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.12.0-jetson/examples/phi 1. The scripts are listed as follows: 1.1...


Greetings, everyone. 0. I am trying to use TensorRT-LLM to deploy the Gemma2 LLM on the **Jetson AGX Orin platform**. 1. I am following these instructions: https://github.com/NVIDIA/TensorRT-LLM/tree/v0.12.0-jetson/examples/gemma 2. I downloaded...

Greetings, I have come across the following issue when trying to build the TensorRT-LLM backend for Triton server: **/home/nvidia/projects/triton-inference-server/tensorrtllm_backend/inflight_batcher_llm/../tensorrt_llm/cpp/include/tensorrt_llm/common/dataType.h:40:30: error: ‘kFP4’ is not a member of ‘nvinfer1::DataType’; did you mean ‘kFP8’?** I...

Greetings, everyone. Right now I am working on this project: [GitHub - facebookresearch/sam2: The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links...
