bobbych94

Results 16 comments of bobbych94

```python
import json
import trt_pose.coco
import trt_pose.models
import torch


class InputReNormalization(torch.nn.Module):
    """
    This defines "(input - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]" custom operation
    to conform to "Unit" normalized...
```

> @nanmi Did you manage to run yolox-tiny or yolox-nano?

I haven't tried it, but please share your results if you verify it ☺

We do not need to export NMS. NMS is handled in the post-processing step, outside the exported model.
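To illustrate what keeping NMS in post-processing can look like, here is a minimal pure-Python sketch of greedy NMS applied to boxes after inference. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the `0.45` threshold are assumptions for illustration, not code from the repository being discussed.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.45) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

Because this runs on the raw model outputs, the exported graph itself only needs to produce boxes and scores.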

Alternatively, can you recommend reference code from other repositories that apply DCA? Could you share it?

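> Hello, and thank you for your interest! DCA can be used with almost all LLMs published on Hugging Face. If you find a particular model challenging, please feel free to open an issue.
>
> As for CUDA optimization, we are actively working on it, but compared with the original inference code, the code in this repository should not have noticeable GPU memory or inference time problems.

Thanks for replying to the message. I want to apply DCA, a super useful technology, under the CUDA framework....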

It may be caused by too many operators in the computation graph. How can it be opened in this case? Do you have any suggestions?

> It seems H20's CUDA architecture is not recognized. Can you specify the environment variable:
>
> ```
> export TORCH_CUDA_ARCH_LIST=9.0
> ```

That does not work. I have tried setting the environment...
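When debugging this kind of issue, it can help to confirm what value the build actually sees. The following is a minimal sketch, assuming the usual `TORCH_CUDA_ARCH_LIST` format of space- or semicolon-separated entries such as `8.0;8.6+PTX`. The helper name `parse_arch_list` is hypothetical, and the comparison against the device only runs if PyTorch is installed and a GPU is visible.

```python
import os
from typing import List, Tuple


def parse_arch_list(value: str) -> List[Tuple[int, int]]:
    """Parse entries like "8.0;8.6+PTX 9.0" into (major, minor) pairs.

    Entries that are not plain "major.minor" numbers (for example named
    architectures) are skipped in this sketch.
    """
    archs: List[Tuple[int, int]] = []
    for token in value.replace(";", " ").split():
        token = token.split("+")[0]  # drop a "+PTX" suffix if present
        major, _, minor = token.partition(".")
        if major.isdigit() and minor.isdigit():
            archs.append((int(major), int(minor)))
    return archs


if __name__ == "__main__":
    requested = parse_arch_list(os.environ.get("TORCH_CUDA_ARCH_LIST", ""))
    print("TORCH_CUDA_ARCH_LIST ->", requested)

    try:
        import torch  # optional: only needed for the device comparison
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        detected = tuple(torch.cuda.get_device_capability(0))
        print("device capability:", detected)
        print("covered by arch list:", detected in requested)
```

If the variable prints as an empty list, it was probably not exported in the shell that launched the build.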

> Hello! May I ask if you can share the steps to reproduce the issue?

Do you mean building the model or building the tensorrt_llm source code? If you could add...

> > Any recent updates regarding the support for DeepSeek MOE?
>
> DeepSeek MOE will hopefully appear in the main branch next week, DeepSeek v2 (MLA+MOE) hopefully will...

> DeepSeek v1 is ready to go and should appear in the main branch early next week. We are still tuning v2, and we are targeting to get the close perf...