
Request: use TensorRT to accelerate training and inference

Open WSC741606 opened this issue 1 month ago • 0 comments

Describe the feature

TensorRT 10 has been released, along with TensorRT-LLM. Could they be used to accelerate training and inference?

Paste any useful information

The following is from an NVIDIA promotional email:

The TensorRT ecosystem of API releases include TensorRT 10.0, TensorRT-LLM 0.10, and TensorRT Model Optimizer 0.11.

Highlights from this release include:

- TensorRT 10: support for weight-stripped engines, weight offload for NVIDIA Grace Hopper™ systems, Python 3.12
- TensorRT-LLM 0.10: Llama3, Phi3, Grok1, and more; FP8 MoEs; improved API simplicity and consolidation
- TensorRT Model Optimizer 0.11: provides state-of-the-art techniques like quantization and sparsity to reduce model complexity

WSC741606 • May 16 '24 03:05