NVIDIA AMMO documentation

Open fedem96 opened this issue 1 year ago • 9 comments

Is there any official documentation for the NVIDIA AMMO toolkit? If so, where is it?

In particular, I'd be interested in documentation about:

  • implemented features
  • supported quantization techniques for each model type
  • changelog between versions

@Tracin @juney-nvidia

fedem96 avatar Mar 28 '24 13:03 fedem96

@RalphMao do you have any comments on this ask? :)

juney-nvidia avatar Apr 01 '24 07:04 juney-nvidia

Same. How can I find the source code of this library? I want to write a custom quantization pipeline for encoder-decoder models like T5.

dmitrymailk avatar Apr 17 '24 15:04 dmitrymailk

same question here.

yao-matrix avatar May 07 '24 07:05 yao-matrix

x2

puppetm4st3r avatar May 09 '24 05:05 puppetm4st3r

Hi folks! Are there updates on the docs?

ChristianPala avatar May 17 '24 08:05 ChristianPala

+1

lix19937 avatar May 17 '24 10:05 lix19937

Hi all, thank you for your interest. The AMMO toolkit has been renamed to "TensorRT Model Optimizer". Documentation is available at https://nvidia.github.io/TensorRT-Model-Optimizer/ and examples for Model Optimizer are available at https://github.com/NVIDIA/TensorRT-Model-Optimizer?tab=readme-ov-file

RalphMao avatar May 17 '24 20:05 RalphMao

> Same. How I can find source code of this library? I want to write custom quantization pipeline for encoder-decoder models like T5.

The library is available on PyPI as source-available (rather than open source). You can access most of the files, but some files don't have approval for open-sourcing (yet).

RalphMao avatar May 17 '24 20:05 RalphMao

Hi @dmitrymailk, I am also exploring ways to run a 4-bit quantized encoder-decoder model in TensorRT-LLM. Were you able to make any progress on that front?

ashwin-js avatar May 20 '24 02:05 ashwin-js

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.

github-actions[bot] avatar Jun 20 '24 01:06 github-actions[bot]