TensorRT-LLM
                                
NVIDIA AMMO documentation
Is there any official documentation of NVIDIA AMMO toolkit? If so, where is it?
In particular, I'd be interested in documentation about:
- implemented features
- supported quantization techniques for each model type
- changelog between versions
@Tracin @juney-nvidia
@RalphMao do you have any comments on this ask? :)
Same question. How can I find the source code of this library? I want to write a custom quantization pipeline for encoder-decoder models like T5.
same question here.
x2
Hi folks! Are there updates on the docs?
+1
Hi all, thank you for your interest. The AMMO toolkit has been renamed to "TensorRT Model Optimizer" and the documentation is available at https://nvidia.github.io/TensorRT-Model-Optimizer/ . Examples related to Model Optimizer are available at https://github.com/NVIDIA/TensorRT-Model-Optimizer?tab=readme-ov-file
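For anyone who wants a feel for what a post-training quantization pass does before digging into the Model Optimizer docs, here is a minimal, dependency-free sketch of symmetric per-tensor INT8 weight quantization. This is a toy illustration of the general technique, not the Model Optimizer API; all names here are my own.

```python
def quantize_int8(weights):
    """Symmetric per-tensor INT8: w_q = clamp(round(w / scale), -127, 127),
    where scale maps the largest absolute weight onto 127."""
    amax = max(abs(w) for w in weights)
    scale = amax / 127.0 if amax > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int codes."""
    return [x * scale for x in q]

w = [0.5, -1.27, 0.0, 1.27]
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)  # close to w, up to rounding error
```

Real toolkits refine this with per-channel scales, calibration over activation statistics, and formats such as INT4-AWQ or FP8, which is where the Model Optimizer documentation linked above comes in.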
> Same question. How can I find the source code of this library? I want to write a custom quantization pipeline for encoder-decoder models like T5.
The library is available on PyPI with the source open (rather than fully open source). You can access most of the files, but some files don't have approval for open-source release (yet).
Hi @dmitrymailk, I am also exploring ways to run a 4-bit quantized encoder-decoder model in TensorRT-LLM. Were you able to make any progress on that front?
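As background on the 4-bit question: 4-bit weights are commonly stored two signed nibbles per byte. Here is a toy sketch of that packing (my own illustration under assumed names and layout, not the storage format TensorRT-LLM actually uses):

```python
def pack_int4(q):
    """Pack signed 4-bit values (range -8..7) two per byte:
    low nibble first, high nibble second; pad odd lengths with 0."""
    if len(q) % 2:
        q = q + [0]
    return bytes(((hi & 0xF) << 4) | (lo & 0xF)
                 for lo, hi in zip(q[::2], q[1::2]))

def unpack_int4(packed):
    """Inverse of pack_int4: split each byte into two signed nibbles."""
    out = []
    for b in packed:
        for nib in (b & 0xF, b >> 4):
            out.append(nib - 16 if nib >= 8 else nib)  # sign-extend 4 bits
    return out

codes = [-3, 7, 1, -8]
assert unpack_int4(pack_int4(codes)) == codes
```

The packing halves the memory footprint versus INT8; production kernels additionally store per-group scales next to the packed weights.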
This issue is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 15 days.