TensorRT-LLM
enable medusa int8 weight only quantization
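As a rough illustration of what int8 weight-only quantization means in TensorRT-LLM (not the PR's actual code), the public QuantConfig/QuantAlgo API can request int8 weights with fp16 activations; the Medusa-specific wiring added by this PR lives in the checkpoint-conversion path. Import paths below assume a recent TensorRT-LLM release.

```python
# Minimal sketch: requesting int8 weight-only quantization (W8A16)
# via TensorRT-LLM's quantization config. Illustrative only; the PR
# hooks this up for Medusa checkpoints during conversion/build.
from tensorrt_llm.quantization import QuantAlgo
from tensorrt_llm.models.modeling_utils import QuantConfig

# W8A16 = int8 weights, fp16 activations (weight-only quantization).
quant_config = QuantConfig(quant_algo=QuantAlgo.W8A16)
print(quant_config.quant_algo)
```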
@kaiyux, could you help review it?
@XiaobingSuper Thanks for the support! We will review the changes in the internal codebase and get back to you.
Hi @XiaobingSuper, we've merged your changes into the main branch. Thanks a lot for your contribution.