LightCompress
How to use TensorRT-LLM as a backend
As described in the title.
You can set save_fp to True in llmc. Then you can use TensorRT-LLM's AMMO toolkit to convert the saved model into a naive quant engine.
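For reference, here is a minimal sketch of flipping that flag programmatically. It assumes the llmc config has been loaded into a plain dict and that save_fp sits under a `save` section next to a `save_path` entry. That layout, the default path, and the helper name `enable_save_fp` are illustrative assumptions, so check them against the config file you actually use.

```python
import copy


def enable_save_fp(config, save_path="./save"):
    """Return a copy of a config dict with save_fp switched on.

    Assumption: save_fp lives under a top-level "save" section.
    The helper name and default save_path are hypothetical.
    """
    updated = copy.deepcopy(config)               # leave the caller's dict untouched
    save_section = updated.setdefault("save", {})  # create the section if it is missing
    save_section["save_fp"] = True                 # the setting mentioned above
    save_section.setdefault("save_path", save_path)  # keep an existing path if present
    return updated


if __name__ == "__main__":
    example = {"save": {"save_fp": False, "save_path": "./save"}}
    print(enable_save_fp(example))
```

This only covers the llmc side; the AMMO conversion step is not shown here.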
Thanks for your reply. I have set save_fp to True in llmc, and these are the files llmc saved. How can I use TensorRT-LLM's AMMO to convert them into a naive quant engine?
Following up: I still need your help with this.
The remaining steps require modifying some code to change the default settings in TensorRT-LLM. To make the tool more convenient to use, we are working on an official doc page for it. Please be patient and wait for our update.