LightCompress

How to use TensorRT-LLM as backend

Open Worromots opened this issue 1 year ago • 2 comments

As described in the title.

Worromots, May 20 '24 09:05

You can set save_fp to True in llmc. Then you can use trt-llm ammo to convert the saved model into a naive quant engine.

helloyongyang, May 20 '24 09:05

Thanks for your reply. I have set save_fp to True in llmc, and these are the files saved by llmc. How can I use trt-llm ammo to convert them into a naive quant engine? [image]

Worromots, May 22 '24 12:05

Remark: I still need your help.

Worromots, Jun 06 '24 09:06

The subsequent process requires modifying some code to change the default settings in TensorRT-LLM. To make our tool more convenient to use, we are working quickly on an official documentation page for it. Please be patient and wait for our update.

Harahan, Jun 07 '24 21:06