Ai-ZL

Results 4 issues of Ai-ZL

When I run the quantization step in this code, it stops with an error: CUDA out of memory. What can I do to solve this issue in...

question

Bug Report. Describe the bug: I use the quantize_static and convert_float_to_float16 functions in onnx to convert an fp32 model to an fp16 + int8 model. The fp16 model can run inference through...

bug

Hello! I noticed that quantization was used in the article https://arxiv.org/pdf/2102.01547. Could you please tell me what quantization method was used? Additionally, I would like to ask if there are...

Hello, does llmc support quantization of the Whisper model? If not, what modifications would need to be made to llmc to support it?