mlc-llm [Question] Deployment of Pruned Models

Open qianjyM opened this issue 9 months ago • 0 comments

❓ General Questions

Hi there,

I would like to ask how a pruned model can be deployed with MLC-LLM. Because the q/k/v dimensions differ from layer to layer, the model was saved with torch.save rather than save_pretrained, so I am not sure how to use it with MLC-LLM. Could you please give me some tips or advice?
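
To make the situation concrete, here is a minimal sketch of how the per-layer q/k/v shapes could be listed and checked for uniformity. It assumes Llama-style parameter names (`layers.N.self_attn.q_proj.weight` and so on), which may not match the actual checkpoint. The helper names and the example shapes are hypothetical and only for illustration. It works on a plain name-to-shape mapping, so it does not need PyTorch to run.

```python
import re
from collections import defaultdict

# Assumption: Llama-style parameter names; a pruned checkpoint may differ.
_QKV_PATTERN = re.compile(r"layers\.(\d+)\.self_attn\.([qkv])_proj\.weight$")


def collect_qkv_shapes(shapes):
    """Group q/k/v projection weight shapes by layer index."""
    per_layer = defaultdict(dict)
    for name, shape in shapes.items():
        match = _QKV_PATTERN.search(name)
        if match:
            layer, proj = int(match.group(1)), match.group(2)
            per_layer[layer][proj] = tuple(shape)
    return dict(sorted(per_layer.items()))


def is_uniform(per_layer):
    """Return True if every layer has identical q/k/v shapes."""
    distinct = {tuple(sorted(projs.items())) for projs in per_layer.values()}
    return len(distinct) <= 1


# Hypothetical shapes for a two-layer pruned model (not real data).
# With PyTorch, a similar mapping could be built from a loaded module as
# {name: tuple(t.shape) for name, t in model.state_dict().items()}.
example_shapes = {
    "model.embed_tokens.weight": (32000, 4096),  # ignored: not q/k/v
    "model.layers.0.self_attn.q_proj.weight": (3072, 4096),
    "model.layers.0.self_attn.k_proj.weight": (3072, 4096),
    "model.layers.0.self_attn.v_proj.weight": (3072, 4096),
    "model.layers.1.self_attn.q_proj.weight": (2560, 4096),
    "model.layers.1.self_attn.k_proj.weight": (2560, 4096),
    "model.layers.1.self_attn.v_proj.weight": (2560, 4096),
}

per_layer = collect_qkv_shapes(example_shapes)
for layer, projs in per_layer.items():
    print(layer, projs)
print("uniform across layers:", is_uniform(per_layer))
```

In this example the two layers have different projection widths, which is why a single set of config values cannot describe the whole model.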

Thanks!

May 14 '24 08:05 qianjyM
