wejoncy

Results: 22 comments by wejoncy

> This seems to be an issue with the quantized model, looks like one of (or all) the layers doesn't have a config defined for it. Maybe [@wejoncy](https://github.com/wejoncy) has an...

Hi @AlpinDale @ZanePoe, for now a quantized model requires `prefix` to indicate which layer it is, just as vLLM/sglang do: https://github.com/vllm-project/vllm/blob/54a8804455a14234ba246f7cbaf29fb5e8587d64/vllm/model_executor/models/qwen2.py#L80C1-L81C1. Seems like we might need a PR to support str[`prefix`]...
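To illustrate why `prefix` matters here, below is a minimal sketch of the general pattern, not the actual code of vLLM, sglang, or any quantization library. Every class and field name in it (`QuantConfig`, `LayerQuantConfig`, `Linear`, `MLP`, `bits`, `group_size`) is hypothetical. The idea is that each parent module passes its own name down, so every layer ends up with a fully qualified name it can use to look up its own quantization settings. A layer built without a `prefix` has nothing to look up, which would match the "no config defined" symptom described in the quote.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LayerQuantConfig:
    # Hypothetical per-layer settings; a real quantizer carries more fields.
    bits: int
    group_size: int


@dataclass
class QuantConfig:
    # Maps fully qualified layer names to their settings (assumed layout).
    per_layer: Dict[str, LayerQuantConfig] = field(default_factory=dict)

    def get(self, prefix: str) -> Optional[LayerQuantConfig]:
        return self.per_layer.get(prefix)


class Linear:
    def __init__(self, quant_config: QuantConfig, prefix: str = ""):
        self.prefix = prefix
        # The layer identifies itself by prefix; without it the lookup fails.
        self.layer_config = quant_config.get(prefix)
        if self.layer_config is None:
            raise KeyError(f"no quantization config for layer {prefix!r}")


class MLP:
    def __init__(self, quant_config: QuantConfig, prefix: str = ""):
        # Each child extends the parent's prefix with its own attribute name.
        self.gate_up_proj = Linear(quant_config, prefix=f"{prefix}.gate_up_proj")
        self.down_proj = Linear(quant_config, prefix=f"{prefix}.down_proj")


config = QuantConfig({
    "model.layers.0.mlp.gate_up_proj": LayerQuantConfig(bits=4, group_size=128),
    "model.layers.0.mlp.down_proj": LayerQuantConfig(bits=8, group_size=64),
})
mlp = MLP(config, prefix="model.layers.0.mlp")
```

In this sketch, constructing `Linear(config)` with the default empty prefix raises `KeyError`, because no entry in the mapping matches an empty name.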