jxt1234
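What are the contents of the model's config.json?
Fixed in aeac75acbf7b3f6be8272dde61f21eee1c9928f7. For now you need to update and rebuild MNN.
MNN's resize is implemented on the NC4HW4 layout, so it implicitly converts the layout of the input and output. The conversion probably accounts for a large share of the time.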
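To make the layout conversion concrete, here is a minimal pure-Python sketch of packing a flat NCHW buffer into NC4HW4 (channels grouped in fours and padded with zeros) and unpacking it again. It assumes the commonly described NC4HW4 index order (N, C/4, H, W, 4). The function names are hypothetical and this is not MNN's actual code, which is optimized C++.

```python
def nchw_to_nc4hw4(data, n, c, h, w):
    """Pack a flat NCHW list into a flat NC4HW4 list (hypothetical helper)."""
    c4 = (c + 3) // 4  # number of 4-channel groups, rounded up
    out = [0.0] * (n * c4 * h * w * 4)  # padding channels stay zero
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    dst = (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4
                    out[dst] = data[src]
    return out


def nc4hw4_to_nchw(packed, n, c, h, w):
    """Inverse of nchw_to_nc4hw4: drop the padding and restore NCHW order."""
    c4 = (c + 3) // 4
    out = [0.0] * (n * c * h * w)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4
                    dst = ((ni * c + ci) * h + hi) * w + wi
                    out[dst] = packed[src]
    return out
```

Both directions touch every element once, so an operator that converts on the way in and on the way out pays for two extra full passes over the tensor on top of the resize itself.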
> Well, regardless of the Gemma 3 version chosen (including non-QAT one), llmexport.py responds with `AttributeError: 'NoneType' object has no attribute 'weight'`. Which kind of gemma 3 model are you...
See https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html , or run python3 llmexport -h to see the quant options. ``` --awq Whether or not to use awq quant. --sym Whether or not to using symmetric...
MNN's quantization has many more degrees of freedom than llama.cpp's. Normally, weight quantization with block=64 has the same precision as Q4_1, while MNN's block size of 32 has higher precision than Q4_1 and...
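As a rough illustration of why the block size matters, the sketch below does generic asymmetric min/max 4-bit quantization per block and reports the worst-case rounding bound (half a quantization step). It is not MNN's quantizer and does not reproduce llama.cpp's Q4_1 storage format; `quantize_blockwise` and its parameters are hypothetical names chosen for this example.

```python
import random


def quantize_blockwise(weights, block, bits=4):
    """Quantize and dequantize per block; return (dequantized, worst_half_step)."""
    levels = (1 << bits) - 1  # 15 steps for 4 bits
    deq, worst = [], 0.0
    for start in range(0, len(weights), block):
        chunk = weights[start:start + block]
        lo, hi = min(chunk), max(chunk)
        step = (hi - lo) / levels
        scale = step if step > 0 else 1.0  # avoid dividing by zero on flat blocks
        worst = max(worst, step / 2)
        for v in chunk:
            q = min(levels, max(0, round((v - lo) / scale)))
            deq.append(lo + q * scale)
    return deq, worst


rng = random.Random(0)
w = [rng.gauss(0.0, 1.0) for _ in range(256)]
for block in (64, 32):
    deq, bound = quantize_blockwise(w, block)
    err = max(abs(a - b) for a, b in zip(w, deq))
    print(f"block={block} max_err={err:.4f} bound={bound:.4f}")
```

Each aligned 32-element block is a subset of a 64-element block, so its min/max range, and therefore its step size, can only be the same or smaller. The price is storing twice as many scale and offset values.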
This is being implemented and is expected to be supported this month.
It may be a Vulkan driver issue.
Great work. We will take a look at it later.