jxt1234 comments

Results 338 comments of jxt1234

Qwen2.5-Omni-3B on Xiaomi 14 replies with endless exclamation marks

Fixed in aeac75acbf7b3f6be8272dde61f21eee1c9928f7. For now you need to update and rebuild MNN.

Make a wish for supporting Qwen2.5-Omni

In progress.

Is there a performance problem with resize under CUDA, or is my usage incorrect?

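MNN's resize is implemented on the NC4HW4 layout, so it implicitly converts the layout of the input and output. The conversion time probably accounts for a large share of the total.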
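To illustrate why such a conversion is not free, here is a minimal pure-Python sketch of packing an NCHW tensor into NC4HW4. It assumes the usual description of NC4HW4 (channels grouped in blocks of 4, zero-padded, stored as [N, ceil(C/4), H, W, 4]); the function names are hypothetical and this is not MNN's CUDA implementation, only a model of the data movement involved.

```python
import math

def pack_nchw_to_nc4hw4(data, n, c, h, w):
    """Pack a flat NCHW list into a flat NC4HW4 list.

    Assumed layout: [N, ceil(C/4), H, W, 4], with missing channels zero-padded.
    """
    c4 = math.ceil(c / 4)
    out = [0.0] * (n * c4 * h * w * 4)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    # Channel ci lands in block ci // 4, lane ci % 4.
                    dst = (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4
                    out[dst] = data[src]
    return out

def unpack_nc4hw4_to_nchw(packed, n, c, h, w):
    """Inverse of pack_nchw_to_nc4hw4; drops the zero padding."""
    c4 = math.ceil(c / 4)
    out = [0.0] * (n * c * h * w)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4
                    dst = ((ni * c + ci) * h + hi) * w + wi
                    out[dst] = packed[src]
    return out

# Example: 1 x 5 x 1 x 2 tensor, so the second channel block is mostly padding.
tensor = [float(i) for i in range(10)]
packed = pack_nchw_to_nc4hw4(tensor, 1, 5, 1, 2)
restored = unpack_nc4hw4_to_nchw(packed, 1, 5, 1, 2)
```

Each direction touches every element once, so a resize that runs between two such conversions pays for two extra passes over the tensor in addition to the resize itself.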

[Question/Improvement] Specify the quantization method applied by llmexport.py and MNNConvert

> Well, regardless of the Gemma 3 version chosen (including non-QAT one), llmexport.py responds with `AttributeError: 'NoneType' object has no attribute 'weight'`. Which kind of gemma 3 model are you...

[Question/Improvement] Specify the quantization method applied by llmexport.py and MNNConvert

See https://mnn-docs.readthedocs.io/en/latest/transformers/llm.html, or run `python3 llmexport.py -h` to see the quant options.
```
--awq  Whether or not to use awq quant.
--sym  Whether or not to using symmetric...
```

[Question/Improvement] Specify the quantization method applied by llmexport.py and MNNConvert

MNN's quantization has many more degrees of freedom than llama.cpp's. Normally, weight quantization with block=64 has the same precision as Q4_1, while MNN's block=32 has higher precision than Q4_1 and...
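The effect of block size on precision can be sketched with a generic block-wise 4-bit quantizer. This is an assumption-laden illustration, not MNN's or llama.cpp's actual code: it uses plain asymmetric min/max quantization per block, a hypothetical helper name `quantize_blockwise`, and synthetic Gaussian weights.

```python
import random

def quantize_blockwise(weights, block_size, bits=4):
    """Quantize and dequantize `weights` with per-block asymmetric min/max scaling.

    Illustrative only: each block stores its own minimum and scale, and every
    weight in the block is rounded to one of 2**bits levels.
    """
    levels = (1 << bits) - 1
    out = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        lo, hi = min(block), max(block)
        scale = (hi - lo) / levels if hi > lo else 1.0
        for w in block:
            q = round((w - lo) / scale)      # integer code
            q = max(0, min(levels, q))       # clamp to [0, levels]
            out.append(lo + q * scale)       # dequantized value
    return out

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(8192)]  # synthetic weights
err64 = mse(weights, quantize_blockwise(weights, 64))
err32 = mse(weights, quantize_blockwise(weights, 32))
print(f"block=64 MSE: {err64:.6f}")
print(f"block=32 MSE: {err32:.6f}")
```

Smaller blocks tend to cover a narrower value range, so the same 16 levels are spaced more finely and the reconstruction error drops, at the cost of storing more per-block scales and minimums.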

Will MNN Chat app support "Speculative Decoding"?

It is being implemented; support is expected this month.

GPU acceleration fails to enable when using Vulkan on Linux

It may be a problem with the Vulkan driver.

rust wrapper over mnn

Great work. We will see it later.