Qwen3 0.6B loops with default settings

Open iAdanos opened this issue 8 months ago • 12 comments

Android app: Qwen3 0.6B loops in 100 percent of cases for me.

iAdanos avatar May 09 '25 22:05 iAdanos

Is your sampler set to greedy? The model has changed, so if you downloaded it earlier, the sampler settings may not be right. Re-downloading the model or changing the sampler setting to "mixed" should fix the problem.

Screenshot_2025-05-10-07-49-45-10_3b5e1a6b5f5bfd395bd36f2cf39e76d0.jpg

Screenshot_2025-05-10-07-49-51-14_3b5e1a6b5f5bfd395bd36f2cf39e76d0.jpg

Juude avatar May 09 '25 23:05 Juude

Image

Image

No, sampler was "mixed" already.

Re-downloading the model and setting the options to match the ones you provided didn't help.

I also tried playing with Min-P and Top-K, but it didn't help either.

I am testing the model with the default (empty?) system prompt and the input `Как варить пельмени?` ("How do you cook pelmeni?").
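For readers unfamiliar with these knobs, here is a minimal sketch of how Top-K and Min-P filtering are commonly defined. It is not taken from the MNN source; the function names, the default values, and the order in which the two filters are applied are assumptions made for illustration, and how the app's "mixed" sampler actually combines them is not stated in this thread.

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_min_p_candidates(logits, top_k=40, min_p=0.05):
    """Hypothetical helper: return {token_id: prob} for tokens that survive both filters.

    Top-K keeps the k most probable tokens.
    Min-P then drops tokens whose probability is below min_p * (probability of the best token).
    The surviving probabilities are renormalized so they sum to 1.
    """
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = ranked[:top_k]
    threshold = min_p * probs[ranked[0]]
    kept = [i for i in kept if probs[i] >= threshold]
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Toy vocabulary of four tokens.
candidates = top_k_min_p_candidates([3.0, 2.5, 0.0, -4.0], top_k=3, min_p=0.1)
```

A greedy sampler would instead always pick the single most probable token, which is generally more prone to repetition than sampling from a filtered set like this one.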

iAdanos avatar May 10 '25 05:05 iAdanos

Screenshot_2025-05-10-16-13-15-98.jpg

The output is very long, but in the end it seems to be correct.

Juude avatar May 10 '25 08:05 Juude

Very strange.

On my device (Samsung S23 Ultra) it still loops, even after updating the app and re-downloading the model:

Image

Image

Image

Image

Image

JFYI, I stopped the looping manually.

iAdanos avatar May 11 '25 08:05 iAdanos

Maybe switching to high inference precision could fix the problem. I’ll probably add that setting in the next release.

Juude avatar May 14 '25 12:05 Juude

Changing Diffusion Memory Mode and updating the app again didn't change anything, BTW.

iAdanos avatar May 14 '25 12:05 iAdanos

> Maybe switching to high inference precision could fix the problem. I’ll probably add that setting in the next release.

Also try setting the sampler to "penalty" with the Penalty value set to 1.2.
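As a rough illustration of what a penalty value of 1.2 typically does, the sketch below applies the widely used repetition-penalty rule to a list of logits. This is an assumption about the general technique, not MNN's actual implementation; the function name and the toy values are hypothetical.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Return a copy of logits in which already-generated tokens are made less likely.

    Common formulation (an assumption here, not confirmed for MNN):
    positive logits are divided by the penalty, negative logits are multiplied by it,
    so in both cases the token's score moves down.
    """
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out

# Toy example: tokens 0 and 2 have already appeared in the output.
logits = [2.4, 1.2, -0.6, 0.5]
penalized = apply_repetition_penalty(logits, generated_ids=[0, 2, 0], penalty=1.2)
```

With a penalty of 1.0 the logits are unchanged; larger values discourage repeats more strongly, which is why this setting can help a model break out of a loop.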

Juude avatar May 14 '25 12:05 Juude

> Changing Diffusion Memory Mode and updating the app again didn't change anything, BTW.

Diffusion Memory Mode only affects Stable Diffusion models.

Juude avatar May 14 '25 12:05 Juude

> Maybe switching to high inference precision could fix the problem. I’ll probably add that setting in the next release.

How should I switch it?

iAdanos avatar May 14 '25 18:05 iAdanos

> Maybe switching to high inference precision could fix the problem. I’ll probably add that setting in the next release.
>
> Also try setting the sampler to "penalty" with the Penalty value set to 1.2.

The penalty sampler changed the picture a lot:

Image

Without OpenCL, the answer is much longer; it started to loop but managed to break out:

Image

Image

With OpenCL, the answer is much shorter and generation is much slower, but it managed not to loop on the first try (it would be great to have a user guide explaining how such options change behavior):

Image

Image

iAdanos avatar May 14 '25 18:05 iAdanos

For follow-up messages with OpenCL enabled, it starts thinking but does not produce any "out loud" (non-thinking) output.

iAdanos avatar May 14 '25 19:05 iAdanos

@wangzhaode please explain the different behavior of OpenCL and CPU.

Juude avatar May 15 '25 15:05 Juude

Marking as stale. No activity in 60 days.

github-actions[bot] avatar Jul 15 '25 10:07 github-actions[bot]