Kingsley comments


Low GPU utilization when training Qwen3-VL-8B and 4B

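Everyone is welcome to share their hardware, environment settings, and dataset characteristics here so we can see where the problem lies. We did not observe any particularly serious speed issues in our own testing.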

Low GPU utilization when training Qwen3-VL-8B and 4B

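> Installing the llama-factory requirements gave me the same slow-training problem. After reinstalling the official QwenVL environment, training became much faster and GPU utilization is above 90%.

Is that Qwen's official finetune code?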

After training with qwen3_nothink, the model still occasionally outputs think

Did you use `qwen3_nothink` to train Qwen3 rather than Qwen3-Instruct? If so, this is expected.

After updating to the latest version, the webui no longer has a gemma-3 option

> > > ### Reminder
> > >
> > > * [x] I have read the above rules and searched the existing issues.
> > > ...

Support Pixtral-12B

> If you eval with FA2 enabled during training, it errors with
>
> ValueError: You are attempting to perform batched generation with padding_side='right' this may lead to unexpected...
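The error quoted above concerns the padding side used for batched generation. As a rough illustration only (plain Python, not code from LLaMA-Factory or transformers; `PAD_ID`, `pad_batch`, and `last_column` are hypothetical names made up for this sketch), the difference between the two padding sides looks like this:

```python
from typing import List

PAD_ID = 0  # hypothetical pad token id, chosen only for this illustration


def pad_batch(sequences: List[List[int]], side: str = "left",
              pad_id: int = PAD_ID) -> List[List[int]]:
    """Pad token-id lists to a common length on the given side."""
    if side not in ("left", "right"):
        raise ValueError(f"side must be 'left' or 'right', got {side!r}")
    max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        pad = [pad_id] * (max_len - len(seq))
        # Left padding keeps the real tokens flush with the end of the row.
        padded.append(pad + seq if side == "left" else seq + pad)
    return padded


def last_column(batch: List[List[int]]) -> List[int]:
    """Return the final token of every row, where generation would continue."""
    return [row[-1] for row in batch]


prompts = [[11, 12, 13], [21]]
left = pad_batch(prompts, side="left")
right = pad_batch(prompts, side="right")

print(last_column(left))   # every row ends in a real prompt token
print(last_column(right))  # the shorter row ends in a pad token
```

With right padding, the shorter prompt ends in pad tokens, so a decoder-only model would continue from padding rather than from the prompt. With a Hugging Face tokenizer the corresponding setting is `tokenizer.padding_side = "left"`; whether that alone resolves the error in the setup quoted above is not confirmed in this thread.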

[scripts] add vllm judge script

I am not familiar with **LLM-judge**, but maybe this [template](https://github.com/open-compass/opencompass/blob/c3779ebfc14d872ca726c81e26fc7d2a813731d6/examples/eval_llm_judge.py#L42C1-L65C12) from _OpenCompass_ can also be used as a reference.

[scripts] add vllm judge script

> Thanks for the suggestion! Should I update this script to support a more general ACC-style evaluation (e.g., by counting Correct/Incorrect cases)?

I think it is okay to have a...
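For reference, an ACC-style evaluation of the kind mentioned in the quote could be as small as the sketch below. It is not the script from this PR: `judge_accuracy` is a hypothetical helper, and the verdict strings "Correct" and "Incorrect" are assumed to be what the judge model emits.

```python
from collections import Counter
from typing import Iterable


def judge_accuracy(verdicts: Iterable[str]) -> float:
    """Return the share of 'Correct' verdicts among Correct/Incorrect ones.

    Verdicts are compared case-insensitively after stripping whitespace.
    Anything that is neither 'correct' nor 'incorrect' is ignored
    (an assumption of this sketch, not a rule from the thread).
    """
    counts = Counter(v.strip().lower() for v in verdicts)
    correct = counts["correct"]
    incorrect = counts["incorrect"]
    total = correct + incorrect
    if total == 0:
        raise ValueError("no Correct/Incorrect verdicts to score")
    return correct / total


sample = ["Correct", "Incorrect", "correct ", "Correct", "unparseable"]
print(judge_accuracy(sample))
```

Ignoring unparseable verdicts keeps the denominator limited to cases the judge actually decided; counting them as incorrect would be an equally reasonable choice.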