verl qwen3-vl-8b training

System Info

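With the same training configuration, the two models behave very differently: qwen3-vl-8b fluctuates frequently. The verl repo code I'm using is from early October, and I'm not sure whether the framework is the cause.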

Information

[ ] The official example scripts
[x] My own modified scripts

Tasks

[x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

nope

Expected behavior

I plan to update the framework code and try again.

Nov 19 '25 12:11 SupreCyk

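A question for the experts: does the current version support the qwen3-vl architecture? The version I'm using may be a bit old and doesn't support it.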
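
For anyone asking the same question, a quick way to see what is actually installed is to print the package versions from the training environment. This is only a minimal sketch: the `transformers` minimum of 4.57.0 is an assumption about when Qwen3-VL landed and should be checked against the model card, and the helper names (`parse_version`, `meets_minimum`, `report`) are made up for this example. No minimum is assumed for verl itself.

```python
from importlib import metadata

# ASSUMPTION: minimum transformers release with Qwen3-VL support.
# Verify against the model card before relying on it.
ASSUMED_TRANSFORMERS_MIN = "4.57.0"


def parse_version(text):
    """Turn '4.57.0.dev0' into (4, 57, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def report(package, minimum=None):
    """Describe the installed version of `package`, optionally against a minimum."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    if minimum is None:
        return f"{package}: {installed}"
    status = "ok" if meets_minimum(installed, minimum) else "too old"
    return f"{package}: {installed} ({status}, assumed minimum {minimum})"


if __name__ == "__main__":
    print(report("verl"))
    print(report("transformers", ASSUMED_TRANSFORMERS_MIN))
    print(report("vllm"))
    print(report("torch"))
```

Pasting this output into the issue would also make it easier for others to compare environments.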

Nov 21 '25 03:11 GmailF

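My verl is currently on branch V0.6.1. I went through the installation guide again and it runs, but qwen3-vl-4b and 8b still keep jittering at a high level like this.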
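
To make "jittering" comparable between the two models, one option is to put a number on it with a rolling standard deviation over the logged metric. A minimal sketch, assuming the per-step values have already been exported to a plain Python list; which metric is fluctuating is not stated in this thread, and the function names and the window size are arbitrary choices for illustration.

```python
from statistics import mean, pstdev


def rolling_fluctuation(values, window=20):
    """Population std of every full trailing window of `window` steps."""
    if window < 2:
        raise ValueError("window must be at least 2")
    return [pstdev(values[i - window:i]) for i in range(window, len(values) + 1)]


def summarize(values, window=20):
    """Average level of the metric plus how much it moves inside each window."""
    stds = rolling_fluctuation(values, window)
    return {
        "mean_value": mean(values),
        "mean_rolling_std": mean(stds) if stds else float("nan"),
        "max_rolling_std": max(stds) if stds else float("nan"),
    }


if __name__ == "__main__":
    # Synthetic placeholder curves, not real training logs.
    steady = [0.5] * 10
    jittery = [0.4, 0.6] * 5
    print("steady ", summarize(steady, window=4))
    print("jittery", summarize(jittery, window=4))
```

Running `summarize` on the same metric from both runs with the same window gives two numbers that can be posted side by side instead of a verbal description.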

Nov 21 '25 04:11 SupreCyk

Has this been resolved yet? What was the problem?

Nov 25 '25 14:11 bug-to-share

I'm not sure whether FSDP training simply isn't supported for it yet. I took https://github.com/volcengine/verl/blob/main/examples/grpo_trainer/run_qwen2_5_vl-7b.sh as-is, swapped in the model, and ran it.

Nov 26 '25 12:11 SupreCyk