Chunjiang Ge (葛春江)

https://john-ge.github.io/

Tsinghua University, Beijing. Ph.D. student at Tsinghua University. A member of @LeapLabTHU. Research focus: LMM, AI agents, and embodied AI.

Results: 39 comments by Chunjiang Ge (葛春江)

unable to guess the VisDA-2017 dataset structure for semantic segmentation task

Our project only supports classification now.

unable to guess the VisDA-2017 dataset structure for semantic segmentation task

Some papers that build on our work may have developed methods for segmentation. You could check those.

model config and generation config are not aligned

This warning does not affect performance. Whether the extra special tokens you added affect performance depends on how you handle them.

How to train with image frames from a video

If you set fps for training, the video is read directly and frames are sampled from it for training.
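A minimal sketch of what fps-based frame sampling can look like. This is not the project's actual loader; the function name `sample_frame_indices` and its parameters are hypothetical, and the evenly spaced sampling strategy is an assumption.

```python
def sample_frame_indices(total_frames: int, video_fps: float, target_fps: float) -> list[int]:
    """Pick evenly spaced frame indices so the clip is sampled at roughly target_fps.

    Hypothetical helper for illustration; a real loader may round differently.
    """
    if total_frames <= 0 or video_fps <= 0 or target_fps <= 0:
        raise ValueError("total_frames, video_fps and target_fps must be positive")
    # Never sample faster than the video's native frame rate.
    effective_fps = min(target_fps, video_fps)
    # Distance (in source frames) between two sampled frames.
    step = video_fps / effective_fps
    # Number of frames to keep, at least one.
    count = max(1, int(total_frames / step))
    return [min(int(i * step), total_frames - 1) for i in range(count)]


if __name__ == "__main__":
    # A 10-second clip at 30 fps sampled at 2 fps keeps every 15th frame.
    print(sample_frame_indices(300, 30.0, 2.0))
```

A lower fps keeps fewer frames per video, which also shortens the visual token sequence per sample.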

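How do I merge a LoRA model fine-tuned from Qwen3-VL? Previously, after LoRA fine-tuning with llama-factory, the adapter was always merged with the base model. How should the LoRA model from this official fine-tuning be merged? Or does it not need to be merged? Any guidance is appreciated.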

Our LoRA is the one supported by peft, so you can merge it in the way described above.
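As a rough illustration of what merging a LoRA adapter means, the sketch below folds the low-rank update into the base weight, W_merged = W + (alpha / r) * B A, using plain Python lists. The helper names are hypothetical. `merge_with_peft` shows the usual peft route (`PeftModel.from_pretrained` followed by `merge_and_unload`); it assumes an already loaded base model and a local adapter directory, and is not necessarily the exact procedure referred to in the thread.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold a LoRA update into a base weight: W + (alpha / rank) * B @ A.

    weight: out x in, lora_b: out x rank, lora_a: rank x in.
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


def merge_with_peft(base_model, adapter_dir):
    """Merge a peft LoRA adapter into an already loaded base model.

    Assumes peft is installed; adapter_dir is a hypothetical local path.
    """
    from peft import PeftModel  # imported lazily so the sketch runs without peft

    model = PeftModel.from_pretrained(base_model, adapter_dir)
    return model.merge_and_unload()


if __name__ == "__main__":
    w = [[1, 0], [0, 1]]   # base weight (2 x 2)
    a = [[1, 2]]           # LoRA A (rank 1 x 2)
    b = [[1], [3]]         # LoRA B (2 x rank 1)
    print(merge_lora(w, a, b, alpha=2, rank=1))
```

After merging, the model can be saved and loaded like an ordinary checkpoint without the adapter files.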

How to handle bbox when fine-tuning qwen3-vl?

3vl uses relative coordinates; you can refer to how 2vl handles them.
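A small sketch of converting pixel boxes to relative coordinates and back. The 0 to 1000 scale is an assumption here (the comment only says "relative"), and the function names are hypothetical; check the data format in the repo before relying on it.

```python
def to_relative_bbox(bbox, width, height, scale=1000):
    """Convert an absolute pixel box [x1, y1, x2, y2] to relative coordinates.

    scale=1000 is an assumed normalization range, not confirmed by the thread.
    """
    x1, y1, x2, y2 = bbox
    return [
        round(x1 * scale / width),
        round(y1 * scale / height),
        round(x2 * scale / width),
        round(y2 * scale / height),
    ]


def to_absolute_bbox(bbox, width, height, scale=1000):
    """Map a relative box back to pixel coordinates for a given image size."""
    x1, y1, x2, y2 = bbox
    return [
        round(x1 * width / scale),
        round(y1 * height / scale),
        round(x2 * width / scale),
        round(y2 * height / scale),
    ]


if __name__ == "__main__":
    rel = to_relative_bbox([64, 48, 320, 240], width=640, height=480)
    print(rel)
    print(to_absolute_bbox(rel, width=640, height=480))
```

Relative coordinates stay valid when the image is resized during preprocessing, which is why the image size only matters when mapping predictions back to pixels.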

Multi node training

You could speed up training by setting:
```bash
--data_flatten True \
```
And reduce the max pixels.

Configs for training VIT architecture

Training with ViT-B should use AdamW as the optimizer. You could try some different lr settings.

How to finetune the 8B model with full parameters on 4*40G A100

The 8B model can run with ZeRO-3.

How to finetune the 8B model with full parameters on 4*40G A100

Try reducing the model max length to 4k.
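A back-of-the-envelope estimate of why this setup is tight. It assumes exactly 8e9 parameters and the usual mixed-precision Adam accounting of 16 bytes per parameter (2 for half-precision weights, 2 for gradients, 12 for fp32 master weights, momentum, and variance), fully sharded across GPUs by ZeRO-3. Activations and buffers are not included, and the function name is hypothetical.

```python
def zero3_state_gb_per_gpu(num_params: float, num_gpus: int, bytes_per_param: int = 16) -> float:
    """Estimate per-GPU memory (in GB) for weights, gradients and optimizer states.

    Assumes ZeRO-3 shards all three evenly; activations are NOT counted.
    """
    total_bytes = num_params * bytes_per_param
    return total_bytes / num_gpus / 1e9


if __name__ == "__main__":
    per_gpu = zero3_state_gb_per_gpu(8e9, num_gpus=4)
    print(f"model states per GPU: {per_gpu:.1f} GB of 40 GB")
```

Under these assumptions roughly 32 GB of each 40 GB card goes to model states, leaving only a few GB for activations. Activation memory grows with sequence length, which is consistent with the suggestion to reduce the max length.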
