Chunjiang Ge (葛春江)
Chunjiang Ge (葛春江)
Our project only supports classification now.
Maybe some papers following our work develop method for segmentation. You could check it.
出现这个提醒不影响性能。 你加了额外的special token对性能的影响取决于你的处理
训练的时候设定fps可以直接读取视频抽帧训练。
我们是 peft 支持的 lora,可以按照上面的方式合并
3vl 用的是相对坐标,可以参考 2vl 的
You could speed up training by setting: ```bash --data_flatten True \ ``` And reduce the max pixels.
Training with ViT-B should use AdamW as the optimizer. You could try some different lr settings.
8B model, zero3 could run.
Try to reduce model max length to 4k.