Yu-won Lee

Results: 230 comments of Yu-won Lee

I haven't tested with the base model. I don't think it will be that different, but I'm not sure. I'll check the template and hyperparameters for it.

Sorry for the inconvenience. I think some weights are being merged or loaded incorrectly. I'll look into the problem.

Yes, sure. If you make a PR, I'll review it.

Yes, it uses the technique. Since this is just a fine-tuning project, I haven't modified the logic itself. Maybe you could just load the vision part of the model...
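
For reference, loading only the vision part could look roughly like this. This is a minimal sketch assuming the Hugging Face Qwen2-VL implementation, which exposes the vision encoder as `model.visual`; the checkpoint path is hypothetical and the attribute name may differ in other versions.

```python
import torch
from transformers import Qwen2VLForConditionalGeneration

# Load the full checkpoint, then keep only the vision tower.
# Assumes the HF Qwen2-VL implementation exposes the vision encoder
# as `model.visual`; adjust the attribute name if your version differs.
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "path/to/finetuned-checkpoint",  # hypothetical path
    torch_dtype=torch.bfloat16,
)
vision_tower = model.visual

# Save just the vision weights for reuse elsewhere.
torch.save(vision_tower.state_dict(), "vision_tower.pt")
```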

@Mike-ihr The repository only includes fine-tuning code and doesn’t provide code to construct the model from scratch. Based on the paper, the authors performed end-to-end contrastive pretraining. A practical approach...
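
For context on what end-to-end contrastive pretraining usually involves, here is a minimal sketch of a symmetric CLIP-style InfoNCE loss in PyTorch. This is not the paper's exact recipe; the embedding shapes, temperature, and batch construction are assumptions.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature        # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)            # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)        # text -> image direction
    return (loss_i2t + loss_t2i) / 2
```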

I've updated the code to save checkpoints only on the main rank, so the non-saving ranks are simply skipped.
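
The idea in plain `torch.distributed` looks roughly like this; the helper name is hypothetical and the repo's actual saving path may go through the HF Trainer or DeepSpeed utilities instead.

```python
import torch.distributed as dist

def save_checkpoint_main_rank_only(model, output_dir: str) -> None:
    """Save on rank 0 only; every other rank just waits at the barrier."""
    # Hypothetical helper sketching the idea, not the repo's exact code.
    if not dist.is_initialized() or dist.get_rank() == 0:
        model.save_pretrained(output_dir)
    if dist.is_initialized():
        dist.barrier()  # keep ranks in sync so no one races ahead
```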

I haven't tried it, but I don't think it would work, because I save the non-LoRA weights in a different format. The best way is to merge the LoRA weights and make...
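
Merging the LoRA weights with PEFT looks roughly like this; the paths are hypothetical, and note this only covers the standard LoRA adapter, so any non-LoRA weights saved separately would still need to be loaded back on top of the merged model.

```python
from peft import PeftModel
from transformers import Qwen2VLForConditionalGeneration

# Hypothetical paths; assumes the adapter was saved in the standard PEFT format.
base = Qwen2VLForConditionalGeneration.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

merged = model.merge_and_unload()              # fold the LoRA deltas into the base weights
merged.save_pretrained("path/to/merged-model")
```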

It seems the settings for inference and training are different. Check the patch size and the other image-processing settings.
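
One thing worth checking is that the processor is created with the same image-resolution settings at inference as during training. A rough sketch for Qwen2-VL, with placeholder values:

```python
from transformers import AutoProcessor

# Placeholder values; the point is that the resolution bounds used at
# inference should match the ones used during training.
processor = AutoProcessor.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    min_pixels=256 * 28 * 28,
    max_pixels=1280 * 28 * 28,
)
```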

Because, as you can see here, SFT applies the resize in the dataset itself. https://github.com/2U1/Qwen-VL-Series-Finetune/blob/93fef86285e442b1ed80a5c6f2724e6e06d3cde5/src/dataset/sft_dataset.py#L191-L200 However, in GRPO this step is handled in the trainer, so I've set the option...
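
The dataset-side resize is essentially something like the sketch below; this is a hypothetical helper for illustration, and the repo's actual logic at the linked `sft_dataset.py` lines may pick the target size differently.

```python
from PIL import Image

def load_and_resize(image_path: str, max_side: int = 1024) -> Image.Image:
    """Cap the longer side of the image before the processor sees it."""
    # Hypothetical helper; see the linked sft_dataset.py lines for the real logic.
    image = Image.open(image_path).convert("RGB")
    if max(image.size) > max_side:
        scale = max_side / max(image.size)
        new_size = (int(image.width * scale), int(image.height * scale))
        image = image.resize(new_size, Image.Resampling.LANCZOS)
    return image
```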

Sorry for the issue. I haven't been keeping the serving file up to date, so I need to figure out what's going wrong. I'll look into it.