zhangfaen

9 comments of zhangfaen

@pGit1 see https://github.com/tatsu-lab/stanford_alpaca/blob/eb5b171d9b103a12a8e14e0edca9cbc45fe1d512/train.py#L131

```
labels = copy.deepcopy(input_ids)
for label, source_len in zip(labels, sources_tokenized["input_ids_lens"]):
    label[:source_len] = IGNORE_INDEX
```

`label` is not equal to `input_id`, because `label[0:source_len]` is set to IGNORE_INDEX (source is "instruction +...
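A minimal sketch of the masking idea behind that snippet, using plain Python lists instead of tensors (the name IGNORE_INDEX and the example token values are illustrative; -100 is the default ignore target of PyTorch's cross-entropy loss):

```python
import copy

IGNORE_INDEX = -100  # target value skipped by the loss function

def mask_source_tokens(input_ids, source_lens):
    """Copy input_ids into labels, then mask the prompt (source) part
    so the loss is computed only on the response tokens."""
    labels = copy.deepcopy(input_ids)
    for label, source_len in zip(labels, source_lens):
        # with plain lists, slice assignment needs a list on the right
        label[:source_len] = [IGNORE_INDEX] * source_len
    return labels

# one 5-token sequence whose first 3 tokens are the prompt
print(mask_source_tokens([[11, 12, 13, 14, 15]], [3]))
# [[-100, -100, -100, 14, 15]]
```

Because `labels` starts as a deep copy, `input_ids` itself is left untouched; only the label copy gets the ignore markers.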

https://github.com/zhangfaen/finetune-Qwen2-VL has 50+ stars and many reposts on Twitter. I think it is a good thing to let more people know about and use QwenLM/Qwen2-VL. Would you please merge this...

Thank you, deeksha! I hope this PR will be merged into the main branch soon.

> Do you have a plan to add continued pretraining support?

I think fine-tuning itself is a kind of continued pretraining.

Below is from the Qwen2-VL tech report:

```
Following Qwen-VL (Bai et al., 2023b), we adopt a three-stage training methodology. In the first stage, we focus exclusively on training the Vision...
```

> Thank you for your support of Qwen2_VL. The main reason for the poor fine-tuning grounding performance is that the 1D RoPE position embedding, rather than MRoPE, was used during...

https://github.com/zhangfaen/finetune-Qwen2.5-VL