JJJYmmm
I tried to fix test.py, which is mentioned in #7
Hi Shariatnia, thanks for your tutorial! I have a question about the variable `max_len`. I first see `max_len` in the `Tokenizer` class; I think its role is to limit the maximum...
There seems to be a problem in train.py:

```python
total_steps = (len(trainloader) // args.batch_size + 1) * args.epoches
```

`len(train_loader)` is already the number of batches, i.e. it is already divided by the batch size. Change it to `total_steps =...
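A minimal arithmetic sketch of the bug (with assumed numbers, not the repository's actual dataset): a PyTorch `DataLoader`'s length is the number of batches, so dividing by the batch size a second time undercounts the total optimizer steps.

```python
import math

# Assumed example values for illustration only
num_samples = 100
batch_size = 8
epochs = 3

# What len(train_loader) returns: the number of batches, not samples
num_batches = math.ceil(num_samples / batch_size)  # 13

# Buggy version from train.py: divides by batch_size a second time
buggy_total_steps = (num_batches // batch_size + 1) * epochs  # 6

# Intended count: one optimizer step per batch per epoch
total_steps = num_batches * epochs  # 39
```

With these numbers, the buggy formula would configure a scheduler for 6 steps when training actually runs 39.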
Fix variable naming errors, embedding_dim and n_embed
Since the transformer takes in the quantized image tokens generated by VQGAN, whose codebook has indices (0~n_embed-1), and the transformer's sos token is also set to zero by default. Could you tell...
In the original paper, the TF is defined as below. But in the code, the denominator seems to be ignored. https://github.com/tylin/coco-caption/blob/3a9afb2682141a03e1cdc02b0df6770d2c884f6f/pycocoevalcap/cider/cider_scorer.py#L124
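For context, the TF-IDF weighting from the CIDEr paper is, as I recall it (reproduced here from memory, so please check against the paper):

```latex
g_k(s_{ij}) \;=\;
\underbrace{\frac{h_k(s_{ij})}{\sum_{w_l \in \Omega} h_l(s_{ij})}}_{\text{TF}}
\;\log\!\left(
\frac{|I|}{\sum_{I_p \in I} \min\!\bigl(1, \sum_{q} h_k(s_{pq})\bigr)}
\right)
```

where $h_k(s_{ij})$ counts n-gram $w_k$ in sentence $s_{ij}$ and $\Omega$ is the n-gram vocabulary. The TF denominator $\sum_{w_l \in \Omega} h_l(s_{ij})$ is the normalization that the linked line in `cider_scorer.py` appears to drop.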
## Code snippet

https://github.com/huggingface/transformers/blob/11afab19c0e4b652855f9ed7f82aa010c4f14754/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L1792-L1800

## Related issue

https://github.com/hiyouga/LLaMA-Factory/issues/6910

## Solution

Modify line 1792 to `self.rope_deltas = rope_deltas.to(cache_position.device)`:

```python
position_ids, rope_deltas = self.get_rope_index(
    input_ids,
    image_grid_thw,
    video_grid_thw,
    second_per_grid_ts,
    attention_mask,
)
self.rope_deltas =...
```
### Reminder

- [x] I have read the above rules and searched the existing issues.

### Description

As mentioned in https://github.com/hiyouga/LLaMA-Factory/issues/6844#issuecomment-2644439667, the cutoff of multimodal sequences should not remove the...