JJJYmmm

Results 18 issues of JJJYmmm

I tried to fix test.py, which was mentioned in #7

Hi Shariatnia, thanks for your tutorial! I have a question about the variable `max_len`. I first see `max_len` in the class Tokenizer, and I think its role is to limit the maximum...

There seems to be a problem in train.py:

```
total_steps = (len(trainloader) // args.batch_size + 1) * args.epoches
```

`len(trainloader)` is already divided by batch_size. Change it to `total_steps =...
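The double division can be checked with plain arithmetic. The sketch below assumes `trainloader` behaves like a PyTorch `DataLoader`, whose length is the number of batches rather than the number of samples. The helper `num_batches` and the sample sizes are hypothetical, and because the replacement formula in the issue is cut off, `assumed_total` is only one plausible reading (steps per epoch times epochs), not necessarily the author's fix.

```python
import math


def num_batches(dataset_size, batch_size, drop_last=False):
    """Hypothetical stand-in for len(trainloader): batches per epoch."""
    if drop_last:
        return dataset_size // batch_size
    return math.ceil(dataset_size / batch_size)


# Illustrative numbers only; they are not taken from the repository.
dataset_size, batch_size, epochs = 1000, 32, 10

# Already divided by batch_size once.
steps_per_epoch = num_batches(dataset_size, batch_size)

# Formula quoted in the issue: divides by batch_size a second time.
buggy_total = (steps_per_epoch // batch_size + 1) * epochs

# Assumed correction: one optimizer step per batch, per epoch.
assumed_total = steps_per_epoch * epochs

print(steps_per_epoch, buggy_total, assumed_total)
```

With these numbers the quoted formula yields far fewer steps than are actually run, which would matter for anything driven by `total_steps`, such as a learning-rate schedule.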

Fix variable naming errors, embedding_dim and n_embed

The transformer takes in the quantized image tokens generated by VQGAN, whose codebook has indices (0 to n_embed-1), and the transformer's sos token is also set to zero by default. Could you tell...

In the original paper, the TF is defined as below. ![image](https://github.com/user-attachments/assets/f782b5c4-11c3-4f74-b6ee-8694a8bd621f) But in the code, the denominator seems to be ignored. https://github.com/tylin/coco-caption/blob/3a9afb2682141a03e1cdc02b0df6770d2c884f6f/pycocoevalcap/cider/cider_scorer.py#L124
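One observation that may bear on this question, offered as a sketch and not as the maintainers' stated rationale: the TF denominator is a single constant per sentence, and a plain cosine similarity is unchanged when either vector is scaled by a positive constant. The code below compares raw n-gram counts with counts normalized by their total. The helper names and example sentences are hypothetical, the IDF weight is omitted, and any extra steps the linked scorer applies (which could break this invariance) are not modeled.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def tf_normalized(counts):
    """Divide each count by the total number of n-grams in the sentence."""
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def cosine(u, v):
    """Plain cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[g] * v.get(g, 0.0) for g in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


# Hypothetical candidate and reference captions, unigrams only.
cand = ngram_counts("a cat sits on a mat".split(), 1)
ref = ngram_counts("a cat is on the mat".split(), 1)

raw_sim = cosine(dict(cand), dict(ref))
norm_sim = cosine(tf_normalized(cand), tf_normalized(ref))

print(raw_sim, norm_sim)
```

Under these assumptions both similarities agree up to floating-point error, so dropping the denominator would not change a pure cosine score.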

## Code snippet

https://github.com/huggingface/transformers/blob/11afab19c0e4b652855f9ed7f82aa010c4f14754/src/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py#L1792-L1800

## Related issue

https://github.com/hiyouga/LLaMA-Factory/issues/6910

## Solution

Modify line 1792 to `self.rope_deltas = rope_deltas.to(cache_position.device)`

```python
position_ids, rope_deltas = self.get_rope_index(
    input_ids,
    image_grid_thw,
    video_grid_thw,
    second_per_grid_ts,
    attention_mask,
)
self.rope_deltas =...
```

### Reminder

- [x] I have read the above rules and searched the existing issues.

### Description

As mentioned in https://github.com/hiyouga/LLaMA-Factory/issues/6844#issuecomment-2644439667, the cutoff of a multimodal sequence should not remove the...

enhancement
pending