Xinlong Yang comments

Results: 9 comments by Xinlong Yang

Question about the number of tokens a video occupies

There is also a merge along the temporal dimension, so the actual count is 200 * 144, i.e. 28800; Qwen2-VL merges along all three dimensions.
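The arithmetic behind 200 * 144 can be sketched as follows. This is a minimal illustration, not code from Qwen2-VL itself: the function name `video_token_count` is hypothetical, the defaults (`patch_size=14`, `temporal_patch_size=2`, `merge_size=2`) are assumed to match the Qwen2-VL image processor defaults, and the 400-frame, 336x336 input is an assumed example chosen only because it reproduces the numbers in the comment.

```python
import math


def video_token_count(num_frames, height, width,
                      patch_size=14, temporal_patch_size=2, merge_size=2):
    """Estimate visual tokens for a video (hypothetical helper).

    Assumes frames are grouped in pairs along time (temporal merge) and
    patches are merged in merge_size x merge_size blocks along height and width.
    """
    # Temporal merge: every `temporal_patch_size` frames become one time step.
    grid_t = math.ceil(num_frames / temporal_patch_size)
    # Spatial patching: one patch per patch_size x patch_size pixels.
    grid_h = height // patch_size
    grid_w = width // patch_size
    # Spatial merge: merge_size x merge_size patches become one token.
    tokens_per_step = (grid_h // merge_size) * (grid_w // merge_size)
    return grid_t * tokens_per_step


# Assumed example: 400 frames at 336x336 gives 200 time steps of 144 tokens each.
print(video_token_count(400, 336, 336))
```

Without the temporal merge the same input would count 400 * 144 tokens, which is the discrepancy the comment addresses.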

Why does training eagle with my own data perform worse than medusa

Maybe the output is too short (...

Is there more detailed documentation for HF AutoTP training?

> I don't see `get_tensor_model_parallel_group()` used in either transformers or accelerate in the context of deepspeed (just megatron in accelerate), so I'm not sure how this should work in the...

Is there more detailed documentation for HF AutoTP training?

> > I don't see `get_tensor_model_parallel_group()` used in either transformers or accelerate in the context of deepspeed (just megatron in accelerate), so I'm not sure how this should work in...

Error when testing CMC and mAP with Qwen2-VL-7B-Instruct

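Is the image batch size in your loop 11? It looks like the grid_t dimension does not line up. Normally the input needs to be interpolated (padded) up to 12 so that it is divisible by 2, which gives (6,2,3,8,2,14,8,2,14). Can you check whether your _preprocess() function contains the code segment below? ![Image](https://github.com/user-attachments/assets/f7d8ae1c-b0dd-408d-a365-1c6611da773b)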
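To make the expected shape concrete, here is a small sketch of how a tuple like (6,2,3,8,2,14,8,2,14) can be derived. It is an illustration under assumptions, not the library's `_preprocess()` code: `patch_reshape_dims` is a hypothetical name, the defaults are assumed to match Qwen2-VL's `patch_size`, `temporal_patch_size`, and `merge_size`, and the 224x224 RGB input is inferred from the 8 * 2 * 14 factors in the tuple.

```python
def patch_reshape_dims(num_frames, channels, height, width,
                       patch_size=14, temporal_patch_size=2, merge_size=2):
    """Return the 9-D reshape target for a stack of frames (hypothetical helper)."""
    # Pad the frame count up to a multiple of temporal_patch_size (11 -> 12).
    remainder = num_frames % temporal_patch_size
    if remainder:
        num_frames += temporal_patch_size - remainder
    grid_t = num_frames // temporal_patch_size
    grid_h = height // patch_size
    grid_w = width // patch_size
    return (
        grid_t, temporal_patch_size, channels,
        grid_h // merge_size, merge_size, patch_size,
        grid_w // merge_size, merge_size, patch_size,
    )


# Assumed input: 11 RGB frames of 224x224 pixels.
print(patch_reshape_dims(11, 3, 224, 224))
```

If the padding step is missing, 11 frames cannot be split into pairs, which is consistent with the grid_t mismatch described above.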

Error when testing CMC and mAP with Qwen2-VL-7B-Instruct

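> > Is the image batch size in your loop 11? It looks like the grid_t dimension does not line up. Normally the input needs to be interpolated (padded) up to 12 so that it is divisible by 2, which gives (6,2,3,8,2,14,8,2,14). Can you check whether your _preprocess() function contains the code segment below?
> >
> > ![Image](https://github.com/user-attachments/assets/f7d8ae1c-b0dd-408d-a365-1c6611da773b)
> >
> > Hi, I also noticed that the grid has three dimensions, but I don't really understand how these three values should be handled.

These values are computed from the image size and the number of images. Qwen2-VL treats a single image as two consecutive identical frames. It originally reads images one at a time and duplicates each one (treating it as two consecutive identical frames), whereas you are feeding the batched images in directly, so they are processed as a single whole and the dimensions come out wrong. The usual fix is to split the images in your batch into a list, pass it in with each element being a single image, and let Qwen2-VL handle the rest.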
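The difference between the two input styles can be sketched with plain arithmetic. This is a rough illustration under assumptions, not the actual processor: `grid_thw` and `token_count` are hypothetical helpers, the defaults are assumed to match Qwen2-VL, and the batch of 11 images at 224x224 is an assumed example. The real behavior should be checked against the installed transformers version.

```python
import math


def grid_thw(num_frames, height, width, patch_size=14, temporal_patch_size=2):
    """Grid (t, h, w) for one input treated as a frame stack (hypothetical helper)."""
    grid_t = math.ceil(num_frames / temporal_patch_size)
    return (grid_t, height // patch_size, width // patch_size)


def token_count(grid, merge_size=2):
    """Visual tokens after the merge_size x merge_size spatial merge."""
    t, h, w = grid
    return t * (h // merge_size) * (w // merge_size)


batch_size, height, width = 11, 224, 224

# Batched tensor passed as one input: the 11 images are paired up like video frames.
stacked = grid_thw(batch_size, height, width)

# Batch split into a list: each image is handled alone and duplicated into 2 frames.
per_image = [grid_thw(1, height, width) for _ in range(batch_size)]

print(stacked, token_count(stacked))
print(per_image[0], sum(token_count(g) for g in per_image))
```

In this sketch the stacked input ends up with one grid for the whole batch, while the list input yields one grid per image, which is what downstream code expecting per-image features needs.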

Xinlong Yang

Question about the number of tokens a video occupies

Why does training eagle with my own data perform worse than medusa

Is there more detailed documentation for HF AutoTP training?

Is there more detailed documentation for HF AutoTP training?

Error when testing CMC and mAP with Qwen2-VL-7B-Instruct

Error when testing CMC and mAP with Qwen2-VL-7B-Instruct

What is the data format of the training data?

Inquiry about attention mask used for EAGLE-3 Training

Qwen2_5_VL_72B: question about the number of input tokens