JJJYmmm
Auto-regressive learning needs to remove the last one (the prediction at the final position). For example, for the sequence [a, b, c, d, e], the decoder's input is [bos, a, b, c, d, e] and the desired output is [a, b, c, d, e], i.e., the input shifted by one position: each position is trained to predict the next token, so the prediction made at the last position has no target and is dropped.
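A minimal sketch of that shift, with made-up token ids and a stand-in embedding + linear layer in place of a real decoder:

```python
import torch
import torch.nn.functional as F

bos, vocab_size, d_model = 0, 16, 8
seq = torch.tensor([2, 3, 4, 5, 6])                      # stands for [a, b, c, d, e]
decoder_input = torch.cat([torch.tensor([bos]), seq])    # [bos, a, b, c, d, e]

# stand-in "decoder": embedding + linear, just enough to get per-position logits
embed = torch.nn.Embedding(vocab_size, d_model)
head = torch.nn.Linear(d_model, vocab_size)
logits = head(embed(decoder_input))                      # (T + 1, vocab_size)

# each position predicts the *next* token, so the prediction at the last
# position has no ground-truth target and is removed before the loss
loss = F.cross_entropy(logits[:-1], seq)                 # targets: [a, b, c, d, e]
print(loss.item())
```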
attn.weight (the projection matrix) is independent of T (or block_size).
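A small sketch (nanoGPT-style names like `c_attn` / `n_embd` are assumed here) showing that the projection weights have no T dimension, so the same module handles any sequence length up to block_size:

```python
import torch
import torch.nn as nn

n_embd, block_size = 32, 128
c_attn = nn.Linear(n_embd, 3 * n_embd)       # weight shape: (3*n_embd, n_embd)
c_proj = nn.Linear(n_embd, n_embd)
print(c_attn.weight.shape)                   # torch.Size([96, 32]) -- no T anywhere

for T in (4, 16, 128):                       # any T <= block_size works unchanged
    x = torch.randn(1, T, n_embd)
    q, k, v = c_attn(x).split(n_embd, dim=2)
    att = (q @ k.transpose(-2, -1)) / (n_embd ** 0.5)
    mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
    att = att.masked_fill(~mask, float("-inf")).softmax(dim=-1)
    y = c_proj(att @ v)
    print(T, y.shape)
```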
I think it's the last-layer embedding (hidden_states, before the logits) corresponding to the `<SEG>` token. You can reference LISA: https://github.com/dvlab-research/LISA.
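A rough sketch of how that embedding could be pulled out with a Hugging Face-style model; the function name and the assumption that `<SEG>` was added to the tokenizer are illustrative, not LISA's actual code:

```python
import torch

def seg_token_embedding(model, tokenizer, input_ids: torch.Tensor) -> torch.Tensor:
    seg_id = tokenizer.convert_tokens_to_ids("<SEG>")   # assumes <SEG> was added to the vocab
    out = model(input_ids=input_ids, output_hidden_states=True)
    last_hidden = out.hidden_states[-1]                 # (batch, seq_len, hidden), before lm_head
    mask = input_ids == seg_id                          # positions of the <SEG> token
    return last_hidden[mask]                            # (num_seg_tokens, hidden)
```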
Another question: when computing the loss in `AdjustLabelSmoothedCrossEntropyCriterion`, `sample_patch_num` is added into the model input (sample[0], which I think corresponds to sample_v1, the vision-language data): https://github.com/OFA-Sys/OFA/blob/a36b91ce86ff105ac8d9e513aa88f42b85e33479/criterions/label_smoothed_cross_entropy.py#L177-L178 It seems that `sample_patch_num` can...
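For illustration only (this is not OFA's actual implementation), the sketch below shows the kind of operation a `sample_patch_num` hyper-parameter usually controls: randomly keeping a subset of image patch tokens during training to shorten the sequence:

```python
import torch

def subsample_patches(patch_embeds: torch.Tensor, sample_patch_num: int) -> torch.Tensor:
    """patch_embeds: (batch, num_patches, dim) -> (batch, sample_patch_num, dim)."""
    bsz, num_patches, _ = patch_embeds.shape
    if sample_patch_num <= 0 or sample_patch_num >= num_patches:
        return patch_embeds
    # pick a random subset of patch positions per sample
    idx = torch.stack([torch.randperm(num_patches)[:sample_patch_num] for _ in range(bsz)])
    idx = idx.unsqueeze(-1).expand(-1, -1, patch_embeds.size(-1))
    return torch.gather(patch_embeds, 1, idx)
```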
I have the same problem. : )
Maybe it's just a coordinate normalization operation in both training and prediction. However, when using `bin2coord`, it can cause the coordinates to go out of the image (since `task.cfg.max_image_size >= task.cfg.patch_image_size`).
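A sketch of a `bin2coord`-style conversion (parameter names assumed from the discussion above, not the repo's exact code) showing why the result can land outside the image when `max_image_size` exceeds the actual image extent:

```python
def bin2coord(bins, num_bins, max_image_size, w_resize_ratio, h_resize_ratio):
    """bins: [x0_bin, y0_bin, x1_bin, y1_bin], each in [0, num_bins - 1]."""
    coords = []
    for i, b in enumerate(bins):
        ratio = w_resize_ratio if i % 2 == 0 else h_resize_ratio
        # bins are de-normalized against max_image_size rather than the actual
        # image size, so when max_image_size > patch_image_size the coordinate
        # can fall outside the image and needs clipping to [0, w] / [0, h]
        coords.append(b / (num_bins - 1) * max_image_size / ratio)
    return coords
```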
> In my environment, this problem is caused by importing the llava package from the pip environment. So I solved the problem by adding the local llava folder to the...
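One possible way to force Python to pick up the local checkout instead of the pip-installed copy (the path below is a placeholder, adjust it to your clone):

```python
import sys
sys.path.insert(0, "/path/to/LLaVA")   # directory that contains the local `llava/` folder
import llava
print(llava.__file__)                  # should now point into the local repo
```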
Update `get_peft_state_maybe_zero_3` to save lora_bias correctly. https://github.com/haotian-liu/LLaVA/pull/1414/commits/418a53c8b7d283291ea383a9d4412f0403a2fd64
> Hello, I ran the test file and encountered the following problem, do you have any good solution?
>
> Traceback (most recent call last): File "C:/Users/fc747/Desktop/Pix2Seq-master/test.py", line 136, in...