Junyang Lin
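Greater than or equal to 3, I think.
The data may be wrong; I have not run into this situation myself. Please check whether your test set matches the LCSTS one.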
I think this may come from a conflict in the latest merge for prompt tuning, which added some `.size()` calls. Let us take some time and fix it as soon as...
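We are going to release those prompts, but I can share them here right now. For visual grounding, we use `这段文字" {} "描述的是哪个区域?` (roughly, "Which region does the text '{}' describe?"), and for captioning, we use `图片描述了什么?` (roughly, "What does the image describe?"). You can continue...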
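As a minimal sketch of how the two prompts quoted above could be filled in: the `{}` in the visual grounding prompt is a placeholder for the referring expression, so plain Python string formatting is enough. The constant and function names here are hypothetical and are not taken from the OFA codebase; only the prompt strings come from this comment.

```python
# Prompt strings copied from the comment above.
# The names below are hypothetical helpers, not part of the OFA codebase.
GROUNDING_PROMPT = '这段文字" {} "描述的是哪个区域?'
CAPTION_PROMPT = '图片描述了什么?'


def build_grounding_prompt(expression: str) -> str:
    """Insert a referring expression into the visual grounding prompt."""
    return GROUNDING_PROMPT.format(expression)


def build_caption_prompt() -> str:
    """Return the captioning prompt, which has no placeholder to fill."""
    return CAPTION_PROMPT


if __name__ == "__main__":
    # Example referring expression ("the dog on the left"), chosen for illustration.
    print(build_grounding_prompt("左边的狗"))
    print(build_caption_prompt())
```

Note that the spaces around `{}` are part of the prompt as quoted, so the sketch keeps them rather than stripping them.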
Which checkpoint did you use? Pretrained checkpoints? How about using finetuned checkpoints to check the results of visual grounding? (Or later I can build a demo for OFA-CN) This is...
This is the RefCOCO series, translated into Chinese by our in-house translation system. We'll see if it is possible to release it.
Does the original code work well now? Could you give more details about the changes you made? Or you can share your Colab notebook with us.
I think the problem mainly comes from the position embedding of the image. See this:  Long story short, I pretrained the huge model with the resolution of...
BTW, the code is in `unify_transformer.py`.
Just for inference? I think it should be fine for a base model, whose compute cost is slightly higher than that of BERT-base.