Junyang Lin

Results 173 comments of Junyang Lin

Wait, wait... Aren't you using `train_caption_stage1.sh` instead of `train_caption_stage1_base.sh`? I think the script is the cause. The arch in `train_caption_stage1.sh` is `ofa_large`, so you can't load a base...

Try gradient accumulation with `--update-freq`
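As a rough illustration of why gradient accumulation helps when memory is tight, here is a toy sketch in plain Python. It is not the trainer's actual implementation; the linear model, the data, and the helper names `grad_mse` and `accumulated_grad` are all made up for the example. It only shows that averaging the gradients of equally sized micro-batches gives the same gradient as one large batch.

```python
# Toy illustration of gradient accumulation (pure Python, no framework).
# Assumed model: y_hat = w * x, with mean squared error as the loss.

def grad_mse(w, batch):
    """Gradient of the mean squared error with respect to w over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average the gradients of equally sized micro-batches before a single update."""
    chunks = [
        data[i:i + micro_batch_size]
        for i in range(0, len(data), micro_batch_size)
    ]
    return sum(grad_mse(w, chunk) for chunk in chunks) / len(chunks)


# Hypothetical (x, y) pairs, chosen only for the demonstration.
data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]
w = 0.5

full = grad_mse(w, data)                               # one batch of 4
accum = accumulated_grad(w, data, micro_batch_size=2)  # two micro-batches of 2

print(full, accum)
```

With an update frequency of 2, each step sees the gradient of a batch twice as large while only one micro-batch has to fit in memory at a time.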

`device_map='auto'` automatically enables your model to run on multiple GPUs. If you would like to use only 1 GPU, you can set `device` or set the environment variable like...

Being frozen is quite necessary. I would prefer that people finish the setup first, and then run the whole task (now it seems that everything is still a single...

Sorry, we do not have the permission.

Create a JSON file for your label set dictionary, or use the one I just uploaded.
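A minimal sketch of writing such a file with Python's standard `json` module. The exact schema the code expects is not stated here, so the label-to-index mapping, the label names, and the file name `label_set.json` are all assumptions; check the uploaded file for the real format.

```python
import json
import os
import tempfile

# Hypothetical label set; replace with your own labels.
labels = ["cat", "dog", "bird"]

# Assumed schema: each label maps to an integer index.
label2id = {label: idx for idx, label in enumerate(labels)}

# Written to a temporary directory here; use your own path in practice.
path = os.path.join(tempfile.gettempdir(), "label_set.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(label2id, f, ensure_ascii=False, indent=2)

# Read it back to confirm the file is valid JSON.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded)
```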

You can do this by modifying either the configuration file or the code.

This may be related to your Python version. How about commenting out this line and setting the term width to some arbitrary value, such as 80?
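One possible way to avoid hard-coding the value entirely is sketched below. It assumes the failing line was querying the terminal size, which the comment does not confirm; the helper name `get_term_width` is hypothetical. `shutil.get_terminal_size` is part of the standard library and returns the fallback when the size cannot be determined.

```python
import shutil

DEFAULT_TERM_WIDTH = 80  # the arbitrary fallback width suggested above


def get_term_width(default=DEFAULT_TERM_WIDTH):
    """Return the terminal width, or `default` if it cannot be determined."""
    return shutil.get_terminal_size(fallback=(default, 24)).columns


term_width = get_term_width()
print(term_width)
```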

`schesamp` refers to scheduled sampling, and `schedule` refers to the schedule for learning rate decay.

All of them are randomly initialized, with no pretraining.