dinglei8908
> Try gradient accumulation with `--update-freq`

Thanks for the suggestion. The maximum mini-batch size I can fit for the caption task is 2 on a V100 and 4 on an A100. Does this make sense?
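For reference, here is a rough sketch of the arithmetic behind gradient accumulation. It assumes the effective batch size per optimizer step is the per-GPU batch size × the number of GPUs × `--update-freq`. The target of 64 samples and the helper name `update_freq_for` are made up for illustration and are not values from this thread.

```python
import math


def update_freq_for(target_batch, per_gpu_batch, num_gpus=1):
    """Smallest accumulation count whose effective batch reaches target_batch."""
    return math.ceil(target_batch / (per_gpu_batch * num_gpus))


# Hypothetical target of 64 samples per optimizer step on a single GPU.
for gpu, per_gpu_batch in [("V100", 2), ("A100", 4)]:
    steps = update_freq_for(64, per_gpu_batch)
    print(f"{gpu}: --update-freq {steps} -> effective batch {per_gpu_batch * steps}")
```

With a mini-batch this small, the accumulation count has to be large to reach a typical batch size, so each optimizer step takes proportionally longer.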