mPLUG
Question about baseline reward in `caption_mplug_scst.py`
The code in this repo computes the baseline reward by averaging the rewards of the generated captions. However, the original SCST formulation, as well as some other SCST implementations (e.g., in VALOR), computes the baseline reward from the caption generated by greedy search. Is there a reference or explanation for the implementation used in this repo? I would really appreciate any help.
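To make the difference concrete, here is a minimal sketch of the two baselines as I understand them. It is not taken from `caption_mplug_scst.py`. The function names, the list-of-lists reward layout (one row per image, one entry per generated caption), and the numbers are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: names, shapes, and reward values are assumptions,
# not code from this repo or from VALOR.

def advantages_greedy_baseline(sample_rewards, greedy_rewards):
    """Original-SCST style: subtract the reward of the greedy-decoded caption."""
    return [
        [r - greedy for r in row]
        for row, greedy in zip(sample_rewards, greedy_rewards)
    ]

def advantages_mean_baseline(sample_rewards):
    """Averaging style: subtract the mean reward of the captions generated for the same image."""
    advantages = []
    for row in sample_rewards:
        baseline = sum(row) / len(row)
        advantages.append([r - baseline for r in row])
    return advantages

# Hypothetical rewards (e.g. CIDEr) for 2 images with 3 generated captions each.
sample_rewards = [[1.0, 0.5, 0.9], [0.2, 0.4, 0.6]]
# Hypothetical rewards of the greedy caption for each image.
greedy_rewards = [0.8, 0.3]

adv_greedy = advantages_greedy_baseline(sample_rewards, greedy_rewards)
adv_mean = advantages_mean_baseline(sample_rewards)
```

With the averaging baseline the advantages for each image sum to zero and no extra greedy decoding pass is needed, whereas the greedy baseline compares each generated caption against the model's test-time output.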