TCL
Question about VQA fine-tuning
Hi Jinyu,
Thanks for sharing the code for TCL, it is great work. I have some questions about the code in model_vqa.py.
1. Top-k answers for each question: shouldn't the code be answer_ids[b] and answer_atts[b]?
2. Use of the text decoder: given targets_ids = input_ids.masked_fill(input_ids == self.tokenizer.pad_token_id, -100), input_ids is almost the same as targets_ids except at the pad token positions. So what's the point of calculating the loss and generating the answer a second time?
Thanks!
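To make question 2 concrete, here is a minimal sketch of what that masking line does, using plain Python lists in place of tensors. The token ids, `PAD_ID`, and both helper functions are made up for illustration and are not taken from model_vqa.py. The one-position shift between inputs and targets is an assumption based on how Hugging Face-style causal LM heads usually compute their loss, so it may not match what TCL actually does.

```python
PAD_ID = 0            # hypothetical pad token id, not the real tokenizer value
IGNORE_INDEX = -100   # PyTorch's CrossEntropyLoss skips this label by default


def mask_pad(input_ids, pad_id=PAD_ID, ignore_index=IGNORE_INDEX):
    """List-based stand-in for input_ids.masked_fill(input_ids == pad_id, -100)."""
    return [[ignore_index if tok == pad_id else tok for tok in row]
            for row in input_ids]


def shifted_pairs(input_row, target_row, ignore_index=IGNORE_INDEX):
    """Pair each input token with the NEXT target token (assumed decoder shift),
    dropping positions whose target is the ignore index."""
    return [(inp, tgt)
            for inp, tgt in zip(input_row[:-1], target_row[1:])
            if tgt != ignore_index]


# Two made-up answer sequences, right-padded with PAD_ID.
input_ids = [[101, 7, 8, 102, 0, 0],
             [101, 9, 102, 0, 0, 0]]

targets_ids = mask_pad(input_ids)
pairs = shifted_pairs(input_ids[0], targets_ids[0])

print(targets_ids)  # same ids as input_ids, with padding replaced by -100
print(pairs)        # (input token, next-token target) pairs that reach the loss
```

Under these assumptions, targets_ids differs from input_ids only at the padding, but each position is scored against the following token rather than against itself, so the loss is not a trivial copy task.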