sbkim052
I can't either, sorry.
Hi @wyp19930313, I have the same two questions you asked. Have you found the answers?
@1icas Did you find it?
Thank you very much. I have an additional question. In your code, I found that the top-down model is implemented with a Transformer. Is there any implemented pretrained model...
Then what is the Transformer used for? I thought the Transformer was used for the top-down model.
In the Colab code, my understanding is that after the features are extracted with maskrcnn, the feature vector goes through the encoder of the Transformer. After that, each word is generated by...
I thought this image captioning model in the Colab works by 1) extracting 100 detection boxes with maskrcnn and 2) generating words with an attention model (Transformer). With the above process, I thought this...