image-captioning issues

Ensemble code and how to use senet154

1

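Hello, could you provide the ensemble code? Also, when the paper says senet154 is used as the backbone, does that mean the Faster R-CNN backbone was replaced with SENet, or that senet154 was used directly to extract features? Are pre-extracted features available?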

searchstorm

How to generate the coco_train_input.pkl and coco_train_target.pkl?

xuyueyuanx

How to generate the visualization of attended image regions during the caption generation process?

3

I have a question about how to generate the visualization of attended image regions during the caption generation process. Would you mind releasing the code for it?

Archer-Fang
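
For readers with the same question, here is a minimal sketch of one common way to draw such a visualization. It is not the repository's code: it assumes you can already read out, for each generated word, one attention weight per image region together with that region's bounding box. The function name `attention_heatmap` and its parameters are hypothetical.

```python
def attention_heatmap(boxes, weights, height, width):
    """Spread each region's attention weight over its bounding box.

    boxes   : list of (x1, y1, x2, y2) pixel coordinates, one per region
    weights : attention weights for ONE decoding step, one per region
    Returns a height x width grid (list of rows) scaled so the peak is 1.
    """
    if len(boxes) != len(weights):
        raise ValueError("need exactly one weight per box")
    heat = [[0.0] * width for _ in range(height)]
    for (x1, y1, x2, y2), w in zip(boxes, weights):
        # Clip the box to the image so indexing stays in range.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(width, int(x2)), min(height, int(y2))
        for y in range(y1, y2):
            row = heat[y]
            for x in range(x1, x2):
                row[x] += w  # overlapping regions accumulate
    peak = max(max(row) for row in heat)
    if peak > 0:
        heat = [[v / peak for v in row] for row in heat]
    return heat
```

Calling this once per generated word gives one map per word; each map can then be drawn over the image as a semi-transparent layer with any plotting library.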

About the config of the updown model

Thank you very much for making the code open source. I can find 4 configs for xlan and transformer in the experiment folder. Can you provide the basic model (updown) of...

WangLanxiao

Test results are all 0 when using the author's checkpoint

2

1) I used the provided caption_model_47.pth from the xlan experiment and the following commands to run the test. However, all test metrics are 0 when I used the def decode_beam which...

XuMengyaAmy

How to view the visualization?

yrr12

Can I use your RL technique on a model that takes a global image feature as input?

The model I'm training is fed a single global feature for each image; it doesn't use local region-based features. Do you think I can apply your technique to...

vinevix
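
A short sketch may help frame this question. It assumes the RL technique meant here is a self-critical policy-gradient objective with a greedy-decoding baseline; the issue itself does not name the method, so treat that as an assumption. The function `self_critical_loss` is hypothetical and uses plain floats only to show the arithmetic; in real training the log-probabilities would be autograd tensors.

```python
def self_critical_loss(sample_logprobs, sample_reward, greedy_reward):
    """Policy-gradient loss for one sampled caption with a greedy baseline.

    sample_logprobs : log-probabilities of the words in the sampled caption
    sample_reward   : sentence-level score (e.g. CIDEr) of the sampled caption
    greedy_reward   : score of the greedy-decoded caption, used as baseline
    """
    # Positive advantage: the sample beat the greedy caption, so minimizing
    # the loss raises the probability of the sampled words.
    advantage = sample_reward - greedy_reward
    return -advantage * sum(sample_logprobs)
```

Under that assumption, the objective depends only on the decoder's word log-probabilities and on caption-level rewards, not on whether the encoder input is a global feature or a set of region features.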

Vocabulary from test split

Hi! Thanks for the paper and the available code. I have what may be a stupid question, but I didn't find a straight answer to it anywhere: When evaluating...

gondimjoaom

Unexpected results are showing, please help. xlan_rl

[INFO: 2024-03-16 05:55:12,296] ######## Epoch (VAL)2 ######## [INFO: 2024-03-16 05:55:12,296] {'Bleu_1': 0.10388235294117525, 'Bleu_2': 0.015665952215665777, 'Bleu_3': 0.0023566989464306317, 'Bleu_4': 1.1693743696593336e-07, 'METEOR': 0.06880237033174463, 'ROUGE_L': 0.10528904602185862, 'CIDEr': 0.05029230331138939, 'SPICE': 0.04714478772067636} [INFO: 2024-03-16 06:00:21,652]...

shamimsareem

Ddp
