a-PyTorch-Tutorial-to-Image-Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
Hello, can this model detect and recognize text in images that contain text?
The model was trained on Flickr8k, but the results achieve only about half the BLEU-4 score reported by the authors (about 0.14-0.15). I have not modified any parameters in train.py. May...
How to solve this problem in eval.py?
I encounter a bug, "**ValueError: max() arg is an empty sequence**", when I run **caption.py**. I find that the parameters **complete_inds** and **complete_seqs_scores** are empty during the...
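A common workaround for this error is to fall back to the unfinished hypotheses when none of the beams ever emits the `<end>` token before the step limit, so that `max()` is never called on an empty list. Below is a minimal sketch of such a guard; the variable names (`seqs`, `top_k_scores`, `complete_seqs`, `complete_seqs_scores`) follow the tutorial's `caption_image_beam_search` and may differ in a modified script.

```python
# Hypothetical guard placed right after the beam-search loop in caption.py.
# If no hypothesis reaches <end> within the step limit, complete_seqs_scores
# stays empty and max() raises "ValueError: max() arg is an empty sequence";
# falling back to the best partial hypothesis avoids that.
if len(complete_seqs_scores) == 0:
    complete_seqs = seqs.tolist()                        # keep the partial sequences
    complete_seqs_scores = top_k_scores.flatten().tolist()

i = complete_seqs_scores.index(max(complete_seqs_scores))
seq = complete_seqs[i]                                   # best (possibly unfinished) caption
```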
Traceback (most recent call last):
  File "D:\majority_design\image_caption\ic_train.py", line 322, in <module>
    main()
  File "D:\majority_design\image_caption\ic_train.py", line 116, in main
    train(train_loader=train_loader,
  File "D:\majority_design\image_caption\ic_train.py", line 180, in train
    loss = criterion(scores, targets)
  File "D:\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py",...
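Without the rest of the traceback it is hard to tell what failed inside `criterion(scores, targets)`, but a frequent cause is a shape or dtype mismatch between the two tensors. The sketch below is illustrative only (the sizes and dummy tensors are made up); it follows the tutorial's convention of packing both tensors before computing the loss.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

# Dummy decoder output and captions, purely to illustrate the expected shapes.
batch_size, max_len, vocab_size = 4, 10, 50
scores = torch.randn(batch_size, max_len, vocab_size)             # (batch, time, vocab)
caps_sorted = torch.randint(0, vocab_size, (batch_size, max_len + 1))
decode_lengths = [10, 8, 7, 5]                                     # caption lengths, sorted descending

criterion = nn.CrossEntropyLoss()

# Drop <start> from the targets, then strip the padded time steps from both
# tensors so scores and targets line up element for element.
targets = caps_sorted[:, 1:]
scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data
targets = pack_padded_sequence(targets, decode_lengths, batch_first=True).data

loss = criterion(scores, targets)  # scores: (sum(lengths), vocab), targets: (sum(lengths),)
```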
Silly question: I'm not following how to implement this model. Is there a step-by-step example notebook anywhere that I can review?
I have tried to download the dataset, but the link does not redirect anywhere. Do you have the dataset hosted somewhere else, or could you upload it to another place...
When I train a new model on the Flickr8k and Flickr30k datasets in my environment, I find that the **training loss is too high** (about 10) and the **BLEU-4 is too low** (about...
File "train.py", line 329, in main() File "train.py", line 116, in main epoch=epoch) File "train.py", line 184, in train loss += alpha_c * ((1. - alphas.sum(dim=1)) ** 2).mean() UnboundLocalError: local...