image-captioning
RuntimeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0
My images are 256x256 pixels.
/content/image-captioning
2023-09-02 18:30:18.889829: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Device: cuda:0
Images found: 263
Split size: 263
Checkpoint loading...
load checkpoint from ./checkpoints/model_large_caption.pth
Model to cuda:0
Inference started
0batch [00:01, ?batch/s]
Traceback (most recent call last):
File "/content/image-captioning/inference.py", line 88, in
I found a solution, at least for my own setup. It might be a version mismatch in some of the modules. Check the requirements file: if any of those modules are newer on your system than the versions listed there, a deprecated or changed feature may be preventing this from running.
I had this same exact error and fixed it with this command:
pip install timm==0.4.12 transformers==4.17.0 fairscale==0.4.4 pycocoevalcap pillow
pip found that my timm, transformers and fairscale were on newer versions, downgraded them, and it worked on the first try.
If you already use these packages for anything else, downgrading might break that, so it may not be worth it unless you really need this tool.
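For what it's worth, here is a rough sketch of how you could check for the mismatch before downgrading anything. The pins are the ones from the pip command above; `PINNED` and `find_mismatches` are names I made up for illustration, not part of this repo. It only compares version strings for an exact match, which is all the pinned `==` requirements need.

```python
from importlib import metadata

# Pins copied from the pip command in this thread (assumed to match the repo's requirements).
PINNED = {"timm": "0.4.12", "transformers": "4.17.0", "fairscale": "0.4.4"}


def find_mismatches(pinned=PINNED, get_version=metadata.version):
    """Return {package: (installed_or_None, wanted)} for each package that differs from its pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


if __name__ == "__main__":
    for name, (installed, wanted) in find_mismatches().items():
        print(f"{name}: installed {installed or 'not installed'}, pinned {wanted}")
```

If this prints nothing, the three packages already match the pins and the cause is probably something else.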
EDIT: This error also crops up if you set a batch size larger than the number of image files being processed.
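A simple guard for the batch size case, as a sketch only: cap the requested batch size at the number of images before starting inference. `effective_batch_size` is a hypothetical helper, not something in inference.py, and I am assuming you have the image count available (the script prints "Images found").

```python
def effective_batch_size(requested: int, num_images: int) -> int:
    """Cap the requested batch size so it never exceeds the number of images."""
    if requested < 1:
        raise ValueError("batch size must be at least 1")
    if num_images < 1:
        raise ValueError("no images to process")
    return min(requested, num_images)
```

For example, asking for a batch size of 16 with only 5 images would give you 5.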
Yeah, please do the pip install in an empty virtual env.