`UnpicklingError: invalid load key, 'v'.` when loading the model

Open SkalskiP opened this issue 1 year ago • 0 comments

I'm trying to run this code:

import torch
import torchvision.transforms as transforms
from models.tag2text import tag2text_caption, ram

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
IMAGE_SIZE = 384
CHECKPOINT_RAM = "ram_swin_large_14m.pth"

normalize = transforms.Normalize(
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225]
)
transform = transforms.Compose([
    transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)),
    transforms.ToTensor(),
    normalize
])

model_ram = ram(pretrained=CHECKPOINT_RAM, image_size=IMAGE_SIZE, vit='swin_l' )

model_ram.eval()
model_ram = model_ram.to(DEVICE)

And I get this exception:

/encoder/layer/0/crossattention/self/query is tied
/encoder/layer/0/crossattention/self/key is tied
/encoder/layer/0/crossattention/self/value is tied
/encoder/layer/0/crossattention/output/dense is tied
/encoder/layer/0/crossattention/output/LayerNorm is tied
/encoder/layer/0/intermediate/dense is tied
/encoder/layer/0/output/dense is tied
/encoder/layer/0/output/LayerNorm is tied
/encoder/layer/1/crossattention/self/query is tied
/encoder/layer/1/crossattention/self/key is tied
/encoder/layer/1/crossattention/self/value is tied
/encoder/layer/1/crossattention/output/dense is tied
/encoder/layer/1/crossattention/output/LayerNorm is tied
/encoder/layer/1/intermediate/dense is tied
/encoder/layer/1/output/dense is tied
/encoder/layer/1/output/LayerNorm is tied
--------------
ram_swin_large_14m.pth
--------------
---------------------------------------------------------------------------
UnpicklingError                           Traceback (most recent call last)
<ipython-input-16-8a868b36a5c6> in <cell line: 8>()
      6 # model_tag2text = model_tag2text.to(DEVICE)
      7 
----> 8 model_ram = ram(pretrained=CHECKPOINT_RAM, image_size=IMAGE_SIZE, vit='swin_l' )
      9 
     10 model_ram.eval()

3 frames
/usr/local/lib/python3.10/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
   1031             "functionality.")
   1032 
-> 1033     magic_number = pickle_module.load(f, **pickle_load_args)
   1034     if magic_number != MAGIC_NUMBER:
   1035         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, 'v'.
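
A possible cause, offered as a guess rather than a confirmed diagnosis: `invalid load key, 'v'` means the first byte of the file is the letter `v`, which is not how a PyTorch checkpoint starts. A Git LFS pointer file begins with the text `version https://git-lfs...`, so the download may have saved a small pointer file instead of the actual weights. The sketch below inspects the first bytes of the file to tell these cases apart. `describe_checkpoint` and its return labels are hypothetical names made up for this illustration, not part of the recognize-anything repo or of PyTorch.

```python
def describe_checkpoint(path, n_bytes=64):
    """Guess what kind of file `path` is from its first few bytes.

    Hypothetical helper for debugging; the returned labels are arbitrary.
    """
    with open(path, "rb") as f:
        head = f.read(n_bytes)

    # Git LFS pointer files are short text files starting with "version ...".
    if head.startswith(b"version https://git-lfs"):
        return "git-lfs-pointer"

    # An HTML page (for example an error or login page) saved under the .pth name.
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "html"

    # Checkpoints written by recent torch.save versions are zip archives.
    if head.startswith(b"PK\x03\x04"):
        return "zip"

    # Older torch.save output is a raw pickle stream (protocol 2 or later).
    if head.startswith(b"\x80"):
        return "legacy-pickle"

    return "unknown"
```

If `describe_checkpoint(CHECKPOINT_RAM)` returns `"git-lfs-pointer"` or `"html"`, the download step probably saved something other than the weights. A file size of only a few hundred bytes from `os.path.getsize(CHECKPOINT_RAM)` would point the same way.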

I'm running my code in Google Colab. Here is the link to my code: https://colab.research.google.com/drive/155jbQL31PrKxRrEq0V8TSs8KdreYitZC?usp=sharing

It would be awesome if you could help me set it up in a notebook. I'm thinking of making a tutorial about it.

SkalskiP · Jun 09 '23 17:06