
ValueError: Tokenizer class InternLM2Tokenizer does not exist or is not currently imported.

Open threegold116 opened this issue 1 year ago • 3 comments

error information

Traceback (most recent call last):
  File "/home/sxjiang/project/InternVL/test_chat.py", line 13, in <module>
    tokenizer = AutoTokenizer.from_pretrained(path)
  File "/home/sxjiang/miniconda3/envs/internvl/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 784, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class InternLM2Tokenizer does not exist or is not currently imported.

test_chat.py

import torch
from PIL import Image
from transformers import AutoModel, CLIPImageProcessor
from transformers import AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True).eval().cuda()

tokenizer = AutoTokenizer.from_pretrained(path)
image = Image.open('./examples/image2.jpg').convert('RGB')
image = image.resize((448, 448))
image_processor = CLIPImageProcessor.from_pretrained(path)

pixel_values = image_processor(images=image, return_tensors='pt').pixel_values
pixel_values = pixel_values.to(torch.bfloat16).cuda()

generation_config = dict(
    num_beams=1,
    max_new_tokens=512,
    do_sample=False,
)

question = "请详细描述图片"  # "Please describe the image in detail"
response = model.chat(tokenizer, pixel_values, question, generation_config)

envs

transformers           4.36.2

threegold116 avatar Apr 25 '24 13:04 threegold116

Hello, has this been resolved?

a798047815 avatar Apr 26 '24 04:04 a798047815

Upgrading the transformers version fixed it for me; I upgraded to 4.40.0 and it worked.
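
If it helps, a quick sanity check (a minimal sketch, assuming the 4.40.0 threshold mentioned above) to confirm which transformers version the active environment actually picks up:

import transformers
from packaging import version

# Print the version this Python environment resolves to; a stale environment or a
# second install is a common reason an upgrade appears not to take effect.
print(transformers.__version__)

# Hypothetical minimum, based on the comment above (4.40.0 worked for the commenter).
assert version.parse(transformers.__version__) >= version.parse("4.40.0")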

wanghanyang123 avatar Apr 26 '24 04:04 wanghanyang123

This happens because the InternLM2 tokenizer is not included in transformers. I recommend staying on transformers 4.36.2 and running the code from the model card: https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5#model-usage

Specifically, to fix the error above, load the tokenizer with tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True), i.e. add trust_remote_code=True.
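
For reference, a minimal corrected load based on the fix described above (the rest of test_chat.py stays unchanged):

from transformers import AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"

# trust_remote_code=True lets transformers load the custom InternLM2Tokenizer
# implementation shipped inside the model repository, instead of looking for it
# among the library's built-in tokenizer classes.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

Note that trust_remote_code executes Python code downloaded from the Hub repository, so it should only be enabled for repositories you trust.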

czczup avatar Apr 26 '24 17:04 czczup