InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Hello, are there plans to open-source the dataset of InternVL 1.5?
Differences from the official installation:
1. Since the server cannot connect to git, flash_attn was installed from the wheel flash_attn-2.3.6+cu118torch2.0cxx11abiFALSE-cp39-cp39-linux_x86_64.whl.
2. Since transformers-4.36.2 failed with an error that "InternLM2Tokenizer" could not be found, transformers-4.37.0 was installed instead.

Code:
```
import torch
from PIL import Image
from transformers import AutoModel, CLIPImageProcessor
from transformers import AutoTokenizer

path = "/home/yangsun/checkpoint/InternVL-Chat-V1-5"
# from...
```
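For anyone hitting the same tokenizer error: the "InternLM2Tokenizer could not be found" failure is tied to the transformers version, so a quick numeric version check before loading the model can save a confusing traceback. A minimal sketch (the helper name `meets_minimum` is mine, not from the repo):

```python
# Compare dotted version strings numerically; a plain string comparison
# would wrongly rank "4.9.0" above "4.37.0".
def meets_minimum(installed: str, minimum: str = "4.37.0") -> bool:
    to_tuple = lambda v: tuple(int(p) for p in v.split("."))
    return to_tuple(installed) >= to_tuple(minimum)

# The reporter's failing and working versions:
print(meets_minimum("4.36.2"))  # False: this version raises the tokenizer error
print(meets_minimum("4.37.0"))  # True
```

In a real script you would pass `transformers.__version__` as `installed` and raise early with an actionable message instead of printing.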
Could you add support for ollama deployment?
Thank you for this great work. We tried to reproduce the MathVista results with 1.2 plus, but the best we achieved was only 37%, far below the published 59.9%. Could you advise what we might be doing wrong?
Thank you so much for sharing this fantastic work. Is there any plan or timeline to release the latest version of internvl-chat? Hope to hear from you soon.
local chat code:
```
import torch
from PIL import Image
from transformers import AutoModel, CLIPImageProcessor
from transformers import AutoTokenizer

path = "OpenGVLab/InternVL-Chat-Chinese-V1-2"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    device_map='auto').eval()
tokenizer...
```
For fine-tuning V1.2-plus on grounding tasks, is the prompt template the following format: 'Please provide the bounding box coordinate of the region this sentence describes: XXX'? (This prompt is taken from the refcoco evaluation script https://github.com/OpenGVLab/InternVL/blob/main/internvl_chat/eval/refcoco/evaluate_grounding.py#L248C18-L248C104.) That is, should the data be formatted as follows:
```
{
  "id": 0,
  "image": "images/5.png",
  "conversations": [
    {
      "from": "human",
      "value": "\nPlease...
```
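For what it's worth, records like the one in the question can be generated programmatically. The sketch below uses the prompt template from evaluate_grounding.py; the `<image>` placeholder in the human turn and the `<ref>...</ref><box>[[x1, y1, x2, y2]]</box>` answer convention are assumptions to verify against your checkpoint's data format, not something confirmed here:

```python
# Build one grounding SFT record in the conversation format shown in the
# question. PROMPT matches the template in evaluate_grounding.py; the
# gpt-turn format is an assumption, not confirmed by the maintainers.
PROMPT = ("Please provide the bounding box coordinate of the region "
          "this sentence describes: {}")

def grounding_record(idx, image, phrase, box):
    x1, y1, x2, y2 = box
    return {
        "id": idx,
        "image": image,
        "conversations": [
            {"from": "human", "value": "<image>\n" + PROMPT.format(phrase)},
            {"from": "gpt",
             "value": f"<ref>{phrase}</ref><box>[[{x1}, {y1}, {x2}, {y2}]]</box>"},
        ],
    }

record = grounding_record(0, "images/5.png", "the red car", (10, 20, 110, 220))
```

Dump a list of such records with `json.dump` to produce the training JSON/JSONL file.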
I'm very interested in your work. The chapter on retrieval fine-tuning provides the corresponding script at https://github.com/OpenGVLab/InternVL/tree/main/internvl_g, which uses 32 A100 GPUs. May I ask how long it would take...
Hello, I would like to use InternVL within the langchain framework for dialogue with conversation memory:
```
path = "OpenGVLab/InternVL-Chat-Chinese-V1-1"
model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
    cache_dir="/project/ASD/jingyou_llm/model_cache",
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path)
pipe = pipeline(
    "visual-question-answering",
    model=model,
    tokenizer=tokenizer,
    max_length=100,
)
...
```
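It is doubtful that the generic "visual-question-answering" pipeline accepts a trust_remote_code InternVL model; a simpler route is to call the model's own chat method and keep the conversation memory yourself. A minimal, model-agnostic sketch where `ask_model` is a stand-in for the real model call (newer InternVL checkpoints also expose a `history` argument on `chat` that could replace this wrapper; check your checkpoint's modeling code):

```python
# Hypothetical memory wrapper: replay prior turns as plain text before
# each new question. `ask_model` stands in for the real VLM call.
class MemoryChat:
    def __init__(self, ask_model):
        self.ask_model = ask_model
        self.history = []  # list of (question, answer) pairs

    def ask(self, question: str) -> str:
        # Prepend all previous turns so the model sees the conversation.
        context = "".join(f"Q: {q}\nA: {a}\n" for q, a in self.history)
        answer = self.ask_model(context + f"Q: {question}\nA:")
        self.history.append((question, answer))
        return answer

# Stubbed usage so the example runs offline; replace the lambda with a
# closure over model.chat(tokenizer, image, prompt, ...).
chat = MemoryChat(lambda prompt: f"({len(prompt)} chars of context seen)")
chat.ask("What is in the image?")
chat.ask("And what color is it?")
```

The same `ask` callable can then be wrapped in a LangChain LLM interface if you need the rest of that framework.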
First of all, thank you very much for this very helpful work. I would like to ask how to deploy this model and expose it as an API...
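One dependency-free way to expose the model as an API is a small HTTP wrapper around the chat call. The sketch below uses only the Python standard library; `run_chat` is a placeholder you would replace with the actual model invocation, and none of these names come from the project's own serving code:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from threading import Thread

def run_chat(question: str) -> str:
    # Placeholder: swap in the real model call, e.g. something like
    # model.chat(tokenizer, pixel_values, question, ...) for InternVL.
    return f"echo: {question}"

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON body of the form {"question": "..."}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"answer": run_chat(payload["question"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo quiet; remove to restore default request logging.
        pass

def start_server(port: int = 8000) -> HTTPServer:
    server = HTTPServer(("127.0.0.1", port), ChatHandler)
    Thread(target=server.serve_forever, daemon=True).start()
    return server
```

POST `{"question": "..."}` to `/` and read back `{"answer": "..."}`. For production use, a proper serving stack (e.g. a FastAPI app, or lmdeploy, which supports InternVL) would be more appropriate than this single-threaded sketch.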