CogVLM

Metadata

a state-of-the-art-level open visual language model | multimodal pretrained model

Readme
Issues

Results 84 CogVLM issues

Sort by recently updated

vLLM support

2 comments

### Feature request Could vLLM support be added? ### Motivation Could vLLM support be added? ### Your contribution Could vLLM support be added?

Error when loading the model

### System Info conda python=3.11 ### Who can help? _No response_ ### Information - [X] The official example scripts - [ ]...

Model types that openai_api.py can load

1 comment

openai_api.py fails to load the HF model cogvlm-grounding-generalist-v1.1. Also, can a SAT model that I fine-tuned myself be used with it? Traceback (most recent call last): File "/data/liuchx/CogVLM-main/openai_demo/openai_api.py", line 382, in <module> model = AutoModelForCausalLM.from_pretrained( File "/home/ubuntu/anaconda3/envs/cx_cogvlm/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 550, in from_pretrained model_class = get_class_from_dynamic_module( File "/home/ubuntu/anaconda3/envs/cx_cogvlm/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 501, in...

Why is the grounding capability unavailable in the cogvlm2 demo?

When using the cogvlm2 demo, the model still does not output bounding boxes with grounding position information, even when "(with grounding)" is appended to the end of the question as the prompt template describes. ![微信截图_20240619102355](https://github.com/THUDM/CogVLM/assets/69945923/ff5389a7-f7f3-416e-be14-71850e1a4c86) ![微信截图_20240619102420](https://github.com/THUDM/CogVLM/assets/69945923/01d3ce2e-28f1-4642-9546-4cc6a10cdcb6)

Batch inference

Does grounding-generalist support batch inference?

What input format should I use to run a visual grounding task with the model?

1 comment

I have not found a method that reliably makes the model output a bounding box in [x1,y1,x2,y2] form. Is the code used for the original evaluation still available?
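For readers hitting the same problem, here is a minimal sketch of pulling `[x1,y1,x2,y2]` groups out of a free-text reply. It is not the project's evaluation code: the sample reply, the regular expression, and the `extract_boxes` helper are hypothetical, and the exact format the model emits is an assumption.

```python
import re

# Hypothetical pattern: four non-negative integers inside one pair of
# square brackets, e.g. [12,34,56,78]. Whitespace around numbers is allowed.
BOX_PATTERN = re.compile(r"\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]")

def extract_boxes(reply):
    """Return every [x1,y1,x2,y2] group found in `reply` as a tuple of ints."""
    return [tuple(int(v) for v in m.groups()) for m in BOX_PATTERN.finditer(reply)]

# Made-up reply text, used only to exercise the parser.
reply = "The dog is at [[120,45,380,290]] and the ball is at [400,300,450,350]."
boxes = extract_boxes(reply)
print(boxes)
```

A reply with no bracketed group yields an empty list, which makes it easy to count how often a given prompt fails to produce a box.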

What is the difference between lm_head and transformer.word_embeddings in the SAT model?

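In the code, transformer.word_embeddings converts token ids to embeddings at the start, and at the end it is used to compute similarity against the token features for the output. What about lm_head? I could not find where it is actually used.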
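The behaviour described for transformer.word_embeddings matches the general pattern of tied input and output embeddings. A toy sketch of that pattern, next to an untied output projection, is below. All numbers and function names are made up for illustration, and whether the SAT checkpoint ties the two matrices or keeps lm_head separate is not confirmed here.

```python
# Toy vocabulary of 3 tokens with 2-dimensional embeddings (made-up numbers).
word_embeddings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

def embed(token_ids):
    """Input side: look up one embedding row per token id."""
    return [word_embeddings[t] for t in token_ids]

def tied_logits(hidden):
    """Output side with tied weights: dot product of the hidden state
    with every row of the same embedding table."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in word_embeddings]

def untied_logits(hidden, lm_head):
    """Output side with a separate (hypothetical) projection matrix
    that has the same shape as the embedding table."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in lm_head]

hidden = embed([2])[0]        # the embedding of token 2: [1.0, 1.0]
print(tied_logits(hidden))    # prints [1.0, 1.0, 2.0]
```

In the tied case a single table serves both directions, so a separate output matrix stored in a checkpoint would go unused on that code path.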

Demo is dead, Streamlit link is not accessible

### System Info ### Who can help? _No response_ ### Information - [ ] The official example scripts - [ ] My...

AayushSameerShah

After upgrading to vlm2, recognition of low-resolution images is noticeably worse, even worse than the old vlm

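### Feature request Please improve recognition of low-resolution images. ### Motivation As in the title. On the same image, Tongyi correctly identifies the key information, and vlm1 performs better than Tongyi. After upgrading to vlm2, much of that information is no longer recognized. ### Your contribution As in the title.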

KeyError: 'cogvlm-chat'

### System Info python 3.8, cuda11.8 ### Who can help? @zr ### Information - [X] The official example scripts - [ ] My...


About

language-model

pretrained-models

cross-modality

multi-modal

visual-language-models

5.9k

Stars

407

Forks

