CogVLM2 issues

CogVLM2: the cogvlm2-llama3-chinese-chat-19B model fails at runtime when loaded in 4-bit on Windows 11 x64.

4

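### System Info The environment is as follows: Windows 11 x64, Python 3.11.9, CUDA 12.1. The key dependencies (Torch, torchvision, xformers, transformers, chainlit) were installed exactly as listed in the official requirements.txt. Later, following system prompts, I additionally installed einops-0.8.0, triton-2.1.0, accelerate-0.30.1, and psutil-5.9.8. System environment variables: CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1 CUDA_VISIBLE_DEVICES=0. To load the model with 4-bit quantization, I changed the model-loading parameters in the web_demo.py script as follows. Original script: from transformers import AutoModelForCausalLM, AutoTokenizer,...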

wikeeyang
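
For context on the report above, here is a minimal sketch of what 4-bit loading with transformers and bitsandbytes typically looks like. It is not the reporter's modified web_demo.py (that excerpt is truncated). The `MODEL_PATH` value and the helper names `load_kwargs` and `load_model` are assumptions made for illustration, and calling `load_model` requires a CUDA GPU with `bitsandbytes` installed.

```python
MODEL_PATH = "THUDM/cogvlm2-llama3-chinese-chat-19B"  # assumed Hugging Face model id


def load_kwargs(quant_bits=None):
    """Build plain from_pretrained keyword arguments (hypothetical helper)."""
    if quant_bits not in (None, 4, 8):
        raise ValueError("quant_bits must be None, 4, or 8")
    # The model ships custom modeling code, so remote code must be trusted.
    kwargs = {"trust_remote_code": True}
    if quant_bits is not None:
        # low_cpu_mem_usage reduces peak host RAM while loading.
        kwargs["low_cpu_mem_usage"] = True
    return kwargs


def load_model(quant_bits=4):
    """Load tokenizer and model, optionally quantized (hypothetical helper)."""
    # Imported lazily so load_kwargs works without torch installed.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    kwargs = load_kwargs(quant_bits)
    if quant_bits == 4:
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_4bit=True)
    elif quant_bits == 8:
        kwargs["quantization_config"] = BitsAndBytesConfig(load_in_8bit=True)

    tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH, torch_dtype=torch.bfloat16, **kwargs
    ).eval()
    return tokenizer, model
```

Passing the quantization settings through `quantization_config` keeps the rest of the loading call identical between the quantized and unquantized paths, which makes it easier to tell whether an error comes from quantization or from the environment.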

Support vLLM inference acceleration

### Feature request I hope [CogVLM2](https://github.com/THUDM/CogVLM2) can support vLLM inference acceleration. ### Motivation Faster inference. ### Your contribution https://github.com/vllm-project/vllm

justStarG

For text-heavy images, I want OCR-free text extraction and table reconstruction. Is fine-tuning required to get good results?

15

### Feature request The current pretrained model handles simple images, but the results are poor when an image contains a lot of text. ### Motivation .... ### Your contribution ...

chopin1998

[FEATURE] GGUF variant?

2

### Feature request Please create a GGUF variant since this is the de facto standard for running models locally. ### Motivation GGUF will make the model more...

AdaptiveStep

ERROR: Cannot install -r basic_demo/requirements.txt (line 7) and uvicorn>=0.29.0 because these package versions have conflicting dependencies.

2

### System Info A new environment created with conda (python=3.11, CUDA=12.1) reports the error above QAQ ### Who can help? _No response_ ### Information - [ ] The official example scripts - [ ]...

Xiongjiba

Wondering whether CogVLM2 supports SFT for multi-image QA in a sample

3

### Feature request Hi, CogVLM2 team. Thank you for your brilliant work and this neat and easy-to-follow codebase. This morning, I've read through this repo quickly, and I...

Sprinter1999

error with multi_gpus inference

3

### System Info Hi, I have 4 3090 GPUs. device_map = infer_auto_device_map( model=model, max_memory={i: "12GiB" for i in range(torch.cuda.device_count())}, # set 23GiB for each GPU, depends on your...

GordonDongZHAO
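
The snippet in the report above builds a per-GPU `max_memory` mapping for `infer_auto_device_map`. As a rough sketch of that one step, the hypothetical helper below (`build_max_memory` is not part of the CogVLM2 repo or of accelerate) constructs the same kind of mapping in plain Python, so the per-device budget can be checked without loading a model. The optional `"cpu"` entry is an assumption about how one might allow offloading.

```python
def build_max_memory(num_gpus, per_gpu_gib, cpu_gib=None):
    """Return a max_memory mapping keyed by GPU index (hypothetical helper)."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if per_gpu_gib <= 0:
        raise ValueError("per_gpu_gib must be positive")

    # One entry per visible GPU, e.g. {0: "12GiB", 1: "12GiB", ...}.
    max_memory = {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}

    # Optionally allow layers that do not fit on the GPUs to spill to host RAM.
    if cpu_gib is not None:
        max_memory["cpu"] = f"{cpu_gib}GiB"
    return max_memory


# Four GPUs capped at 12 GiB each, as in the report.
print(build_max_memory(4, 12))
```

The per-GPU cap is usually set below the card's physical memory to leave headroom for activations, so the right value depends on the GPU and on input size.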

Can it handle key information extraction from table-like images with complex, varied layouts?

2

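### Feature request Similar to the image below (from an open dataset): the demo returns a few key-value pairs, but if I want all of the key information, do I need to fine-tune on a specific dataset? This example has only a simple layout. ### Motivation None ### Your contribution None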

kkiskkk

The provided CogVLM2/basic_demo/openai_api_demo.py produces no streaming output

### System Info cuda12.1 torch2.3.0 ### Who can help? @zr ### Information - [X] The official example scripts - [ ] My...

LRHstudy

How good is CogVLM2's coordinate grounding ability?

2

Does CogVLM2 still have strong coordinate grounding ability, like CogVLM and CogAgent? I have not found any related technical documentation or usage examples.

fang-h
