
[BUG/Help] model = AutoModel.from_pretrained("D:\\ChatGLM\\model\\2", trust_remote_code=True).cuda() exits silently without an error

Open zhans1099 opened this issue 1 year ago • 3 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

1. Running `web_demo.py` reaches `model = AutoModel.from_pretrained("D:\ChatGLM\model\2", trust_remote_code=True)` and then the process exits silently, with no error message.

2. Adding `.quantize(4).cuda()` does not help either.

3. Changing it to `model = AutoModel.from_pretrained("D:\ChatGLM\model\2", trust_remote_code=True, device='cuda')` instead raises: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 108.00 MiB. GPU 0 has a total capacty of 11.00 GiB of which 6.39 GiB is free. Of the allocated memory 3.06 GiB is allocated by PyTorch, and 1.83 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

4. `nvidia-smi` reports the following:

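Since the process dies with no Python traceback, the failure is likely happening in native code (a CUDA or C++ extension crash). One way to surface it is the standard-library `faulthandler` module, enabled either at the top of `web_demo.py` or by launching with `python -X faulthandler web_demo.py`. A minimal sketch:

```python
# Put this at the very top of web_demo.py, before importing torch or
# transformers, so a native crash dumps a Python traceback to stderr
# instead of exiting silently.
import faulthandler

faulthandler.enable()

# ... the rest of web_demo.py follows, e.g.:
# from transformers import AutoModel
# model = AutoModel.from_pretrained(r"D:\ChatGLM\model\2", trust_remote_code=True)
```

Note the raw-string path in the comment: `"D:\ChatGLM\model\2"` contains the escape sequences `\C` and `\2`, so an `r"..."` prefix (or doubled backslashes) is safer on Windows.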

Expected Behavior

No response

Steps To Reproduce

A fresh download and setup.

Environment

- OS: Windows 10
- Python: 3.11.3
- Transformers: 4.30.2
- PyTorch: 2.1.0+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :True

Anything else?

No response

zhans1099 avatar Nov 01 '23 07:11 zhans1099

model = AutoModel.from_pretrained("D:\ChatGLM\model\2", trust_remote_code=True).cuda() exits silently without an error
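For the `torch.cuda.OutOfMemoryError` in step 3, the error message itself suggests setting `max_split_size_mb` to reduce fragmentation. A minimal sketch; the value `128` is only an illustrative choice, and the variable must be set before PyTorch initializes CUDA:

```python
import os

# Must be set before importing torch (or at least before any CUDA call);
# otherwise the allocator config is ignored.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

# Then proceed as usual:
# import torch
# from transformers import AutoModel
# model = AutoModel.from_pretrained(r"D:\ChatGLM\model\2", trust_remote_code=True).cuda()
```

Setting it in the shell before launching (`set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` on Windows) works too. This only helps with fragmentation, though; it cannot make an 11 GiB card fit a model that genuinely needs more memory.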

zhans1099 avatar Nov 01 '23 07:11 zhans1099

I ran into the same problem. Have you solved it? What GPU do you have? Mine is an MX250.

dancruiser avatar Nov 09 '23 13:11 dancruiser

Try downloading the separate INT4-quantized model. In my case, the unquantized version occupies about 12.5 GB of VRAM after startup on Windows. An MX250 should not even attempt GPU inference here; run it on the CPU instead.
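That 12.5 GB figure is consistent with a back-of-the-envelope estimate of the weight memory alone, assuming roughly 6.2 billion parameters for ChatGLM2-6B (an approximation; activations and the KV cache add more on top):

```python
# Rough VRAM needed just to hold the weights, ignoring activations
# and KV cache. Parameter count is an approximation.
params = 6.2e9

fp16_gib = params * 2.0 / 2**30   # fp16: 2 bytes per weight
int4_gib = params * 0.5 / 2**30   # int4: 0.5 bytes per weight

print(f"fp16 weights: ~{fp16_gib:.1f} GiB")
print(f"int4 weights: ~{int4_gib:.1f} GiB")
```

This is why the INT4 model fits comfortably on an 11 GiB card while the full fp16 model is right at the limit once runtime overhead is included.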

dogvane avatar Nov 12 '23 14:11 dogvane