wenda
Document dialogue mode runs out of VRAM as soon as it starts, on a V100 32G; it failed several times.
Input length of input_ids is 15549, but max_length is set to 1090. This can lead to unexpected behavior. You should consider increasing max_new_tokens.
Error: CUDA out of memory. Tried to allocate 14.41 GiB (GPU 0; 31.75 GiB total capacity; 27.78 GiB already allocated; 3.01 GiB free; 27.87 GiB reserved in total by PyTorch). If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
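The error message itself suggests one mitigation: capping the CUDA caching allocator's split size to reduce fragmentation. A minimal sketch of applying it before launching wenda (128 MB is an assumed starting value, not a wenda recommendation; tune it for your workload):

```shell
# Cap the allocator's split block size to reduce fragmentation, as the
# PyTorch OOM message suggests. Set this in the environment before launch.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

This only helps when reserved memory is much larger than allocated memory; it does not shrink the model or the prompt itself.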
+1, chatgpt4 and chatglm2-6b support 32k tokens, but wenda's setting ranges from 0 to 4096, and the default is just 2048.
https://github.com/wenda-LLM/wenda/commit/f6c29f77312ab1093344d8ad85e0659d0b788281
The token limit can be set up to 10k; please check your settings.
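The length warning above (input_ids of 15549 vs. a max_length of 1090) is the usual failure mode when retrieved document chunks push the prompt past the model's window. A minimal sketch of keeping the prompt within budget (a hypothetical helper, not wenda's actual code; the model window and new-token budget are assumed values):

```python
def truncate_prompt(input_ids, model_max_len, max_new_tokens):
    """Trim the prompt so prompt tokens + generated tokens fit the model window."""
    budget = model_max_len - max_new_tokens
    if len(input_ids) > budget:
        # Keep the tail of the prompt, which usually holds the user's question;
        # the oldest retrieved context is dropped first.
        input_ids = input_ids[-budget:]
    return input_ids

# Example: a 15549-token prompt against an assumed 8192-token window
trimmed = truncate_prompt(list(range(15549)), model_max_len=8192, max_new_tokens=512)
print(len(trimmed))  # 7680 tokens, leaving room for 512 new tokens
```

Trimming this way trades recall (dropped chunks) for not crashing; raising wenda's token limit, as suggested above, is the other half of the fix.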
Sharing my environment for reference: 4090 card, chatglm2-6b fp16 (about 15 GB used after loading the model). Document: 14 MB PDF, 50 pages (the dialogue uses 21.5 GB and works normally; raise max token above 10000 to avoid getting no result).
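The ~15 GB figure above is close to what fp16 weights alone predict; a back-of-envelope check (the 6.2e9 parameter count is an assumption read off the "6b" in the model name, and activations plus KV cache account for the rest):

```python
# Rough VRAM estimate for fp16 model weights: params * 2 bytes.
params = 6.2e9          # assumed parameter count for a "6b" model
bytes_per_param = 2     # fp16
weights_gb = params * bytes_per_param / 1024**3
print(f"{weights_gb:.1f} GB")  # weights only; KV cache and activations add more
```

The gap between this estimate and the observed 21.5 GB during dialogue is the KV cache growing with prompt length, which is why long document prompts are what trigger the OOM.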