wenda
Document dialogue mode runs out of VRAM as soon as it starts, on a V100 32G; it failed several times.
Input length of input_ids is 15549, but max_length is set to 1090. This can lead to unexpected behavior. You should consider increasing max_new_tokens.
Error: CUDA out of memory. Tried to allocate 14.41 GiB (GPU 0; 31.75 GiB total capacity; 27.78 GiB already allocated; 3.01 GiB free; 27.87 GiB reserved in total by PyTorch). If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
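The error message itself suggests one mitigation: capping the CUDA caching allocator's split size to reduce fragmentation. A minimal sketch of applying it before launching wenda (128 MB is an assumed starting value, not a wenda recommendation; tune it for your workload):

```shell
# Cap the allocator's split block size to reduce fragmentation, as the
# PyTorch OOM message suggests. Set this in the environment before launch.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"
```

This only helps when reserved memory is much larger than allocated memory; it does not shrink the model or the prompt itself.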
+1, chatgpt4 and chatglm2-6b support 32k tokens, but wenda's setting ranges from 0 to 4096, and the default is just 2048.
https://github.com/wenda-LLM/wenda/commit/f6c29f77312ab1093344d8ad85e0659d0b788281
The token limit can be set up to 10k; please check your settings.
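The length warning above (input_ids of 15549 vs. a max_length of 1090) is the usual failure mode when retrieved document chunks push the prompt past the model's window. A minimal sketch of keeping the prompt within budget (a hypothetical helper, not wenda's actual code; the model window and new-token budget are assumed values):

```python
def truncate_prompt(input_ids, model_max_len, max_new_tokens):
    """Trim the prompt so prompt tokens + generated tokens fit the model window."""
    budget = model_max_len - max_new_tokens
    if len(input_ids) > budget:
        # Keep the tail of the prompt, which usually holds the user's question;
        # the oldest retrieved context is dropped first.
        input_ids = input_ids[-budget:]
    return input_ids

# Example: a 15549-token prompt against an assumed 8192-token window
trimmed = truncate_prompt(list(range(15549)), model_max_len=8192, max_new_tokens=512)
print(len(trimmed))  # 7680 tokens, leaving room for 512 new tokens
```

Trimming this way trades recall (dropped chunks) for not crashing; raising wenda's token limit, as suggested above, is the other half of the fix.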
Sharing my environment for reference: 4090 card, chatglm2-6b fp16 (about 15 GB used after loading the model). Document: 14 MB PDF, 50 pages (the dialogue uses 21.5 GB and works normally; raise max token above 10000 to avoid getting no result).
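The ~15 GB figure above is close to what fp16 weights alone predict; a back-of-envelope check (the 6.2e9 parameter count is an assumption read off the "6b" in the model name, and activations plus KV cache account for the rest):

```python
# Rough VRAM estimate for fp16 model weights: params * 2 bytes.
params = 6.2e9          # assumed parameter count for a "6b" model
bytes_per_param = 2     # fp16
weights_gb = params * bytes_per_param / 1024**3
print(f"{weights_gb:.1f} GB")  # weights only; KV cache and activations add more
```

The gap between this estimate and the observed 21.5 GB during dialogue is the KV cache growing with prompt length, which is why long document prompts are what trigger the OOM.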