torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 432.00 MiB (GPU 2; 23.65 GiB total capacity; 20.88 GiB already allocated; 259.56 MiB free;

Open wallon-ai opened this issue 3 years ago • 6 comments


wallon-ai avatar Mar 16 '23 11:03 wallon-ai

You ran out of GPU memory. Could you describe your setup in more detail — what hardware you're using and what command you ran — so we can help resolve this?

satpalsr avatar Mar 16 '23 13:03 satpalsr

It'd be really cool if the minimum requirements of the model (size on disk for the dataset, VRAM requirements) were listed in the README; that would save a lot of people some time.

riatzukiza avatar Mar 16 '23 20:03 riatzukiza
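Until the README documents this, a rough back-of-the-envelope estimate is possible from the parameter count alone: GPT-NeoXT-Chat-Base-20B has about 20 billion parameters, so the weights by themselves (ignoring activations, KV cache, and optimizer state) already exceed a single 24 GiB card in fp16. A minimal sketch of that arithmetic (the 20e9 parameter count is an approximation, and `weight_memory_gib` is an illustrative helper, not project code):

```python
# Back-of-the-envelope weight memory for an N-parameter model.
# Ignores activations, KV cache, and framework overhead, so real usage is higher.
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

params = 20e9  # GPT-NeoXT-Chat-Base-20B, approximately
for name, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{name}: ~{weight_memory_gib(params, nbytes):.1f} GiB for weights")
```

This makes the OOM errors in this thread unsurprising: even in fp16 the weights need roughly 37 GiB, more than either the 23.65 GiB or 14.56 GiB GPUs reported above.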

You ran out of GPU memory. Could you describe your setup in more detail — what hardware you're using and what command you ran — so we can help resolve this?

(screenshot attached) batch_size=4

wallon-ai avatar Mar 17 '23 02:03 wallon-ai

It'd be really cool if the minimum requirements of the model (size on disk for the dataset, VRAM requirements) were listed in the README; that would save a lot of people some time.

That's a great idea. I'll put up a PR soon to document this.

csris avatar Mar 18 '23 05:03 csris

(OpenChatKit) root@aca2869c8358:~/OpenChatKit-main# python inference/bot.py
Loading /root/OpenChatKit-main/inference/../huggingface_models/GPT-NeoXT-Chat-Base-20B to cuda:0...
Traceback (most recent call last):
  File "/root/OpenChatKit-main/inference/bot.py", line 185, in <module>
    main()
  File "/root/OpenChatKit-main/inference/bot.py", line 181, in main
    ).cmdloop()
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/cmd.py", line 105, in cmdloop
    self.preloop()
  File "/root/OpenChatKit-main/inference/bot.py", line 64, in preloop
    self._model = ChatModel(self._model_name_or_path, self._gpu_id)
  File "/root/OpenChatKit-main/inference/bot.py", line 24, in __init__
    self._model.to(device)
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 989, in to
    return self._apply(convert)
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 664, in _apply
    param_applied = fn(param)
  File "/root/anaconda3/envs/OpenChatKit/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 288.00 MiB (GPU 0; 14.56 GiB total capacity; 13.86 GiB already allocated; 90.44 MiB free; 13.88 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

musicfish1973 avatar Mar 18 '23 08:03 musicfish1973
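The hint at the end of that traceback (setting `max_split_size_mb`) can be tried by setting `PYTORCH_CUDA_ALLOC_CONF` before PyTorch initializes CUDA. A minimal sketch follows; note that 128 MiB is an arbitrary starting value, not a recommendation from the OpenChatKit docs, and that this only mitigates allocator fragmentation — it cannot make a model fit whose weights exceed total VRAM:

```python
import os

# Must be set before torch initializes CUDA, so do it before `import torch`
# (or export the variable in the shell before running inference/bot.py).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

The equivalent shell form is `export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` before invoking the script.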

Same problem. Any idea how much memory it needs, or any way to reduce the memory usage? Thanks.

bohell avatar Mar 20 '23 02:03 bohell
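A common way to reduce memory with Hugging Face transformers models is to load in half precision and let the `accelerate` library shard layers across devices. Whether OpenChatKit's `inference/bot.py` exposes these options is an assumption — the sketch below only shows the generic keyword arguments one would pass to `from_pretrained`, not this project's API:

```python
# Generic sketch (not OpenChatKit-specific): keyword arguments commonly passed
# to transformers' AutoModelForCausalLM.from_pretrained to reduce memory use.
def low_memory_load_kwargs() -> dict:
    return {
        "torch_dtype": "float16",   # fp16 halves weight memory vs. fp32
        "low_cpu_mem_usage": True,  # stream weights instead of a full CPU copy
        "device_map": "auto",       # shard across GPUs/CPU (requires accelerate)
    }

print(low_memory_load_kwargs())
```

Even with all of these, a ~20B-parameter model still needs roughly 37 GiB for fp16 weights alone, so on smaller GPUs some layers will be offloaded to CPU and inference will be slow.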