Video-ChatGPT

How to download the ready LLaVA-Lightening-7B weights

SIGMIND opened this issue 10 months ago · 5 comments

As mentioned in the offline demo README, "Alternatively you can download the ready LLaVA-Lightening-7B weights from mmaaz60/LLaVA-Lightening-7B-v1-1." The Hugging Face repo has files named pytorch_model-00001-of-00002.bin and pytorch_model-00002-of-00002.bin. Should I convert the model to GGUF format to use it with the offline demo?

SIGMIND · Apr 17 '24

Hi @SIGMIND,

No conversion is required; you can clone it directly from Hugging Face:

git lfs install
git clone https://huggingface.co/mmaaz60/LLaVA-7B-Lightening-v1-1
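
Before moving on, it is worth confirming that Git LFS actually fetched the shards: without git-lfs installed, the clone leaves tiny text pointer files in place of the multi-GB .bin shards, and loading fails later. A minimal check, assuming you cloned into the current directory:

# Each shard should be several GB; LFS pointer files are only ~130 bytes
ls -lh LLaVA-7B-Lightening-v1-1/pytorch_model-*.bin

# If they look like pointers, fetch the real files
cd LLaVA-7B-Lightening-v1-1 && git lfs pull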

Then, download the projection weights:

git clone https://huggingface.co/MBZUAI/Video-ChatGPT-7B
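
The projection checkpoint is also stored via Git LFS, so the same pointer-file caveat applies (a minimal check, assuming you cloned into the current directory):

# video_chatgpt-7B.bin should be far larger than a ~130-byte LFS pointer
ls -lh Video-ChatGPT-7B/video_chatgpt-7B.bin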

Finally, you should be able to run the demo with:

python video_chatgpt/demo/video_demo.py \
        --model-name LLaVA-7B-Lightening-v1-1 \
        --projection_path Video-ChatGPT-7B/video_chatgpt-7B.bin
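
If you cloned the two repos somewhere other than the working directory, absolute paths should work as well. A sketch with hypothetical /path/to placeholders (this assumes the script forwards --model-name to Hugging Face's from_pretrained, which accepts local directories; the absolute --projection_path is confirmed working in the log further down this thread):

python video_chatgpt/demo/video_demo.py \
        --model-name /path/to/LLaVA-7B-Lightening-v1-1 \
        --projection_path /path/to/Video-ChatGPT-7B/video_chatgpt-7B.bin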

I hope this helps. Let me know if you have any questions. Thanks!

mmaaz60 · Apr 17 '24

Thanks, the steps helped me move forward with the models. However, is there any specific GPU requirement for running this locally? I have tried to run it on an RTX 2060 but am getting the error below:

python video_chatgpt/demo/video_demo.py
2024-04-18 17:52:45 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, model_name='LLaVA-7B-Lightening-v1-1', vision_tower_name='openai/clip-vit-large-patch14', conv_mode='video-chatgpt_v1', projection_path='/mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin')
You are using a model of type llava to instantiate a model of type VideoChatGPT. This is not supported for all configurations of models and can yield errors.
Loading checkpoint shards: 100%|██████████| 2/2 [08:46<00:00, 263.40s/it]
preprocessor_config.json: 100%|██████████| 316/316 [00:00<00:00, 1.74MB/s]
2024-04-18 18:01:57 | INFO | stdout | Loading weights from /mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin
2024-04-18 18:02:24 | INFO | stdout | Weights loaded from /mnt/sdc1/Video-ChatGPT/Video-ChatGPT-7B/video_chatgpt-7B.bin
2024-04-18 18:02:24 | ERROR | stderr | Traceback (most recent call last):
2024-04-18 18:02:24 | ERROR | stderr |   File "/mnt/sdc1/Video-ChatGPT/video_chatgpt/demo/video_demo.py", line 264, in <module>
2024-04-18 18:02:24 | ERROR | stderr |     initialize_model(args.model_name, args.projection_path)
2024-04-18 18:02:24 | ERROR | stderr |   File "/mnt/sdc1/Video-ChatGPT/video_chatgpt/eval/model_utils.py", line 131, in initialize_model
2024-04-18 18:02:24 | ERROR | stderr |     model = model.cuda()
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in cuda
2024-04-18 18:02:24 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     module._apply(fn)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 797, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     module._apply(fn)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 820, in _apply
2024-04-18 18:02:24 | ERROR | stderr |     param_applied = fn(param)
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/nn/modules/module.py", line 905, in <lambda>
2024-04-18 18:02:24 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-18 18:02:24 | ERROR | stderr |   File "/home/sig/Downloads/[/mnt/DockerRuntime/miniconda]/envs/video_chatgpt/lib/python3.10/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
2024-04-18 18:02:24 | ERROR | stderr |     torch._C._cuda_init()
2024-04-18 18:02:24 | ERROR | stderr | RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW

SIGMIND · Apr 18 '24

It seems that the last line of your logs indicates a driver issue: Error 804 ("forward compatibility was attempted on non supported HW") typically means the installed NVIDIA driver is out of sync with the CUDA runtime your PyTorch build expects, so CUDA cannot initialize at all. Updating the driver (and rebooting) usually resolves it.
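
Two quick checks can confirm a driver problem (a minimal sketch; it assumes nvidia-smi is on the PATH and the video_chatgpt conda environment is active):

# Driver version and the highest CUDA version that driver supports
nvidia-smi

# Whether this PyTorch build can initialize CUDA at all
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"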

biphobe · Apr 25 '24

Understood, and that is now resolved. But how much GPU memory is required to run it offline? I have a 12 GB RTX 2060 and I am getting this error:

2024-04-28 15:27:44 | ERROR | stderr |     return self._apply(lambda t: t.cuda(device))
2024-04-28 15:27:44 | ERROR | stderr | torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB (GPU 0; 11.73 GiB total capacity; 11.26 GiB already allocated; 26.88 MiB free; 11.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
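
For rough sizing: a 7B-parameter model in half precision needs about 7e9 × 2 bytes ≈ 14 GB for the weights alone, before counting the CLIP vision tower and activations, so a 12 GB card is tight if the whole model is moved to the GPU. The error message itself suggests an allocator workaround for fragmentation; a minimal sketch of applying it (the 128 MB value is an arbitrary starting point, and it only helps when the failure is fragmentation rather than a genuine shortfall):

# Allocator tuning suggested by the error message itself
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
python video_chatgpt/demo/video_demo.py \
        --model-name LLaVA-7B-Lightening-v1-1 \
        --projection_path Video-ChatGPT-7B/video_chatgpt-7B.bin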

SIGMIND · Apr 28 '24

I've run the model locally on an RTX 2070 SUPER successfully, and I've also run it in the cloud with no issues.

Your problem seems related to your setup. Try closing every app on your system and then run the model; see the check below. In my case, during my initial local attempt, the browser was reserving GPU memory and caused the error you just mentioned.
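
To see what is holding GPU memory before launching (assuming nvidia-smi is available):

# The process table at the bottom lists every process with memory on the GPU,
# including browsers and the desktop compositor
nvidia-smi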

biphobe · May 01 '24