
How to use the Vicuna model as the text decoder

Open minjung98 opened this issue 1 year ago • 2 comments

Hello,

Your project looks really interesting. I have a question about running sh playground/merlin/clip-large+conv+vicuna-v15-7b/pretrain.sh. The script contains --model_name_or_path /path/models--lmsys--vicuna-7b-v15 \. If I want to use lmsys/vicuna-7b-v15 as the text decoder, do I need to download the model manually, place it in a specific directory, and change the path to match? Should I download all of the files, as shown in the screenshot below?

[Screenshot: 2024-06-25, 1:23:45 AM]

I would appreciate a guide on how to set up the vicuna-7b-v15 model.
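The manual download asked about here could be sketched as follows. This is a minimal sketch, not the project's documented procedure. It assumes the huggingface_hub package is installed, that the Hub repository id is lmsys/vicuna-7b-v1.5 (the script spells the folder vicuna-7b-v15), and that /path is the same placeholder root used in pretrain.sh. The helper names are hypothetical.

```python
from pathlib import Path


def cache_style_dirname(repo_id: str) -> str:
    """Build a 'models--org--name' folder name like the one in pretrain.sh."""
    return "models--" + repo_id.replace("/", "--")


def download_decoder(repo_id: str, root: str, dirname: str) -> Path:
    """Download every file of the repo into root/dirname and return that path.

    Imported lazily so the naming helper works without huggingface_hub.
    """
    from huggingface_hub import snapshot_download

    target = Path(root) / dirname
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


if __name__ == "__main__":
    # Assumption: the Hub id uses "v1.5"; the script's folder name uses "v15".
    hub_repo_id = "lmsys/vicuna-7b-v1.5"
    folder = cache_style_dirname("lmsys/vicuna-7b-v15")
    # "/path" is the placeholder root from pretrain.sh; replace it with a real directory.
    local_path = download_decoder(hub_repo_id, "/path", folder)
    print(f"Pass this to the script: --model_name_or_path {local_path}")
```

The printed directory is the value to pass to --model_name_or_path in pretrain.sh.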

Also, which CUDA version is required to run this script? I first got an error saying that the libcudart.so.12 file was missing, so I set up the environment with CUDA 12.0, but then I got an error saying that the version did not match flash-attn. When I switched to CUDA 11.7, the missing libcudart.so.12 error came back.

Thank you.

minjung98 · Jun 24 '24 16:06

Thanks for your interest. Yes, you need to download all of the vicuna-v15 files. The required CUDA version is 11.8.
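To confirm that an environment matches the CUDA 11.8 requirement, a check along these lines may help. It is a minimal sketch that assumes PyTorch is installed. torch.version.cuda reports the CUDA version the installed PyTorch build was compiled against, and the helper names are hypothetical.

```python
from typing import Optional, Tuple


def cuda_major_minor(version: str) -> Tuple[int, int]:
    """Parse '11.8' or '11.8.89' into (11, 8); a bare '12' becomes (12, 0)."""
    parts = (version.split(".") + ["0"])[:2]
    return int(parts[0]), int(parts[1])


def matches_required(found: Optional[str], required: str = "11.8") -> bool:
    """Return True when the found CUDA version has the required major.minor."""
    if found is None:  # CPU-only PyTorch builds report no CUDA version
        return False
    return cuda_major_minor(found) == cuda_major_minor(required)


if __name__ == "__main__":
    import torch  # imported here so the helpers above work without PyTorch

    found = torch.version.cuda
    print(f"PyTorch was built against CUDA: {found}")
    print(f"Matches 11.8: {matches_required(found)}")
```

If the check passes and libcudart.so.12 is still reported missing, one possible cause is that another installed package, such as a prebuilt flash-attn wheel, was compiled against CUDA 12.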

Ahnsun · Jun 26 '24 08:06

Thank you for your kind response. I'll try again with CUDA 11.8.

minjung98 · Jun 26 '24 08:06