
Keep getting an error when making a mixtral model

A500000 opened this issue 1 year ago • 5 comments

Whenever I make a Mixtral model using two Llama 2 13B models, I get the following error message:

```
Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "H:\mergekit\Scripts\mergekit-moe.exe\__main__.py", line 7, in <module>
  File "H:\mergekit\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "H:\mergekit\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "H:\mergekit\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "H:\mergekit\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "H:\mergekit\mergekit\options.py", line 59, in wrapper
    f(*args, **kwargs)
  File "H:\mergekit\mergekit\scripts\mixtral_moe.py", line 395, in main
    build(
  File "H:\mergekit\mergekit\scripts\mixtral_moe.py", line 318, in build
    gate_vecs = get_gate_params(
  File "H:\mergekit\mergekit\scripts\mixtral_moe.py", line 135, in get_gate_params
    model = AutoModelForCausalLM.from_pretrained(
  File "H:\mergekit\lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "H:\mergekit\lib\site-packages\transformers\modeling_utils.py", line 3706, in from_pretrained
    ) = cls._load_pretrained_model(
  File "H:\mergekit\lib\site-packages\transformers\modeling_utils.py", line 3836, in _load_pretrained_model
    raise ValueError(
ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.
```

I have tried to edit modeling_utils.py, to no avail. Does anyone know how I can fix this?

A500000 · Jan 17 '24 23:01

I searched online, and it seems the issue may be that the model you're loading is too large for the memory you have available. Perhaps you can try adding the parameter offload_folder='offload_folder' to the AutoModelForCausalLM.from_pretrained call.
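For reference, here's a minimal sketch of what that change could look like (this is not mergekit's exact call; `model_path` is a placeholder for your base model):

```python
from transformers import AutoModelForCausalLM

model_path = "path/to/base-model"  # placeholder: the base model being loaded

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",                # let accelerate split weights across GPU/CPU/disk
    offload_folder="offload_folder",  # directory for weights that spill to disk
)
```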

I can share some websites about this problem, but they are in Chinese: https://zhuanlan.zhihu.com/p/647755430 https://zhuanlan.zhihu.com/p/605640431 https://blog.csdn.net/qq_40302568/article/details/135028085

ZhangEnmao · Jan 18 '24 02:01

Which file is "AutoModelForCausalLM.from_pretrained" in?

A500000 · Jan 18 '24 03:01

In mixtral_moe.py, bro.

ZhangEnmao · Jan 18 '24 03:01

thanks

A500000 · Jan 18 '24 03:01

This error means you don't have enough VRAM to run inference on the base model. There are a couple of things you can try to get around it. Adding offload_folder='offload_folder' might work. I'd recommend trying the --load-in-8bit or --load-in-4bit flags. You might also be able to use --device cpu, but that will be very slow.
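For example, the invocation might look something like this (assuming your MoE config is named config.yml and ./output-model is the output directory):

```sh
mergekit-moe config.yml ./output-model --load-in-8bit
# or, if 8-bit still doesn't fit in VRAM:
mergekit-moe config.yml ./output-model --load-in-4bit
```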

cg123 · Jan 18 '24 22:01