
[BUG] Mistral and Mixtral inference

Open Akash08naik opened this issue 1 year ago • 4 comments

Describe the bug
Mistral and Mixtral models are not able to infer. When I give the name of the model, as I do for other models, in the case of Mistral there is a KeyError raised from the configuration_auto.py file in the llm_analysis module. This is because there is no key for mistral in the config_map.

So could you also add all the models from Hugging Face that are not yet defined?

Akash08naik avatar Dec 28 '23 11:12 Akash08naik

Please try updating the transformers library: pip install -U transformers. If you are not using a local model JSON file as the model_config, llm-analysis relies on transformers to find the corresponding model configuration on the HF hub, meaning information about newer models only exists after a certain version of the transformers library.
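For reference, a minimal sketch of how to check whether your installed transformers version can resolve the Mistral config from the hub (the model id below is just an example; substitute the one you are analyzing):

```python
# Quick sanity check: if this raises a KeyError / "model type not recognized"
# error, the installed transformers version is too old for this architecture.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mistral-7B-v0.1")  # example model id
print(config.model_type)                             # should print "mistral"
print(config.num_hidden_layers, config.hidden_size)  # fields llm-analysis reads
```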

cli99 avatar Jan 01 '24 22:01 cli99

Yeah, that worked. Also, could you tell me how to use this for big models with more parameters? In my case, when I try to load a 34B-parameter model, the error I get is AssertionError: assert memory_left > 0. The model is too large to fit in total GPU memory.

Akash08naik avatar Jan 03 '24 08:01 Akash08naik

There are multiple ways to address OOM (a rough memory sketch follows below):

- use GPUs with larger memory (e.g. 80 GB);
- use more GPUs and apply tensor parallelism, or expert parallelism if it's a MoE model;
- use quantization (a different dtype rather than the default "w16a16e16").
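To see why the assertion fires, here is a back-of-the-envelope sketch of weight memory alone (this is not llm-analysis's exact accounting, which also budgets activations, KV cache, and framework overhead):

```python
# Approximate per-GPU weight memory; num_params_b is parameter count in billions,
# bytes_per_param is 2 for 16-bit weights ("w16"), 1 for 8-bit, 0.5 for 4-bit.
def weight_memory_gb(num_params_b: float, bytes_per_param: float, tp_size: int = 1) -> float:
    return num_params_b * 1e9 * bytes_per_param / tp_size / 1024**3

print(weight_memory_gb(34, 2))              # ~63 GB: 16-bit weights alone nearly fill an 80 GB GPU
print(weight_memory_gb(34, 2, tp_size=2))   # ~32 GB per GPU with tensor parallelism across 2 GPUs
print(weight_memory_gb(34, 1))              # ~32 GB with 8-bit weight quantization on one GPU
```

Any combination that brings the per-GPU total (weights plus activations and KV cache) under the GPU memory size will make memory_left positive again.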

I will add a Mistral example soon.

cli99 avatar Jan 03 '24 20:01 cli99

I have made changes to the code where the memory is checked and given it more headroom. Now the model runs. But will this change alter the results from normal?

Akash08naik avatar Jan 03 '24 22:01 Akash08naik