
Unable to load HF model

Open lea11100 opened this issue 1 year ago • 3 comments

Hey,

currently, I receive the following error when I try to load a model from Hugging Face:

AttributeError: 'NoneType' object has no attribute 'cuda'

I ran the following code:

prompter = Prompt().load_model("nvidia/Llama3-ChatQA-1.5-8B", temperature=0.0, sample=False, from_hf=True)

lea11100 avatar Jul 04 '24 19:07 lea11100

@lea11100 - let me look into this - will revert back. Sorry you ran into this issue. :)

doberst avatar Jul 04 '24 20:07 doberst

@lea11100 - thanks for raising this - it looks like a bug in the code. We recently shifted to dynamically importing torch only when needed - and in this code path, torch is not getting loaded, and that is creating the error. I get the same error. Will prioritize fixing this - should be merged into the main branch by tomorrow if you pull - or will be in the next pypi release. Will update you once done.

doberst avatar Jul 04 '24 20:07 doberst

@lea11100 - the fix has been merged into the main branch, if you are cloning/pulling from the repo directly. If you prefer pip install, then it will be in llmware==0.3.3 (which should be available by tomorrow). A couple of quick tips:


# Option #1 - load the model from_hf directly into Prompt

prompter = Prompt().load_model("nvidia/Llama3-ChatQA-1.5-8B", temperature=0.0, sample=False, from_hf=True)

# set the prompter wrapper to 'llama_3_chat' after loading the model
prompter.llm_model.prompt_wrapper = "llama_3_chat"

# Option #2 - alternate (and recommended) approach

from llmware.models import ModelCatalog
from llmware.prompts import Prompt

#  register the hf model in the ModelCatalog 
ModelCatalog().register_new_hf_generative_model("nvidia/Llama3-ChatQA-1.5-8B",
                                                llmware_lookup_name="my_nvidia_llama3",
                                                context_window=8192, prompt_wrapper="llama_3_chat")

# then load the model using only your registered short name
prompter = Prompt().load_model("my_nvidia_llama3")

Hope this solves the issue - please ping back if any ongoing issues! :)

doberst avatar Jul 06 '24 15:07 doberst