Unable to load HF model
Hey,
currently, I receive the following error when I try to load a model from HuggingFace:
AttributeError: 'NoneType' object has no attribute 'cuda'
I ran the following code:
prompter = Prompt().load_model("nvidia/Llama3-ChatQA-1.5-8B", temperature=0.0, sample=False, from_hf=True)
@lea11100 - let me look into this - will revert back. Sorry you ran into this issue. :)
@lea11100 - thanks for raising this - it looks like a bug in the code. We recently shifted to dynamically importing torch only when needed, and in this code path torch is not getting loaded, which is what triggers the error. I get the same error. Will prioritize fixing this - it should be merged into the main branch by tomorrow if you pull, or it will be in the next PyPI release. Will update you once done.
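To illustrate the root cause, here is a minimal sketch of the lazy-import pattern involved - the names below (_load_torch, load_hf_model) are hypothetical and do not mirror llmware's actual internals:

# illustrative sketch only - names are hypothetical, not llmware's real module layout
torch = None   # module-level placeholder, filled in only when a code path imports it

def _load_torch():
    global torch
    if torch is None:
        import torch as _torch
        torch = _torch

def load_hf_model(model_name):
    # the from_hf path skipped the lazy import, leaving `torch` as None,
    # so torch.cuda.is_available() raised:
    #   AttributeError: 'NoneType' object has no attribute 'cuda'
    _load_torch()  # the fix is to make sure torch gets imported on this path too
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return model_name, device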
@lea11100 - the fix has been merged into the main branch, if you are cloning/pulling from the repo directly. If you prefer pip install, it will be in llmware==0.3.3 (which should be available by tomorrow). A couple of quick tips:
# Option #1 - load the model from_hf directly into Prompt
from llmware.prompts import Prompt

prompter = Prompt().load_model("nvidia/Llama3-ChatQA-1.5-8B", temperature=0.0, sample=False, from_hf=True)

# set the prompt wrapper to 'llama_3_chat' after loading the model
prompter.llm_model.prompt_wrapper = "llama_3_chat"
# Option #2 - alternate (and recommended) approach
from llmware.models import ModelCatalog
from llmware.prompts import Prompt

# register the hf model in the ModelCatalog
ModelCatalog().register_new_hf_generative_model("nvidia/Llama3-ChatQA-1.5-8B",
                                                 llmware_lookup_name="my_nvidia_llama3",
                                                 context_window=8192, prompt_wrapper="llama_3_chat")

# then load the model using only your registered short name
prompter = Prompt().load_model("my_nvidia_llama3")
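Once the model is loaded by either option, a quick test inference looks like the sketch below - note that prompt_main and the "llm_response" response key are my assumptions about the Prompt interface, so adjust to the llmware version you are running:

# run a quick test inference with the loaded model (interface details assumed)
response = prompter.prompt_main("What is the capital of France?")
print(response["llm_response"])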
Hope this solves the issue - please ping back if you run into any further issues! :)