Can't use Hugging Face models
The bug
I'm hitting this error across several Hugging Face models:
NameError                                 Traceback (most recent call last)
<ipython-input-31-1dd03f6eb8ca> in <cell line: 20>()
     18         return '<|end|>'
     19
---> 20 guidance.llm = StarcoderChat(model_path)
     21
     22 prompt = guidance('''

2 frames
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2606             init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
   2607         elif load_in_8bit or low_cpu_mem_usage:
-> 2608             init_contexts.append(init_empty_weights())
   2609
   2610         with ContextManagers(init_contexts):

NameError: name 'init_empty_weights' is not defined
To Reproduce
Give a full working code snippet that can be pasted into a notebook cell or python file. Make sure to include the LLM load step so we know which model you are using.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import guidance

model_path = "HuggingFaceH4/starchat-alpha"

class StarcoderChat(guidance.llms.Transformers):
    def __init__(self, model_path, **kwargs):
        tokenizer = AutoTokenizer.from_pretrained(model_path, device_map='auto')
        model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', torch_dtype=torch.bfloat16)
        super().__init__(model, tokenizer=tokenizer, device_map='auto', **kwargs)

    @staticmethod
    def role_start(role):
        return "<|" + role + "|>"

    @staticmethod
    def role_end(role):
        return '<|end|>'
guidance.llm = StarcoderChat(model_path)
prompt = guidance('''
{{#system~}}
You are a helpful and terse assistant.
{{~/system}}
{{#user~}}
How do you print something in python?
{{~/user}}
{{#assistant~}}
{{gen 'answer'}}
{{~/assistant}}''')
prompt()
System info (please complete the following information):
- OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): Google Colab
- Guidance Version (guidance.__version__):
It looks like you are using DeepSpeed. We are working on Guidance support for DeepSpeed, but we are waiting on some DeepSpeed updates before we can support it there.
@slundberg I faced the same issue with a generic HF model as well, e.g. TheBloke/wizard-vicuna-13B-GGML.
device_map="auto" in the from_pretrained is the culprit here -- there is no error if I remove that argument.
Note also that the error appears when device_map is set to any of the supported options ("auto", "balanced", "balanced_low_0", "sequential"); see the sketch below.
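A minimal sketch of that observation, assuming an environment where accelerate is missing (init_empty_weights is defined in accelerate, and transformers imports it only when the package is available); "gpt2" is just an arbitrary small stand-in checkpoint:

from transformers import AutoModelForCausalLM

# Sketch: every supported device_map value routes from_pretrained through the
# same accelerate-backed empty-weights init path, so each one should surface
# the same NameError when accelerate is absent.
for dm in ("auto", "balanced", "balanced_low_0", "sequential"):
    try:
        AutoModelForCausalLM.from_pretrained("gpt2", device_map=dm)
        print(dm, "-> loaded without error")
    except Exception as e:
        print(dm, "->", type(e).__name__, ":", e)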
import guidance  # missing from the original snippet; needed for guidance.llms.Transformers below
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_id = "google/flan-t5-small"

flan_t5_tokenizer = T5Tokenizer.from_pretrained(model_id)
# WORKS: flan_t5_model = T5ForConditionalGeneration.from_pretrained(model_id)
flan_t5_model = T5ForConditionalGeneration.from_pretrained(model_id, device_map="auto")

flan_t5_guidance = guidance.llms.Transformers(model=flan_t5_model, tokenizer=flan_t5_tokenizer)
guidance.llm = flan_t5_guidance
The code above throws the following error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-16-715b0ba9d2e1> in <cell line: 6>()
4
5 flan_t5_tokenizer = T5Tokenizer.from_pretrained(model_id)
----> 6 flan_t5_model = T5ForConditionalGeneration.from_pretrained(model_id, device_map="auto")
7
8 flan_t5_guidance = guidance.llms.Transformers(model=flan_t5_model, tokenizer=flan_t5_tokenizer) # device = 0
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
2606 init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
2607 elif load_in_8bit or low_cpu_mem_usage:
-> 2608 init_contexts.append(init_empty_weights())
2609
2610 with ContextManagers(init_contexts):
NameError: name 'init_empty_weights' is not defined
device_map='auto' definitely works with Vicuna.
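For anyone still hitting this: init_empty_weights is provided by the accelerate package, and transformers imports it only when accelerate is available, so a missing or outdated accelerate install is the likely trigger for this NameError. A pre-flight check, as a sketch rather than an official fix:

import importlib.util

# init_empty_weights lives in accelerate; transformers uses it whenever
# device_map (or low_cpu_mem_usage) is passed to from_pretrained.
if importlib.util.find_spec("accelerate") is None:
    raise RuntimeError(
        "accelerate is not installed; try `pip install accelerate` "
        "before passing device_map to from_pretrained"
    )

from accelerate import init_empty_weights  # the symbol the traceback complains about
print("accelerate is available; device_map should not raise the NameError")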
Hopefully this now works in the new release. Please let us know if not :)