Can't use Hugging Face models
The bug
I'm hitting this error across several Hugging Face models:
NameError                                 Traceback (most recent call last)
<ipython-input-31-1dd03f6eb8ca> in <cell line: 20>()
     18         return '<|end|>'
     19
---> 20 guidance.llm = StarcoderChat(model_path)
     21
     22 prompt = guidance('''

2 frames
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2606             init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
   2607         elif load_in_8bit or low_cpu_mem_usage:
-> 2608             init_contexts.append(init_empty_weights())
   2609
   2610         with ContextManagers(init_contexts):

NameError: name 'init_empty_weights' is not defined
To Reproduce
Give a full working code snippet that can be pasted into a notebook cell or python file. Make sure to include the LLM load step so we know which model you are using.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import guidance

model_path = "HuggingFaceH4/starchat-alpha"

class StarcoderChat(guidance.llms.Transformers):
    def __init__(self, model_path, **kwargs):
        tokenizer = AutoTokenizer.from_pretrained(model_path, device_map='auto')
        model = AutoModelForCausalLM.from_pretrained(model_path, device_map='auto', torch_dtype=torch.bfloat16)
        super().__init__(model, tokenizer=tokenizer, device_map='auto', **kwargs)

    @staticmethod
    def role_start(role):
        return "<|" + role + "|>"

    @staticmethod
    def role_end(role):
        return '<|end|>'
guidance.llm = StarcoderChat(model_path)
prompt = guidance('''
{{#system~}}
You are a helpful and terse assistant.
{{~/system}}
{{#user~}}
How do you print something in python?
{{~/user}}
{{#assistant~}}
{{gen 'answer'}}
{{~/assistant}}''')
prompt()
System info (please complete the following information):
- OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): Google Colab
- Guidance Version (guidance.__version__):
It looks like you are using DeepSpeed. We are working on Guidance support for DeepSpeed, but we are waiting on some DeepSpeed updates before we can support it there.
@slundberg I faced the same issue with a generic HF model as well, e.g. TheBloke/wizard-vicuna-13B-GGML.
device_map="auto" in the from_pretrained is the culprit here -- there is no error if I remove that argument.
Note also that the error appears when device_map is set to any of the supported options ("auto", "balanced", "balanced_low_0", "sequential"); see the sketch below.
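A minimal sketch of that observation, assuming an environment where accelerate is missing (init_empty_weights is defined in accelerate, and transformers imports it only when the package is available); "gpt2" is just an arbitrary small stand-in checkpoint:

from transformers import AutoModelForCausalLM

# Sketch: every supported device_map value routes from_pretrained through the
# same accelerate-backed empty-weights init path, so each one should surface
# the same NameError when accelerate is absent.
for dm in ("auto", "balanced", "balanced_low_0", "sequential"):
    try:
        AutoModelForCausalLM.from_pretrained("gpt2", device_map=dm)
        print(dm, "-> loaded without error")
    except Exception as e:
        print(dm, "->", type(e).__name__, ":", e)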
import guidance  # missing from the original snippet; needed for guidance.llms.Transformers below
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_id = "google/flan-t5-small"

flan_t5_tokenizer = T5Tokenizer.from_pretrained(model_id)
# WORKS: flan_t5_model = T5ForConditionalGeneration.from_pretrained(model_id)
flan_t5_model = T5ForConditionalGeneration.from_pretrained(model_id, device_map="auto")

flan_t5_guidance = guidance.llms.Transformers(model=flan_t5_model, tokenizer=flan_t5_tokenizer)
guidance.llm = flan_t5_guidance
The code above throws the following error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-16-715b0ba9d2e1> in <cell line: 6>()
4
5 flan_t5_tokenizer = T5Tokenizer.from_pretrained(model_id)
----> 6 flan_t5_model = T5ForConditionalGeneration.from_pretrained(model_id, device_map="auto")
7
8 flan_t5_guidance = guidance.llms.Transformers(model=flan_t5_model, tokenizer=flan_t5_tokenizer) # device = 0
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
2606 init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
2607 elif load_in_8bit or low_cpu_mem_usage:
-> 2608 init_contexts.append(init_empty_weights())
2609
2610 with ContextManagers(init_contexts):
NameError: name 'init_empty_weights' is not defined
device_map='auto' definitely works with Vicuna.
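For anyone still hitting this: init_empty_weights is provided by the accelerate package, and transformers imports it only when accelerate is available, so a missing or outdated accelerate install is the likely trigger for this NameError. A pre-flight check, as a sketch rather than an official fix:

import importlib.util

# init_empty_weights lives in accelerate; transformers uses it whenever
# device_map (or low_cpu_mem_usage) is passed to from_pretrained.
if importlib.util.find_spec("accelerate") is None:
    raise RuntimeError(
        "accelerate is not installed; try `pip install accelerate` "
        "before passing device_map to from_pretrained"
    )

from accelerate import init_empty_weights  # the symbol the traceback complains about
print("accelerate is available; device_map should not raise the NameError")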
Hopefully this now works in the new release. Please let us know if not :)