guidance
Pattern Guides (LLaMA-7B)
Pattern guides (regex patterns) don't seem to constrain generation the way one would expect:
import guidance
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Neko-Institute-of-Science/LLaMA-7B-HF")
model = AutoModelForCausalLM.from_pretrained("Neko-Institute-of-Science/LLaMA-7B-HF")
llama = guidance.llms.Transformers(model=model, tokenizer=tokenizer, device=5)
statement_gen = guidance("""
Today we want to say that our new tech company with the name {{gen 'companyname' max_tokens=10 pattern='[a-zA-Z]{8}'}} went public under the ticker {{gen 'ticker' pattern='[A-Z]{4}' temperature=0.3}}
""")
statement_gen(llm=llama)
I would expect this to output:
Today we want to say that our new tech company with the name NameOfCompany went public under the ticker TCKR
where NameOfCompany stands for a string that matches [a-zA-Z]{8}, i.e. exactly eight letters and no whitespace.
Actual output:
Today we want to say that our new tech company with the name ofTechno 2000 is going went public under the ticker TKOO
That is, companyname = ofTechno 2000 is going (bad) and ticker = TKOO (good).
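As a sanity check outside of guidance, the two captured values can be tested against their patterns with Python's standard re module. This is only a sketch: the check helper is a name made up for illustration and is not part of guidance, and the strings are copied from the output above. It distinguishes a full match from a match on just the leading characters, which might help narrow down where the constraint stops applying.

```python
import re

company_pattern = r"[a-zA-Z]{8}"
ticker_pattern = r"[A-Z]{4}"

# Values copied from the actual output above.
companyname = "ofTechno 2000 is going"
ticker = "TKOO"


def check(pattern, value):
    """Hypothetical helper: report a full match and any matching prefix."""
    full = re.fullmatch(pattern, value) is not None
    prefix = re.match(pattern, value)  # anchored at the start only
    return full, prefix.group(0) if prefix else None


company_check = check(company_pattern, companyname)
ticker_check = check(ticker_pattern, ticker)

print("companyname:", company_check)
print("ticker:", ticker_check)
```

If the prefix of companyname satisfies the pattern while the full string does not, that would be consistent with the pattern shaping the first characters without ending the generation there. I have not verified that this is what guidance actually does internally.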
Do I misunderstand pattern guides or is this an issue?