llama_cpp - JSON fails to generate when using Pydantic model with models.llamacpp
Describe the issue as clearly as possible:
When using models.llamacpp and creating JSON using a Pydantic model, I get an error when generating the first result (see code to reproduce below). I have run this code using models.transformers with no issue.
The model I'm using in this example is taken directly from the Chain of Thought cookbook example, but I have also tried others and had the same issue.
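For reference, the working transformers variant was along these lines (a sketch, not my exact invocation; it reuses the ComplaintData model from the repro below with default generation settings):

from outlines import generate, models

# Same model through the transformers backend; here generate.json parsed fine.
model = models.transformers("NousResearch/Hermes-2-Pro-Llama-3-8B")
complaint_processor = generate.json(model, ComplaintData)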
Steps/code to reproduce the bug:
import llama_cpp
from outlines import generate, models
from textwrap import dedent
llama_tokenizer = llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
    "NousResearch/Hermes-2-Pro-Llama-3-8B"
)
tokenizer = llama_tokenizer.hf_tokenizer
model = models.llamacpp(
    "NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
    "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    tokenizer=llama_tokenizer,
    n_gpu_layers=-1,
    flash_attn=True,
    n_ctx=8192,
    verbose=False,
)
complaint_data = [
    {
        'message': 'Hi, my name is Olivia Brown.I recently ordered a knife set from your wellness range, and it arrived earlier this week. Unfortunately, my satisfaction with the product has been less than ideal.My order was A123456',
        'order_number': 'A12-3456',
        'department': 'kitchen',
    },
    {
        'message': 'Hi, my name is John Smith.I recently ordered a dress for an upcoming event, which was alleged to meet my expectations both in fit and style. However, upon arrival, it became apparent that the fabric was of subpar quality, leading to a less than satisfactory appearance.The order number is A12-3456',
        'order_number': 'A12-3456',
        'department': 'clothing',
    },
    {
        'message': 'Hi, my name is Sarah Johnson.I recently ordered the ultimate ChefMaster 8 Drawer Cooktop. However, upon delivery, I discovered that one of the burners is malfunctioning.My order was A458739',
        'order_number': 'A45-8739',
        'department': 'kitchen',
    },
]
from pydantic import BaseModel, Field, constr
from enum import Enum
class Department(str, Enum):
    clothing = "clothing"
    electronics = "electronics"
    kitchen = "kitchen"
    automotive = "automotive"

class ComplaintData(BaseModel):
    first_name: str
    last_name: str
    order_number: str = Field(pattern=r'[ADZ][0-9]{2}-[0-9]{4}')
    department: Department
def create_prompt(complaint):
    complaint_messages = [
        {
            'role': 'user',
            'content': f"""
You are a complaint processing assistant; your aim is to process complaints and return the following information in this JSON format:
{{
'first_name': <first name>,
'last_name': <last name>,
'order_number': <order number in the format [ADZ]XX-XXXX>,
'department': <{"|".join([e.value for e in Department])}>,
}}
"""},
        {
            'role': 'assistant',
            'content': "I understand and will process the complaints in the JSON format you described"
        },
        {
            'role': 'user',
            'content': complaint['message']
        }
    ]
    complaint_prompt = tokenizer.apply_chat_template(complaint_messages, tokenize=False)
    return complaint_prompt
if __name__ == "__main__":
    complaint_processor = generate.json(model, ComplaintData)
    results = []
    for complaint in complaint_data[0:10]:
        prompt = create_prompt(complaint)
        result = complaint_processor(prompt)
        print(result)
Expected result:
JSON represented by the Pydantic model.
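For the first complaint, for example, the expected parsed object would look something like this (illustrative values inferred from the message):

ComplaintData(first_name='Olivia', last_name='Brown', order_number='A12-3456', department=<Department.kitchen: 'kitchen'>)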
Error message:
File "/Users/will/.venv/dev/lib/python3.11/site-packages/pydantic/main.py", line 1160, in parse_raw
obj = parse.load_str_bytes(
^^^^^^^^^^^^^^^^^^^^^
File "/Users/will/.venv/dev/lib/python3.11/site-packages/pydantic/deprecated/parse.py", line 49, in load_str_bytes
return json_loads(b) # type: ignore
^^^^^^^^^^^^^
File "/Users/will/.pyenv/versions/3.11.0/lib/python3.11/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/will/.pyenv/versions/3.11.0/lib/python3.11/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/will/.pyenv/versions/3.11.0/lib/python3.11/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 48 (char 47)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/will/code/notes/llama_json_bug.py", line 74, in <module>
result = complaint_processor(prompt)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/will/.venv/dev/lib/python3.11/site-packages/outlines/generate/api.py", line 511, in __call__
return format(completions)
^^^^^^^^^^^^^^^^^^^
File "/Users/will/.venv/dev/lib/python3.11/site-packages/outlines/generate/api.py", line 497, in format
return self.format_sequence(sequences)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/will/.venv/dev/lib/python3.11/site-packages/outlines/generate/json.py", line 50, in <lambda>
generator.format_sequence = lambda x: schema_object.parse_raw(x)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/will/.venv/dev/lib/python3.11/site-packages/pydantic/main.py", line 1187, in parse_raw
raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for ComplaintData
__root__
Unterminated string starting at: line 1 column 48 (char 47) [type=value_error.jsondecode, input_value='{"first_name": "Olivia", "last_name": "Brown", "', input_type=str]
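The truncated input_value shows the generation was cut off mid-string. The parse failure itself reproduces in isolation (a minimal sketch using the exact string from the error above):

from pydantic import ValidationError

try:
    ComplaintData.parse_raw('{"first_name": "Olivia", "last_name": "Brown", "')
except ValidationError as e:
    print(e)  # same "Unterminated string" validation error as above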
Outlines/Python version information:
0.0.46
Python 3.11.0 (main, Jul 6 2024, 12:54:41) [Clang 15.0.0 (clang-1500.3.9.4)]
aiohappyeyeballs==2.4.0 aiohttp==3.10.5 aiosignal==1.3.1 annotated-types==0.7.0 attrs==24.2.0 certifi==2024.7.4 charset-normalizer==3.3.2 cloudpickle==3.0.0 datasets==2.21.0 dill==0.3.8 diskcache==5.6.3 filelock==3.15.4 frozenlist==1.4.1 fsspec==2024.6.1 huggingface-hub==0.24.6 idna==3.7 interegular==0.3.3 Jinja2==3.1.4 jsonschema==4.23.0 jsonschema-specifications==2023.12.1 lark==1.2.2 llama_cpp_python==0.2.89 llvmlite==0.43.0 MarkupSafe==2.1.5 mpmath==1.3.0 multidict==6.0.5 multiprocess==0.70.16 nest-asyncio==1.6.0 networkx==3.3 numba==0.60.0 numpy==1.26.4 outlines==0.0.46 packaging==24.1 pandas==2.2.2 pyairports==2.1.1 pyarrow==17.0.0 pycountry==24.6.1 pydantic==2.8.2 pydantic_core==2.20.1 python-dateutil==2.9.0.post0 pytz==2024.1 PyYAML==6.0.2 referencing==0.35.1 regex==2024.7.24 requests==2.32.3 rpds-py==0.20.0 safetensors==0.4.4 six==1.16.0 sympy==1.13.2 tokenizers==0.19.1 torch==2.4.0 tqdm==4.66.5 transformers==4.44.1 typing_extensions==4.12.2 tzdata==2024.1 urllib3==2.2.2 xxhash==3.5.0 yarl==1.9.4
Context for the issue:
This issue came up while working on an ODSC workshop covering outlines. I ended up going with transformers instead of llama_cpp.
This fixes the problem (or at least it did in my case).
For those who would like to test in advance:
pip install git+https://github.com/lapp0/outlines.git@fix-json --force-reinstall
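To confirm the branch install took, a quick check (the reported dev version string will vary):

python -c "import importlib.metadata; print(importlib.metadata.version('outlines'))"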
Thanks for directing people to that branch!
Still a work in progress, only a subset of the failure cases are handled right now. Happy to hear more json failure reports to help me ensure I address all problems!
@lapp0 since yesterday I've noticed other errors despite the fix already being committed. Whenever I come across an error, I'll post it here.
Traceback (most recent call last):
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/main.py", line 1160, in parse_raw
obj = parse.load_str_bytes(
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/deprecated/parse.py", line 49, in load_str_bytes
return json_loads(b) # type: ignore
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 3328 (char 3327)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/home/pierre/llmdoc/outlines_server/outlines_server/DocumentClassifier.py", line 131, in <module>
File "/mnt/home/pierre/llmdoc/outlines_server/outlines_server/DocumentClassifier.py", line 92, in classify
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/api.py", line 511, in __call__
return self._format(completions)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/api.py", line 487, in _format
return self.format_sequence(sequences)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/outlines/generate/json.py", line 50, in <lambda>
generator.format_sequence = lambda x: schema_object.parse_raw(x)
File "/mnt/home/pierre/miniconda3/envs/llmdoc/lib/python3.10/site-packages/pydantic/main.py", line 1187, in parse_raw
raise pydantic_core.ValidationError.from_exception_data(cls.__name__, [error])
pydantic_core._pydantic_core.ValidationError: 1 validation error for DocumentClassificationResult
__root__
Expecting ',' delimiter: line 1 column 3328 (char 3327) [type=value_error.jsondecode, input_value='{ "label": "tax_notice",...99999999999999999999999', input_type=str]
Anything I can help with here? Currently getting bit by this issue, using the dev branch.
import outlines
import llama_cpp
# Load the model
model = outlines.models.llamacpp(
    "NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF",
    "Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf",
    tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(
        "NousResearch/Hermes-2-Pro-Llama-3-8B"
    ),
    n_gpu_layers=-1,
    n_ctx=8192,
    verbose=False
)
# Request 100 samples from the multinomial sampler
sampler = outlines.samplers.multinomial(samples=100)
# Define the coinflip choice (inside a character class, '|' is a literal, so [HT] is the correct class)
coinflip_regex_pattern = r"[HT]"
generator = outlines.generate.choice(
    model,
    ["H", "T"],
    sampler=sampler
)
output = generator("Flip a coin, respond with H or T: ")
print(output)
# Count the occurrences of each outcome
heads_count = output.count("H")
tails_count = output.count("T")
print(f"Heads: {heads_count}, Tails: {tails_count}")
Versions:
@cpfiffer Outlines' llama.cpp integration doesn't support multiple samples at once. Could you try again with samples=1 (or just don't explicitly set the sampler)
generator = outlines.generate.choice(
    model,
    ["H", "T"],
)
If there's a problem with generate.choice could you also open a separate issue?
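In the meantime, if you need many samples, a sketch of a loop-based workaround (one sample per call) would be:

generator = outlines.generate.choice(model, ["H", "T"])
outputs = [generator("Flip a coin, respond with H or T: ") for _ in range(100)]
# Count the occurrences of each outcome, as in your script
print(f"Heads: {outputs.count('H')}, Tails: {outputs.count('T')}")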
@PierreCarceller Can you please share your schema?
The original issue here seems to be resolved by the fix-json branch, except it re-uses logits processors. A new logits processor must be created for each run. This is a separate bug in generate.json / outlines.processors which should be addressed as part of this issue.
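Until that is fixed, a workaround sketch (reusing the names from the repro above) is to rebuild the generator for every prompt so each run gets a fresh logits processor:

for complaint in complaint_data:
    complaint_processor = generate.json(model, ComplaintData)  # fresh processor per run
    print(complaint_processor(create_prompt(complaint)))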
I believe the choice issue is in #1109.
Can't believe I missed the multiple samples error.
@cpfiffer Outlines' llama.cpp integration doesn't support multiple samples at once. Could you try again with samples=1 (or just don't explicitly set the sampler)
That works.