llama_index
[Bug]: OutputParserException: Failed to parse pydantic object from guidance program
Bug Description
@Harsha-Nori:
When I execute the code below, taken from the official docs, it fails with an OutputParserException.
import openai
from pydantic import BaseModel
from typing import List
from guidance.llms import OpenAI
from llama_index.program import GuidancePydanticProgram

class Song(BaseModel):
    title: str
    length_seconds: int

class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]

program = GuidancePydanticProgram(
    output_cls=Album,
    prompt_template_str=(
        "Generate an example album, with an artist and a list of songs. Using"
        " the movie {{movie_name}} as inspiration"
    ),
    guidance_llm=OpenAI(
        api_type=str(openai.api_type),
        api_key=str(openai.api_key),
        api_base=str(openai.base_url),
        api_version=str(openai.api_version),
        model='gpt-35-turbo',
        # model_name is not defined in this snippet; presumably set earlier in the notebook
        deployment_id=model_name,
    ),
    verbose=True,
)
# print("program.guidance_llm: ", program)
output = program(movie_name="The Shining")
I get this log:
Generate an example album, with an artist and a list of songs. Using the movie {{movie_name}} as inspiration
```json
{
"name": "{{gen 'name' stop='"'}}",
"artist": "{{gen 'artist' stop='"'}}",
"songs": [{{#geneach 'songs' stop=']'}}{{#unless @first}}, {{/unless}}{
"title": "{{gen 'title' stop='"'}}",
"length_seconds": "{{gen 'length_seconds' stop='"'}}",
}{{/geneach}}],
}
```
ParserError: while parsing a flow mapping
in "<unicode string>", line 1, column 1:
{
^
expected ',' or '}', but got '<scalar>'
in "<unicode string>", line 2, column 31:
"name": "{gen 'name' stop='"'}",
^
During handling of the above exception, another exception occurred:
OutputParserException Traceback (most recent call last)
File ~/dev/dir/project/.venv/lib/python3.9/site-packages/llama_index/prompts/guidance_utils.py:149, in parse_pydantic_from_guidance_program(program, cls, verbose)
148 print(output)
--> 149 json_dict = parse_json_markdown(output)
150 sub_questions = cls.parse_obj(json_dict)
File ~/dev/dir/project/.venv/lib/python3.9/site-packages/llama_index/output_parsers/utils.py:54, in parse_json_markdown(text)
53 except yaml.YAMLError as e_yaml:
---> 54 raise OutputParserException(
55 f"Got invalid JSON object. Error: {e_json} {e_yaml}. "
56 f"Got JSON string: {json_string}"
57 )
58 except NameError as exc:
OutputParserException: Got invalid JSON object. Error: Expecting ',' delimiter: line 2 column 31 (char 32) while parsing a flow mapping
in "<unicode string>", line 1, column 1:
{
^
expected ',' or '}', but got '<scalar>'
in "<unicode string>", line 2, column 31:
"name": "{gen 'name' stop='"'}",
^. Got JSON string: {
"name": "{gen 'name' stop='"'}",
"artist": "{gen 'artist' stop='"'}",
"songs": [{#geneach 'songs' stop=']'}{#unless @first}, {/unless}{
"title": "{gen 'title' stop='"'}",
"length_seconds": "{gen 'length_seconds' stop='"'}",
}{/geneach}],
}
The above exception was the direct cause of the following exception:
OutputParserException Traceback (most recent call last)
Cell In[27], line 35
17 program = GuidancePydanticProgram(
18 output_cls=Album,
19 prompt_template_str=(
(...)
31 verbose=True,
32 )
34 # print("program.guidance_llm: ", program)
---> 35 output = program(movie_name="The Shining")
File ~/dev/dir/project/.venv/lib/python3.9/site-packages/llama_index/program/guidance_program.py:80, in GuidancePydanticProgram.__call__(self, *args, **kwargs)
73 def __call__(
74 self,
75 *args: Any,
76 **kwargs: Any,
77 ) -> BaseModel:
78 executed_program = self._guidance_program(**kwargs)
---> 80 return parse_pydantic_from_guidance_program(
81 program=executed_program, cls=self._output_cls
82 )
File ~/dev/dir/project/.venv/lib/python3.9/site-packages/llama_index/prompts/guidance_utils.py:152, in parse_pydantic_from_guidance_program(program, cls, verbose)
150 sub_questions = cls.parse_obj(json_dict)
151 except Exception as e:
--> 152 raise OutputParserException(
153 "Failed to parse pydantic object from guidance program"
154 ) from e
155 return sub_questions
OutputParserException: Failed to parse pydantic object from guidance program
Version
Version: 0.9.13
Steps to Reproduce
Execute the code above in a notebook (python 3.9.18)
Relevant Logs/Tracebacks
cf. above
🤖
Hey @younes-io, good to see you again! Thanks for presenting this detailed issue. I'm diving into it and will have a full response for you shortly. Hang tight!
FYI I use
Name: guidance Version: 0.0.64
Did you manage to fix this?
I got this with both stabilityai/stablelm-zephyr-3b and zephyr 7b
🤖
Based on the error message you provided, the problem lies in the JSON object being parsed. The parser fails at line 2, column 31, where it expected a ',' or '}' but got a '<scalar>' instead.
Looking at the JSON object in your error message:
{
"name": "{gen 'name' stop='"'}",
"artist": "{gen 'artist' stop='"'}",
"songs": [{#geneach 'songs' stop=']'}{#unless @first}, {/unless}{
"title": "{gen 'title' stop='"'}",
"length_seconds": "{gen 'length_seconds' stop='"'}",
}{/geneach}],
}
It seems like the issue might be with the {gen 'name' stop='"'} and {gen 'artist' stop='"'} parts of the JSON object. The {gen ...} syntax seems to be a placeholder for generating some value, but it's not valid JSON syntax. This could be why the parser is failing.
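The parse error can be reproduced with the standard library alone, without guidance or llama_index. The sketch below assumes the logged string was indented by two spaces (which is what "column 31" implies) and uses a made-up helper name, `explain_json_failure`, purely for illustration:

```python
import json
from typing import Optional, Tuple

# First lines of the string from the log: the template text itself, with the
# {gen ...} commands still in place instead of generated values.
UNRENDERED = """{
  "name": "{gen 'name' stop='"'}",
  "artist": "{gen 'artist' stop='"'}"
}"""


def explain_json_failure(text: str) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) of the JSON error, or None if text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return exc.lineno, exc.colno, exc.msg
    return None


# Same position and message as the "Expecting ',' delimiter" part of the log.
print(explain_json_failure(UNRENDERED))
```

The reported position is the single quote right after the double quote in stop='"': that double quote closes the JSON string early, so the parser expects a ',' or '}' next.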
The parse_json_markdown function in llama_index/output_parsers/utils.py is designed to parse a JSON object embedded within a markdown text block. The expected format of the JSON object is a valid JSON object or array that can be parsed by Python's json.loads function or PyYAML's yaml.safe_load function.
If the JSON object or array cannot be parsed by json.loads, the function tries to parse it with yaml.safe_load, which is less strict and allows for trailing commas. If this also fails, it raises an OutputParserException.
So the expected input is valid JSON or YAML that Python's json.loads or PyYAML's yaml.safe_load can parse, embedded within a markdown text block that opens with ```json and closes with ```.
You can find more details about this in the llama_index/output_parsers/utils.py file.
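To make the expected input shape concrete, here is a rough standard-library sketch of pulling JSON out of a fenced markdown block. It is not the llama_index implementation: `extract_json_block` and `parse_fenced_json` are hypothetical names, the YAML fallback is left out, and the sample reply is invented.

```python
import json
import re

# Built programmatically so the fence marker does not appear literally here.
FENCE = "`" * 3

# Optional "json" tag after the opening fence, then a lazy match up to the next fence.
_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json_block(text: str) -> str:
    """Return the body of the first fenced block, or the whole text if none."""
    match = _BLOCK.search(text)
    return (match.group(1) if match else text).strip()


def parse_fenced_json(text: str) -> dict:
    """Parse the JSON found inside a fenced markdown block."""
    return json.loads(extract_json_block(text))


# Invented reply, shaped like what a model is expected to return.
reply = (
    "Here is the album:\n"
    + FENCE
    + 'json\n{"name": "Overlook", "songs": []}\n'
    + FENCE
)
print(parse_fenced_json(reply))
```

With input shaped like this the parse succeeds. In the log above, the fenced block still holds the raw template, so the extraction step works but json.loads rejects the body.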
To resolve this issue, you might need to ensure that the JSON object being passed to the parse_json_markdown function is a valid JSON object. If the {gen ...} syntax is a placeholder, you might need to replace it with a valid value before passing it to the function.
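One way to catch this earlier is to check the program output for leftover template commands before handing it to the parser. The following is only a sketch: `find_unrendered_placeholders` is a hypothetical helper, and the command names in the pattern (gen, geneach, unless, select) are an assumption based on the log above, not an exhaustive list of guidance commands.

```python
import re
from typing import List

# Matches {gen ...}, {{gen ...}}, {#geneach ...}, {/unless} and similar.
# The command list is an assumption taken from the logged template.
_PLACEHOLDER = re.compile(r"\{+\s*[#/]?(?:gen|geneach|unless|select)\b[^{}]*\}+")


def find_unrendered_placeholders(text: str) -> List[str]:
    """Return every template command still present in text (empty if none)."""
    return _PLACEHOLDER.findall(text)


raw_output = """{
  "name": "{gen 'name' stop='"'}",
  "songs": [{#geneach 'songs' stop=']'}{#unless @first}, {/unless}{
}{/geneach}]
}"""

leftovers = find_unrendered_placeholders(raw_output)
if leftovers:
    print(f"{len(leftovers)} template commands were never executed:")
    for item in leftovers:
        print("  ", item)
```

If this check finds anything, the program was returned without running the generation step, so the thing to investigate is the LLM call (model, deployment, guidance version) rather than the JSON parsing.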
I hope this helps! If you have any more questions or need further clarification, feel free to ask.