
extract_json_between_markers doesn't handle responses missing json markers

Open robotdad opened this issue 1 year ago • 5 comments

I found when using this with Azure OpenAI that the LLM responses came back as valid JSON, but extract_json_between_markers rejected them because it enforces a check for the JSON marker, which was not in the response.

I can submit a PR to add a fallback to try parsing the output as json without those markers if you want it.
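A minimal sketch of what such a fallback could look like, for illustration only. This is not the repository's actual implementation or the code from the PR: the function name `extract_json_with_fallback` is hypothetical, and it assumes the marker is a fenced block tagged `json`.

```python
import json
import re

# Build the three-backtick marker programmatically so this snippet can
# itself sit inside a fenced code block.
FENCE = "`" * 3
# Assumption: the model wraps its JSON in a fenced block tagged "json".
MARKED_JSON = re.compile(re.escape(FENCE) + r"json(.*?)" + re.escape(FENCE), re.DOTALL)


def extract_json_with_fallback(llm_output):
    """Return the first parseable JSON value in llm_output, or None."""
    # 1. Preferred path: JSON wrapped in markers.
    for candidate in MARKED_JSON.findall(llm_output):
        try:
            return json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue  # Try the next marked block, if any.

    # 2. Fallback: the whole response may already be raw JSON.
    try:
        return json.loads(llm_output.strip())
    except json.JSONDecodeError:
        return None
```

With this shape, a response that consists only of raw JSON is parsed by the second step instead of being rejected for lacking markers, while marked responses behave as before.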

robotdad avatar Aug 21 '24 00:08 robotdad

Hi! Sure thing, if you think it would measurably improve the success rate. From our observations this can happen, but it's unclear whether it happens often, or more often than other types of JSON formatting issues.

conglu1997 avatar Aug 21 '24 15:08 conglu1997

I encountered raw JSON when running perform_review with gpt4o on Azure. The change in the PR worked there, but I have not run it against other models.

robotdad avatar Aug 21 '24 16:08 robotdad

Interesting. This is quite a pathological case, since we ask for chain-of-thought steps as well as the JSON. We didn't observe formatting failures more than perhaps 1% of the time; are you observing a higher rate?

conglu1997 avatar Aug 21 '24 18:08 conglu1997

I only ran perform_review on a paper using gpt4o on Azure. The response was always well-formatted JSON, but the method returned None because the JSON markers were missing. With the fallback mechanism, the response is parsed and returned as valid JSON.

robotdad avatar Aug 21 '24 20:08 robotdad

I used strictjson to reformat the responses. Since I'm running locally, token usage is no longer an issue.

jli113 avatar Sep 02 '24 01:09 jli113