
fixed json parsing for multiple generated JSON code segments, plus co…

Open rickbraddy-pharma opened this issue 1 year ago • 9 comments

Fixes a showstopper issue with parsing LLM-generated output that returns one or more code segments in JSON format

Fixes # (issue)

Before submitting

Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

rickbraddy-pharma avatar May 20 '23 15:05 rickbraddy-pharma

Can someone approve this workflow at least? This is an incredibly important fix. @hwchase17

Kav-K avatar May 21 '23 05:05 Kav-K

I also have many failed runs like #2679.

My agent sometimes returns:

```\n{\n    \"action\": \"Final Answer\",\n    \"action_input\": \"The latest block count is 2950616.\"\n}\n```

Instead of:

\n{\n    \"action\": \"Final Answer\",\n    \"action_input\": \"The latest block count is 2950616.\"\n}\n

0ptim avatar May 21 '23 12:05 0ptim
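For context, a minimal sketch of the fence-stripping idea (not necessarily the PR's actual implementation; `strip_outer_fence` is a hypothetical helper): remove one outer ``` pair, if present, before handing the text to the JSON parser, so both of the reply shapes above parse identically.

```python
import json


def strip_outer_fence(text: str) -> str:
    """Remove one leading/trailing ``` fence pair, if present."""
    stripped = text.strip()
    if stripped.startswith("```") and "\n" in stripped:
        # Drop the opening fence line (it may carry a language tag)...
        stripped = stripped.split("\n", 1)[1]
        # ...and the closing fence at the very end.
        if stripped.rstrip().endswith("```"):
            stripped = stripped.rstrip()[:-3]
    return stripped


reply = '```\n{\n    "action": "Final Answer",\n    "action_input": "The latest block count is 2950616."\n}\n```'
print(json.loads(strip_outer_fence(reply))["action_input"])
# -> The latest block count is 2950616.
```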

I guess the issue with the LLM-generated code is not fixed yet. Using langchain version 0.0.176, the following error occurs when code is returned in triple backticks:

2023-05-22 14:16:40.713 Uncaught app exception
Traceback (most recent call last):
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/conversational_chat/output_parser.py", line 21, in parse
    cleaned_output, _ = cleaned_output.split("```")
ValueError: too many values to unpack (expected 2)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/home/user1/langchain-tools/chatbot_streamlit_chat.py", line 116, in <module>
    output, total_tokens, prompt_tokens, completion_tokens = generate_response(
  File "/home/user1/langchain-tools/chatbot_streamlit_chat.py", line 96, in generate_response
    response = chatbot.agent({"input": query})
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/chains/base.py", line 140, in __call__
    raise e
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/chains/base.py", line 134, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/agent.py", line 947, in _call
    next_step_output = self._take_next_step(
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/agent.py", line 773, in _take_next_step
    raise e
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/agent.py", line 762, in _take_next_step
    output = self.agent.plan(
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/agent.py", line 444, in plan
    return self.output_parser.parse(full_output)
  File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/conversational_chat/output_parser.py", line 36, in parse
    raise OutputParserException(f"Could not parse LLM output: {text}") from e
langchain.schema.OutputParserException: Could not parse LLM output: ```json {     "action": "Final Answer",     "action_input": "```python\ndef fibonacci(n):\n    if n <= 0:\n        return []\n    elif n == 1:\n        return [0]\n    elif n == 2:\n        return [0, 1]\n    else:\n        fib_seq = [0, 1]\n        for i in range(2, n):\n            fib_seq.append(fib_seq[-1] + fib_seq[-2])\n        return fib_seq\n\ndef main():\n    fib_numbers = fibonacci(10)\n    print(fib_numbers)\n\nif __name__ == '__main__':\n    main()\n```\n\nThe algorithm used in this script is an iterative approach to calculate the first 10 Fibonacci numbers. It starts with the base cases of n=1 and n=2, returning [0] and [0, 1] respectively. For n > 2, the script initializes the Fibonacci sequence with the first two numbers (0 and 1) and iterates from the third number to the nth number, appending the sum of the previous two numbers in the sequence. The main() function calls the fibonacci() function with the input 10 and prints the resulting list of Fibonacci numbers." } ```

schinto avatar May 22 '23 12:05 schinto
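The `too many values to unpack` error above follows directly from the two-value unpack quoted in the traceback: the parser assumes exactly one remaining fence, so any nested code block inside `action_input` produces more than two pieces. A minimal reproduction:

```python
# After the opening ```json fence has been stripped, the parser expects
# exactly one remaining fence; a nested code block adds two more.
remainder = '{"action_input": "```python\\nprint(1)\\n```"}\n```'
parts = remainder.split("```")
print(len(parts))  # 4 -- so `cleaned_output, _ = remainder.split("```")` raises
```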

Sorry, I just saw that the fix is not in release 0.0.176 yet.

schinto avatar May 22 '23 12:05 schinto

Can we please get the workflow approved, or is there something else needed to proceed? (This is my first pull request, so I'm happy to adjust as appropriate.)

rickbraddy-pharma avatar May 22 '23 13:05 rickbraddy-pharma

Resolved Black code formatting issue.

If additional tests are needed, please advise on specifics (e.g., which file to modify or a good example to copy).

rickbraddy-pharma avatar May 22 '23 20:05 rickbraddy-pharma

> thanks - let's add some tests for things that this can handle so we don't have any regressions in the future

Happy to do so. Can you point me to an existing test you'd suggest I update?

rickbraddy-pharma avatar May 22 '23 20:05 rickbraddy-pharma

> thanks - let's add some tests for things that this can handle so we don't have any regressions in the future

> Happy to do so. Can you point me to an existing test you'd suggest I update?

I don't think any unit tests exist for this class yet; I would make a new file at tests/unit_tests/agents/conversational_chat/test_output_parser.py and add them there.

dev2049 avatar May 22 '23 21:05 dev2049
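A minimal sketch of what that file might contain. The parser can be fed canned strings directly, so no LLM call is needed; this assumes the parser class is `ConvoOutputParser` (per the import path in the tracebacks above) and that a "Final Answer" action parses to an `AgentFinish`:

```python
# tests/unit_tests/agents/conversational_chat/test_output_parser.py
from langchain.agents.conversational_chat.output_parser import ConvoOutputParser


def test_parse_bare_json() -> None:
    text = '{"action": "Final Answer", "action_input": "42"}'
    result = ConvoOutputParser().parse(text)
    assert result.return_values == {"output": "42"}


def test_parse_fenced_json_with_nested_code() -> None:
    text = (
        '```json\n'
        '{"action": "Final Answer",\n'
        ' "action_input": "```python\\nprint(42)\\n```"}\n'
        '```'
    )
    result = ConvoOutputParser().parse(text)
    assert "print(42)" in result.return_values["output"]
```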

I have submitted the fix again, this time addressing the `ruff` code-format issues.

I have no idea where to start on developing the unit tests...

rickbraddy-pharma avatar May 23 '23 15:05 rickbraddy-pharma

Figured it out. Will deliver the unit tests soon...

rickbraddy-pharma avatar May 23 '23 18:05 rickbraddy-pharma

Okay. The unit tests are now available.

rickbraddy-pharma avatar May 23 '23 19:05 rickbraddy-pharma

How should we handle OpenAI API keys?

_ ERROR collecting tests/unit_tests/agents/conversational_chat/test_output_parser.py _
tests/unit_tests/agents/conversational_chat/test_output_parser.py:11: in <module>
    llm_chatgpt = ChatOpenAI(temperature=0.3, model_name="gpt-3.5-turbo")
pydantic/main.py:341: in pydantic.main.BaseModel.__init__
    ???
E   pydantic.error_wrappers.ValidationError: 1 validation error for ChatOpenAI
E   __root__
E     Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)

rickbraddy-pharma avatar May 23 '23 20:05 rickbraddy-pharma
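One common answer is to keep unit tests LLM-free entirely: the parser can be exercised on hard-coded strings, as in the sketch above, so no key is needed. Where an LLM object is structurally required, a fake can stand in; a sketch, assuming this langchain version ships `FakeListLLM`:

```python
from langchain.llms.fake import FakeListLLM

# Deterministic stand-in: returns canned responses in order,
# with no network access and no OPENAI_API_KEY required.
fake_llm = FakeListLLM(
    responses=['{"action": "Final Answer", "action_input": "done"}']
)
print(fake_llm("any prompt"))  # -> the canned JSON string
```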

The unit tests have been rewritten in accordance with the recommendations. Please re-run the workflow.

rickbraddy-pharma avatar May 24 '23 16:05 rickbraddy-pharma

Not sure why the latest commit shows a conflict; the entire file was rewritten to replace the prior version.

rickbraddy-pharma avatar May 24 '23 19:05 rickbraddy-pharma

Do the tests cover cases with nested triple backticks, as shown below?

langchain.schema.OutputParserException: Could not parse LLM output: ```json { "action": "Final Answer", "action_input": "Here is example code\n```python\nprint('Hello World')\n```\n" } ```

I wonder how the function `parse_json_markdown` will handle this case. The line

json_string = json_string.replace("```json", "").replace("```", "")

removes all triple backticks, whereas the backticks in `action_input` should be preserved, as they are part of the markdown.

schinto avatar May 25 '23 12:05 schinto
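A blanket `replace` would indeed also delete the inner fences. One way around this (a sketch, not necessarily what the PR does; the function name is hypothetical) is to anchor the strip to the outermost fence pair only, so any triple backticks inside `action_input` survive:

```python
import json
import re


def parse_json_markdown_keep_inner_fences(text: str) -> dict:
    # Greedy capture between the opening fence at the start and the
    # LAST fence at the end; any ``` pairs inside stay untouched.
    match = re.search(r"^\s*```(?:json)?\s*(.*)```\s*$", text, re.DOTALL)
    return json.loads(match.group(1) if match else text)


nested = (
    '```json\n'
    '{"action": "Final Answer",\n'
    ' "action_input": "Here is example code\\n```python\\nprint(42)\\n```\\n"}\n'
    '```'
)
result = parse_json_markdown_keep_inner_fences(nested)
assert "```python" in result["action_input"]
```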

Yes, it covers the case where code gets returned within the JSON string, starting with triple backticks followed by the type of code, as shown in the working unit tests.

rickbraddy-pharma avatar May 25 '23 13:05 rickbraddy-pharma

Who should resolve the conflicts that have cropped up?

rickbraddy-pharma avatar May 25 '23 13:05 rickbraddy-pharma

> Yes, it covers the case where code gets returned within the JSON string, starting with triple backticks followed by the type of code, as shown in the working unit tests.

Sorry, I could not find this nested case in the unit tests.

schinto avatar May 25 '23 14:05 schinto

It was this nested case, where `` ```python `` or `` ```javascript `` gets returned when code is generated, that caused the original issues with the output parser. This is what was fixed, and why the unit tests were developed to cover both the code and no-code return cases.

I actually added logging to the output parser, captured real code and non-code responses, and then included those in the unit tests to ensure they are handled correctly.

Please let me know if further refinements are needed to merge.

Thanks,
Rick

rickbraddy-pharma avatar May 25 '23 14:05 rickbraddy-pharma

> It was this nested case, where `` ```python `` or `` ```javascript `` gets returned when code is generated, that caused the original issues with the output parser. This is what was fixed, and why the unit tests were developed to cover both the code and no-code return cases.

I just tested with langchain version 0.0.184, and parsing of code blocks containing triple backticks still fails; see the error below:

File "/home/user1/scratch/conda/envs/lcpy39/lib/python3.9/site-packages/langchain/agents/conversational_chat/output_parser.py", line 24, in parse raise OutputParserException(f"Could not parse LLM output: {text}") from e langchain.schema.OutputParserException: Could not parse LLM output: ```json { "action": "Final Answer", "action_input": "Here's a Python script to remove triple backticks at the start and end of a string, allowing spaces:\n\n```python\ndef remove_triple_backticks(s):\n return s.strip().lstrip('```').rstrip('```')\n\ninput_string = ' ```example text``` '\noutput_string = remove_triple_backticks(input_string)\nprint(output_string)\n```\n\nThis script defines a function `remove_triple_backticks` that takes a string as input, removes any leading and trailing spaces using `strip()`, and then removes triple backticks from the start and end of the string using `lstrip()` and `rstrip()`. The example input string is processed and the result is printed." } ```

schinto avatar May 30 '23 07:05 schinto

Clearly this needs more rework...

rickbraddy-pharma avatar May 30 '23 15:05 rickbraddy-pharma