langchain Wrapper function for OpenAI Function in LLMChain

trafficstars

Description: This pull request does two things: 1) Add wrapper function for pydantic model, JSON Schema, and dictionary to function calls in LLM Chain. 2) Add an additional class for the output parser class so pydantic model returns a string that could be added to AIMessage for memory storage.
Issue: #6933
Dependencies: N/A
Tag maintainer: @baskaryan
Twitter handle: #7t_eric

Jul 06 '23 18:07 enjuichang

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment

Name	Status	Preview	Comments	Updated (UTC)
langchain	⬜️ Ignored (Inspect)			Jul 6, 2023 11:34pm

Jul 06 '23 18:07 vercel[bot]

@CodiumAI-Agent please review

Jul 06 '23 21:07 coditamar

PR Analysis

🎯 Main theme: Adding wrapper functions for OpenAI function in LLMChain and an additional class for the output parser
🔍 Description and title: Yes
📌 Type of PR: Enhancement
🧪 Relevant tests added: No
✨ Minimal and focused: Yes, the PR is focused on adding new functionalities to the LLMChain and does not include unrelated changes.
🔒 Security concerns: No, the PR does not introduce any obvious security concerns.

PR Feedback

💡 General PR suggestions: The PR is generally well-structured and follows good coding practices. However, it lacks tests for the new functionalities added. It would be beneficial to add tests to ensure the new functions and class behave as expected.
🤖 Code suggestions:
- relevant file: langchain/chains/openai_functions/converter.py suggestion content: Consider adding type hints for the 'pydantic_schema' parameter in the 'from_pydantic_create_openai_function_dict' function. This will improve code readability and help prevent potential type-related bugs. [important]
- relevant file: langchain/chains/openai_functions/converter.py suggestion content: The 'from_json_create_openai_function_dict' function currently assumes that the input is a valid JSON string. It would be a good idea to add error handling for the case where 'json.loads' fails due to an invalid input. [important]
- relevant file: langchain/output_parsers/openai_functions.py suggestion content: In the 'PydanticOutputFunctionsParserToString' class, the 'parse_result' method converts the result to a string. Consider handling the case where 'pydantic_args.dict()' fails or returns a non-string convertible value. [medium]
- relevant file: langchain/chains/openai_functions/converter.py suggestion content: The 'from_dict_create_openai_function_dict' function modifies the input dictionary by adding a 'type' key. This can lead to unexpected side effects if the caller does not expect the input to be modified. Consider creating a copy of the input dictionary before modifying it. [important]

How to use

Tag me in a comment '@CodiumAI-Agent' to ask for a new review after you update the PR. You can also tag me and ask any question, for example '@CodiumAI-Agent is the PR ready for merge?'

Jul 06 '23 21:07 CodiumAI-Agent

@CodiumAI-Agent please check again

Jul 06 '23 23:07 enjuichang

Question: please check again

Answer: It seems like there's no specific question provided for me to answer. If you have any specific questions or concerns about the code in this pull request, please provide them so I can give a detailed response. For example, you might want to ask about the purpose of certain functions, whether the code follows best practices, or if there are any potential improvements that could be made.

Jul 06 '23 23:07 CodiumAI-Agent

@CodiumAI-Agent please review

Jul 06 '23 23:07 enjuichang

PR Analysis

🎯 Main theme: Adding wrapper functions for OpenAI Function in LLMChain
🔍 Description and title: Yes
📌 Type of PR: Enhancement
🧪 Relevant tests added: No
✨ Minimal and focused: Yes, the PR is focused on adding wrapper functions for different types of input and an additional output parser class.
🔒 Security concerns: No, the PR does not introduce any obvious security concerns.

PR Feedback

💡 General PR suggestions: The PR is well-structured and the code is clean. However, it lacks tests for the new functionality. It's important to add tests to ensure the new functions work as expected and to prevent future regressions. Also, consider adding more detailed docstrings for the new functions, explaining what each argument is and what the function returns.
🤖 Code suggestions:
- relevant file: langchain/chains/openai_functions/converter.py suggestion content: In the from_json_create_openai_function_dict function, you're catching all exceptions when trying to parse the JSON. It would be better to catch specific exceptions (like json.JSONDecodeError) to avoid hiding potential bugs. [important]
- relevant file: langchain/chains/openai_functions/converter.py suggestion content: In the from_dict_create_openai_function_dict function, use isinstance(dict_schema, dict) instead of type(dict_schema) != dict for type checking. This is more Pythonic and allows for subclassing. [medium]
- relevant file: langchain/output_parsers/openai_functions.py suggestion content: In the PydanticOutputFunctionsParserToString class, you're converting the parsed result to a string. This could lead to loss of information if the parsed result is a complex object. Consider returning the raw object and let the caller decide how to handle it. [medium]
- relevant file: langchain/chains/openai_functions/converter.py suggestion content: The three functions from_pydantic_create_openai_function_dict, from_json_create_openai_function_dict, and from_dict_create_openai_function_dict have a lot of common code. Consider refactoring them to reduce code duplication. [medium]

How to use

Tag me in a comment '@CodiumAI-Agent' to ask for a new review after you update the PR. You can also tag me and ask any question, for example '@CodiumAI-Agent is the PR ready for merge?'

Jul 06 '23 23:07 CodiumAI-Agent

@enjuichang thanks for contribution! landed some similar functions here, wondering if that covers your use case

Jul 07 '23 18:07 baskaryan

hey @baskaryan! looks good. One thing that probably needs to change is the memory system. I don't know if you guys took a look into this but in ChatOpenAI, the AIMessage is appended as a string, so the output parser here would actually fail since the output is a pydantic class.

I added this temporary get around to solve this but let me know if you have any other ideas about this!

Jul 08 '23 01:07 enjuichang

hey @baskaryan! looks good. One thing that probably needs to change is the memory system. I don't know if you guys took a look into this but in ChatOpenAI, the AIMessage is appended as a string, so the output parser here would actually fail since the output is a pydantic class.

I added this temporary get around to solve this but let me know if you have any other ideas about this!

hm yea interesting point. my first thought is i'd probably create a custom memory that would convert the original output to str before saving it. that way you can still use the structured output in downstream logic

something like

class StructuredChatMemory(BaseChatMemory):

    def _get_input_output(
      self, inputs: Dict[str, Any], outputs: Dict[str, str]
    ) -> Tuple[str, str]        
      input, output = super()._get_input_output(inputs, outputs)
      if isinstance(output, BaseModel):
        output = output.json(indent=2)
      return input, output

what do you think?

Jul 08 '23 06:07 baskaryan

Yeah that looks good!

Jul 08 '23 15:07 enjuichang

closing pr for now, @enjuichang let me know if im missing anything!

Jul 12 '23 01:07 baskaryan

@baskaryan is this implemented/scheduled?

hey @baskaryan! looks good. One thing that probably needs to change is the memory system. I don't know if you guys took a look into this but in ChatOpenAI, the AIMessage is appended as a string, so the output parser here would actually fail since the output is a pydantic class. I added this temporary get around to solve this but let me know if you have any other ideas about this!

hm yea interesting point. my first thought is i'd probably create a custom memory that would convert the original output to str before saving it. that way you can still use the structured output in downstream logic

something like
class StructuredChatMemory(BaseChatMemory):

    def _get_input_output(
      self, inputs: Dict[str, Any], outputs: Dict[str, str]
    ) -> Tuple[str, str]        
      input, output = super()._get_input_output(inputs, outputs)
      if isinstance(output, BaseModel):
        output = output.json(indent=2)
      return input, output
what do you think?

Jul 18 '23 18:07 enjuichang

langchain langchain copied to clipboard

Wrapper function for OpenAI Function in LLMChain

PR Analysis

PR Feedback

How to use

PR Analysis

PR Feedback

How to use

langchain
langchain copied to clipboard