langchain

New mypy type error from PydanticOutputParser

Open rpgoldman opened this issue 1 year ago • 3 comments

Checked other resources

  • [X] I added a very descriptive title to this issue.
  • [X] I searched the LangChain documentation with the integrated search.
  • [X] I used the GitHub search to find a similar question and didn't find it.
  • [X] I am sure that this is a bug in LangChain rather than my code.
  • [X] The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

The following code passed mypy checks until I updated langchain_core from 0.1.31 to 0.1.44:

from langchain.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.language_models import BaseChatModel

class InputSourceResponse(BaseModel):
    """
    Response for Input Source Query
    """
    input_sources: dict[str, str] = Field(
        description=('Input source dictionary for the provided function where '
                     'key is <SUPERTYPE> and value is <SUBTYPE> '
                     '(set <SUBTYPE> to NONE if no subtype exists).'
                     )
    )
    explanation: str = Field(
        description='Explanation for input sources for the provided function')

def do_parser(llm: BaseChatModel, input: str) -> InputSourceResponse:
    parser = PydanticOutputParser(pydantic_object=InputSourceResponse)
    res: InputSourceResponse = (llm | parser).invoke(input)
    return res

Now I get this mypy error message:

lacrosse_llm/langchain_bug.py:19: error: Value of type variable "TBaseModel" of "PydanticOutputParser" cannot be "InputSourceResponse"  [type-var]

Error Message and Stack Trace (if applicable)

langchain_bug.py:19: error: Value of type variable "TBaseModel" of "PydanticOutputParser" cannot be "InputSourceResponse"  [type-var]

Description

  • I defined a PydanticOutputParser that used to typecheck correctly, but mypy no longer accepts my InputSourceResponse as a valid value for TBaseModel.
  • TBaseModel is defined as follows in pydantic.py (the langchain_core.output_parsers.pydantic module):
    if PYDANTIC_MAJOR_VERSION < 2:
        PydanticBaseModel = pydantic.BaseModel
    
    else:
        from pydantic.v1 import BaseModel  # pydantic: ignore
    
        # Union type needs to be last assignment to PydanticBaseModel to make mypy happy.
        PydanticBaseModel = Union[BaseModel, pydantic.BaseModel]  # type: ignore
    
    TBaseModel = TypeVar("TBaseModel", bound=PydanticBaseModel)
    
  • As far as I can tell, this should be OK. It now seems that expecting the PydanticOutputParser to produce an instance of its pydantic_object is somehow failing. Note that the line that defines parser above is not, on its own, enough to make mypy report an error, so something seems to be wrong in how the result of invoking the parser is typed.
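
To make the shape of that bound concrete, here is a minimal sketch of a generic parser whose type variable is bounded by a union of two base classes. V1Model, V2Model, SketchParser, and Response are hypothetical stand-ins, not langchain or pydantic names. The sketch assumes the type checker sees the full union as the bound, which may not be what happens with the conditional definition quoted above.

```python
from typing import Generic, Type, TypeVar, Union


class V1Model:
    """Stand-in for pydantic.v1.BaseModel (hypothetical)."""

    def __init__(self, **fields: object) -> None:
        self.fields = fields


class V2Model:
    """Stand-in for pydantic.BaseModel under Pydantic 2 (hypothetical)."""

    def __init__(self, **fields: object) -> None:
        self.fields = fields


# Same shape as the TBaseModel definition quoted above: a TypeVar
# bounded by a union of the two base classes.
TModel = TypeVar("TModel", bound=Union[V1Model, V2Model])


class SketchParser(Generic[TModel]):
    """Hypothetical parser that builds an instance of the model it was given."""

    def __init__(self, model: Type[TModel]) -> None:
        self.model = model

    def parse(self, **fields: object) -> TModel:
        # Construct the model class that was passed to the constructor.
        return self.model(**fields)


class Response(V1Model):
    """A v1-style model, playing the role of InputSourceResponse."""


parser = SketchParser(model=Response)
result = parser.parse(explanation="example")
```

If the bound really is the whole union, a subclass of either base should be an acceptable value for the type variable, which is why the reported error is surprising.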

System Info

langchain==0.1.16
langchain-anthropic==0.1.11
langchain-community==0.0.33
langchain-core==0.1.44
langchain-google-vertexai==1.0.1
langchain-openai==0.0.8
langchain-text-splitters==0.0.1

macOS 14.4.1

python version = 3.12.2

rpgoldman • Apr 19 '24 01:04

Not sure this is correct, but it might be related to the merge of #18811, in which case the cause is not the update to langchain-core but the update to langchain 0.1.16.

rpgoldman • Apr 19 '24 01:04

Interestingly, the following code snippet does not cause mypy to report a type error:

def make_parser() -> PydanticOutputParser:
    return PydanticOutputParser(pydantic_object=InputSourceResponse)

However, the following, which should be semantically identical, gives the same error:

def make_parser() -> PydanticOutputParser:
    parser = PydanticOutputParser(pydantic_object=InputSourceResponse)
    return parser

On the other hand, this does not:

def make_parser() -> PydanticOutputParser:
    return PydanticOutputParser(pydantic_object=InputSourceResponse)

parser = make_parser()

The following is also OK:

def make_parser() -> PydanticOutputParser:
    return PydanticOutputParser(pydantic_object=InputSourceResponse)

def doit():
    parser = make_parser()
    return parser

Somehow the assignment from the constructor makes all the difference!
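
One possible explanation, which nobody in this thread has confirmed: mypy infers type arguments using the surrounding type context. In a return statement, the declared return type (a bare generic, which mypy reads as parameterized with Any) is available as context for the constructor call, while an unannotated assignment leaves only the constructor argument to infer from. The sketch below lays out the three shapes side by side using hypothetical stand-ins (Model, Response, SketchParser). It does not reproduce the error, because Response satisfies the stand-in bound, and whether the annotated form silences the real error is untested.

```python
from typing import Any, Generic, Type, TypeVar


class Model:
    """Stand-in base class (hypothetical)."""


class Response(Model):
    """Stand-in for InputSourceResponse (hypothetical)."""


TModel = TypeVar("TModel", bound=Model)


class SketchParser(Generic[TModel]):
    """Hypothetical generic parser that stores the model class it was given."""

    def __init__(self, model: Type[TModel]) -> None:
        self.model = model


def from_return() -> SketchParser:
    # The bare return annotation means SketchParser[Any]; a checker can use
    # it as type context when inferring TModel for the constructor call.
    return SketchParser(model=Response)


def from_assignment() -> SketchParser:
    # No annotation on `parser`, so TModel is inferred from the argument alone.
    parser = SketchParser(model=Response)
    return parser


def from_annotated_assignment() -> SketchParser:
    # An explicit annotation supplies the same context as the return type.
    parser: SketchParser[Any] = SketchParser(model=Response)
    return parser
```

All three functions behave identically at runtime; any difference is confined to what the type checker infers for the type variable.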

rpgoldman • Apr 19 '24 15:04

I believe I have found the problem:

print(isinstance(InputSourceResponse, langchain_core.output_parsers.pydantic.PydanticBaseModel))

prints False.

Digging further, this is because PYDANTIC_MAJOR_VERSION == 2 even though I am using the langchain_core.pydantic_v1 module!

This suggests that either

  • use of langchain_core.pydantic_v1 should be an error (i.e., programmers should be forbidden from using it if PYDANTIC_MAJOR_VERSION > 1), or
  • PYDANTIC_MAJOR_VERSION is defined and/or used incorrectly, or
  • langchain_core.output_parsers.pydantic.PydanticBaseModel is defined incorrectly, or
  • there's a bug in mypy.

I'm not familiar enough with the architecture to know which of these alternatives is correct, but I believe that it was not the intent to remove support for Pydantic v1, was it?
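
One caveat about the isinstance() check above: it was given the class InputSourceResponse rather than an instance of it, so it would likely print False whichever Pydantic version is in play. A class object is an instance of its metaclass, not of its own base classes; issubclass() is the check that asks whether a class derives from a base. The sketch below shows the difference with hypothetical stand-in classes, using a tuple of bases in place of the union. It has not been run against the real langchain classes.

```python
class V1Model:
    """Stand-in for pydantic.v1.BaseModel (hypothetical)."""


class V2Model:
    """Stand-in for pydantic.BaseModel (hypothetical)."""


class Response(V1Model):
    """Stand-in for InputSourceResponse (hypothetical)."""


# A tuple of classes plays the role of the Union in the real definition.
bases = (V1Model, V2Model)

# isinstance() with a class as its first argument tests the class object
# itself, which is an instance of `type`, not of either base class.
class_is_instance = isinstance(Response, bases)

# issubclass() asks whether the class derives from one of the bases.
class_is_subclass = issubclass(Response, bases)

# An actual instance of the class passes the isinstance() check.
instance_is_instance = isinstance(Response(), bases)

print(class_is_instance, class_is_subclass, instance_is_instance)
```

If the same pattern holds for the real classes, the False result would not by itself show that PydanticBaseModel excludes v1 models.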

rpgoldman • Apr 19 '24 15:04