ddtrace langchain code fails to work with vision API calls

Open vamseeyarla opened this issue 1 year ago • 8 comments

Summary of problem

ddtrace's LangChain integration raises an error when a message's content is a nested data structure rather than a string. OpenAI introduced this shape with GPT-4 Vision. The integration assumed that messages.content would always be a string, which was true until OpenAI added the image input feature to the chat completions endpoint, but content can now be an array of str-str dictionaries.

This issue is very similar to #7759. The LangChain code needs the same fix as the OpenAI code.

Which version of dd-trace-py are you using?

Tested on 2.2.2; master has the same issue.

Which version of pip are you using?

23.3.1

Which libraries and their versions are you using?

langchain==0.0.339

How can we reproduce your problem?

Running LangChain with a nested message such as the following fails with the error below.

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Hello, I believe I'm a bug!",
            },
        ]
    }
]

What is the result that you get?

Traceback (most recent call last):
...
  File "/opt/venv/lib/python3.10/site-packages/langchain/schema/runnable/base.py", line 2798, in ainvoke
    return await self.bound.ainvoke(
  File "/opt/venv/lib/python3.10/site-packages/langchain/schema/runnable/base.py", line 1458, in ainvoke
    input = await step.ainvoke(
  File "/opt/venv/lib/python3.10/site-packages/langchain/schema/runnable/base.py", line 2798, in ainvoke
    return await self.bound.ainvoke(
  File "/opt/venv/lib/python3.10/site-packages/langchain/chat_models/base.py", line 162, in ainvoke
    llm_result = await self.agenerate_prompt(
  File "/opt/venv/lib/python3.10/site-packages/langchain/chat_models/base.py", line 469, in agenerate_prompt
    return await self.agenerate(
  File "/opt/venv/lib/python3.10/site-packages/ddtrace/contrib/langchain/patch.py", line 448, in traced_chat_model_agenerate
    integration.trunc(message.content),
  File "/opt/venv/lib/python3.10/site-packages/ddtrace/contrib/_trace_utils_llm.py", line 138, in trunc
    text = text.replace("\n", "\\n").replace("\t", "\\t")
AttributeError: 'list' object has no attribute 'replace'
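
The last frame of the traceback can be reproduced in isolation. The sketch below uses a hypothetical stand-in, `trunc_like`, that copies only the `replace` line shown in the traceback; it is an assumption for illustration, not ddtrace's actual function.

```python
def trunc_like(text):
    # Hypothetical stand-in: only the line visible in the traceback is copied.
    return text.replace("\n", "\\n").replace("\t", "\\t")


# Plain string content is escaped without problems.
escaped = trunc_like("line one\nline two")

# List content (the vision-style message shape) has no .replace method.
nested_content = [{"type": "text", "text": "Hello, I believe I'm a bug!"}]
try:
    trunc_like(nested_content)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```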

What is the result that you expected?

No errors

Suggested fix

https://github.com/vamseeyarla/dd-trace-py/commit/697e486912b998e0f307b2ad517dfa6c90917ea2
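
The linked commit is not reproduced here. As a rough sketch of one possible guard, assuming the failing helper only needs some string to tag on the span, non-string content could be stringified before it is escaped. `safe_trunc` and `max_len` are hypothetical names, not ddtrace APIs.

```python
def safe_trunc(value, max_len=128):
    # Hypothetical helper, not part of ddtrace: coerce anything that is
    # not already a string so the escaping below cannot raise.
    text = value if isinstance(value, str) else str(value)
    text = text.replace("\n", "\\n").replace("\t", "\\t")
    # Keep the tag value bounded; the limit is an arbitrary assumption.
    if len(text) > max_len:
        text = text[:max_len] + "..."
    return text
```

Stringifying keeps the change small but tags the raw list representation; extracting only the text parts is an alternative if the tag should stay readable.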

vamseeyarla avatar Jan 19 '24 02:01 vamseeyarla

CC @Yun-Kim - I see you have worked on #7759. Can you also please look into this?

vamseeyarla avatar Jan 19 '24 03:01 vamseeyarla

Hi @vamseeyarla, thanks for reaching out!

I wasn't able to completely reproduce the traceback you listed. Could you give us some more details on exact reproduction steps to get the error? Thanks!

Yun-Kim avatar Jan 19 '24 18:01 Yun-Kim

Hi @vamseeyarla, just following up to see if this is still an issue for you. Looking at LangChain's documentation and guides, it doesn't look like LangChain ChatModels officially support submitting messages as primitive dictionary objects (they only document constructing HumanMessage/SystemMessage objects). The example case you provided doesn't seem to work with LangChain. Can you confirm or clarify the exact message structure and argument you're passing in? Thanks!

Yun-Kim avatar Jan 23 '24 19:01 Yun-Kim

Closing this for now, but feel free to reopen when you get a chance to give us some more information.

Yun-Kim avatar Feb 13 '24 15:02 Yun-Kim

@Yun-Kim - Sorry for the delay. Here is a test that I put together along the lines of #7759 to show that nested structures fail for LangChain.

langchain-vvision-failure-test.patch -- On main branch

image_url = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk"
    ".jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
)
chat = langchain.chat_models.ChatOpenAI(temperature=0, max_tokens=256)
with request_vcr.use_cassette("openai_chat_completion_sync_generate.yaml"):
    chat.generate(
        [
            [
                langchain.schema.HumanMessage(
                    content=[
                        {"type": "text", "text": "What’s in this image?"},
                        {
                            "type": "image_url",
                            "image_url": image_url,
                        },
                    ],
                ),
            ],
        ]
    )

vamseeyarla avatar Mar 13 '24 18:03 vamseeyarla

And I am also posting the patch that works for my use case: fix_8419.patch

vamseeyarla avatar Mar 13 '24 18:03 vamseeyarla

@Yun-Kim - It looks like the langchain version that is being used by dd-trace-py is quite old: langchain==0.0.192

This issue is happening on later versions of Langchain - langchain==0.0.339

vamseeyarla avatar Mar 13 '24 21:03 vamseeyarla

@Yun-Kim - Thanks for reopening this ticket. As you can see, in later versions of LangChain the message content is extended to support both strings and nested structures such as dictionaries.

https://github.com/langchain-ai/langchain/blob/v0.0.339/libs/langchain/langchain/schema/messages.py#L67

content: Union[str, List[Union[str, Dict]]]
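
To illustrate what handling that type involves, here is a minimal sketch that walks each branch of the annotation and collapses the content into one string. `content_to_text` is a hypothetical helper, and the choice to record non-text parts by their `type` only is an assumption, not what ddtrace or LangChain does.

```python
from typing import Dict, List, Union

# Mirrors the annotation quoted above.
Content = Union[str, List[Union[str, Dict]]]


def content_to_text(content: Content) -> str:
    """Hypothetical helper: collapse message content into a single string."""
    if isinstance(content, str):
        return content
    parts = []
    for item in content:
        if isinstance(item, str):
            parts.append(item)
        elif item.get("type") == "text":
            parts.append(str(item.get("text", "")))
        else:
            # Non-text parts (for example image_url) are noted by type only.
            parts.append("[{}]".format(item.get("type", "unknown")))
    return " ".join(parts)
```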

vamseeyarla avatar Mar 14 '24 16:03 vamseeyarla