dd-trace-py
ddtrace langchain code fails to work with vision API calls
Summary of problem
ddtrace's LangChain integration raises an error when a message's content is a nested data structure instead of a string. OpenAI introduced this shape as part of GPT-4 Vision. Previously, we had assumed that message.content would always be a string (which was true until OpenAI added the image input feature to the chat completions endpoint), but it can now be a list of str-to-str dictionaries.
This issue is very similar to #7759 . We need to fix the LangChain code in the same way as the OpenAI code.
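The failure mode can be reproduced without LangChain or ddtrace. The sketch below is a simplified stand-in for the truncation helper, assuming only what the traceback shows (that it calls text.replace on its argument). The name trunc_sketch and the 128-character limit are hypothetical and are not ddtrace's actual implementation.

```python
def trunc_sketch(text, max_len=128):
    """Hypothetical stand-in for the helper in the traceback; assumes `text` is a str."""
    # Escape newlines and tabs, which only works when `text` is a string.
    text = text.replace("\n", "\\n").replace("\t", "\\t")
    if len(text) > max_len:
        text = text[:max_len] + "..."
    return text


# String content (the pre-vision shape) is handled fine.
string_result = trunc_sketch("Hello\nworld")

# List content (the vision-style shape) raises AttributeError.
nested_content = [{"type": "text", "text": "Hello, I believe I'm a bug!"}]
try:
    trunc_sketch(nested_content)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(string_result)
print(error_message)
```

Any code path that passes message.content straight into a string-only helper will hit the same AttributeError once the content is a list.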
Which version of dd-trace-py are you using?
Tested on 2.2.2, but master has the same issue.
Which version of pip are you using?
23.3.1
Which libraries and their versions are you using?
langchain==0.0.339
How can we reproduce your problem?
Run LangChain with a message whose content is a nested structure, such as the one here, and it fails with the error.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Hello, I believe I'm a bug!",
            },
        ],
    }
]
What is the result that you get?
Traceback (most recent call last):
  ...
  File "/opt/venv/lib/python3.10/site-packages/langchain/schema/runnable/base.py", line 2798, in ainvoke
    return await self.bound.ainvoke(
  File "/opt/venv/lib/python3.10/site-packages/langchain/schema/runnable/base.py", line 1458, in ainvoke
    input = await step.ainvoke(
  File "/opt/venv/lib/python3.10/site-packages/langchain/schema/runnable/base.py", line 2798, in ainvoke
    return await self.bound.ainvoke(
  File "/opt/venv/lib/python3.10/site-packages/langchain/chat_models/base.py", line 162, in ainvoke
    llm_result = await self.agenerate_prompt(
  File "/opt/venv/lib/python3.10/site-packages/langchain/chat_models/base.py", line 469, in agenerate_prompt
    return await self.agenerate(
  File "/opt/venv/lib/python3.10/site-packages/ddtrace/contrib/langchain/patch.py", line 448, in traced_chat_model_agenerate
    integration.trunc(message.content),
  File "/opt/venv/lib/python3.10/site-packages/ddtrace/contrib/_trace_utils_llm.py", line 138, in trunc
    text = text.replace("\n", "\\n").replace("\t", "\\t")
AttributeError: 'list' object has no attribute 'replace'
What is the result that you expected?
No errors
Suggested fix
https://github.com/vamseeyarla/dd-trace-py/commit/697e486912b998e0f307b2ad517dfa6c90917ea2
CC @Yun-Kim - I see you have worked on #7759. Can you also please look into this?
Hi @vamseeyarla, thanks for reaching out!
I wasn't able to completely reproduce the traceback you listed. Could you give us some more details on exact reproduction steps to get the error? Thanks!
Hi @vamseeyarla, just following up to see if this is still an issue for you. Looking at LangChain's documentation and guides, it doesn't look like LangChain ChatModels officially support submitting messages as primitive dictionary objects (they only document constructing HumanMessage/SystemMessage objects). The example case you provided doesn't seem to work with LangChain. Can you confirm or clarify the exact message structure/argument you're passing in? Thanks!
Closing this for now, but feel free to reopen when you get a chance to give us some more information.
@Yun-Kim - Sorry for the delay. Here is a test that I put together along the lines of #7759 to show that using nested structures fails for LangChain.
langchain-vvision-failure-test.patch -- On main branch
image_url = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk"
    ".jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
)
chat = langchain.chat_models.ChatOpenAI(temperature=0, max_tokens=256)
with request_vcr.use_cassette("openai_chat_completion_sync_generate.yaml"):
    chat.generate(
        [
            [
                langchain.schema.HumanMessage(
                    content=[
                        {"type": "text", "text": "What’s in this image?"},
                        {
                            "type": "image_url",
                            "image_url": image_url,
                        },
                    ],
                ),
            ],
        ]
    )
And I am also posting the patch that works for my use case: fix_8419.patch
@Yun-Kim - It looks like the langchain version used by dd-trace-py is quite old: langchain==0.0.192
This issue happens on later versions of LangChain, such as langchain==0.0.339
@Yun-Kim - Thanks for reopening this ticket. As you can see, in later versions of LangChain the message content type is extended to support both strings and nested structures such as lists of strings and dictionaries.
https://github.com/langchain-ai/langchain/blob/v0.0.339/libs/langchain/langchain/schema/messages.py#L67
content: Union[str, List[Union[str, Dict]]]
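Given that type, one way tracing code could handle both shapes is to normalize the content to a single string before applying any string-only operation. The following is only a minimal sketch under that assumption: content_to_text is a hypothetical helper, not part of ddtrace or LangChain, and it is not necessarily what the linked patch does.

```python
from typing import Dict, List, Union

# Mirrors the annotation quoted from langchain/schema/messages.py.
Content = Union[str, List[Union[str, Dict]]]


def content_to_text(content: Content) -> str:
    """Hypothetical helper: flatten message content into one string for tagging."""
    if isinstance(content, str):
        # Plain string content is returned unchanged.
        return content
    parts = []
    for item in content:
        if isinstance(item, str):
            parts.append(item)
        elif isinstance(item, dict) and item.get("type") == "text":
            # Text parts contribute their "text" value.
            parts.append(str(item.get("text", "")))
        else:
            # Other parts (e.g. image_url) fall back to their repr-like string form.
            parts.append(str(item))
    return " ".join(parts)


plain = content_to_text("Hello")
nested = content_to_text(
    [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": "https://example.com/a.jpg"},
    ]
)
print(plain)
print(nested)
```

The result is always a str, so a downstream call such as .replace("\n", "\\n") no longer depends on which content shape the caller used. Whether non-text parts should be stringified, summarized, or dropped is a design choice left open here.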