fix(langchain): retain parent ctx and ensure appropriate llm span is created
- [ ] I have added tests that cover my changes.
- [x] If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
- [x] PR name follows conventional commits format: `feat(instrumentation): ...` or `fix(instrumentation): ...`
- [ ] (If applicable) I have updated the documentation accordingly.
Fixes #2271
Reproducible code from bug report
Courtesy #2271. Thanks @jemo21k! Expected behaviour: LLM span should be created.
Before
No `openai.chat` span.
After
`openai.chat` span is present.
`sample_app/langgraph_example`
Expected behaviour: No change. An additional LLM span should not be created since the `ChatOpenAI` callback already creates one.
Before
After
> [!IMPORTANT]
> Fixes LLM span creation and context retention in Langchain and OpenAI instrumentation by setting default values for missing attributes and ensuring spans are correctly attached to context.
>
> - Behavior:
>   - Fixes missing `openai.chat` span creation in `callback_handler.py` by ensuring `LLM_RESPONSE_MODEL` is set to `model` or `"unknown"`.
>   - Ensures parent context is retained and the appropriate LLM span is created in `_create_llm_span()` in `callback_handler.py`.
> - Attributes:
>   - Sets `LLM_RESPONSE_MODEL` to `model` or `"unknown"` in `_build_from_streaming_response()` and `_abuild_from_streaming_response()` in `chat_wrappers.py`.
>   - Updates `messages_list_wrapper()` and `runs_create_and_stream_wrapper()` in `assistant_wrappers.py` to set `LLM_RESPONSE_MODEL` to `model` or `"unknown"`.
> - Misc:
>   - Minor updates to ensure spans are correctly attached to context in `callback_handler.py`.
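For reference, the `model or "unknown"` change above boils down to a one-line fallback. A minimal sketch, assuming `SpanAttributes` comes from the `opentelemetry-semantic-conventions-ai` package openllmetry uses (the real call sites are in `chat_wrappers.py` and `assistant_wrappers.py`, and the helper name here is hypothetical):

```python
from opentelemetry.semconv_ai import SpanAttributes

def set_response_model(span, model):
    # Fall back to "unknown" so LLM_RESPONSE_MODEL is always present,
    # even when the (streaming) response doesn't carry a model name.
    span.set_attribute(SpanAttributes.LLM_RESPONSE_MODEL, model or "unknown")
```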
Hi @nirga, sorry it's taking time, I'm still working on getting the CLA signed.
Took a look at the test failures. The langchain test failure on `SequentialChain` raises some suspicions for me. It looks like context isn't updated/propagated during `on_llm_start` for `SequentialChain.ainvoke`. What seems to be happening is that the OpenAI instrumentation is triggered (asynchronously?) as a child of the task span rather than as a child of the `_create_llm_span` call, and hence never sees `SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY`.
So far I'm wondering if this is because the langchain instrumentation uses `BaseCallbackHandler` and is (probably) missing an `AsyncCallbackHandler`. Some references I've been reading:
- https://python.langchain.com/docs/concepts/callbacks/#callback-handlers
- https://python.langchain.com/docs/how_to/callbacks_async/

The second link mentions this:

> If you use a sync CallbackHandler while using an async method to run your LLM / Chain / Tool / Agent, it will still work. However, under the hood, it will be called with [run_in_executor](https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor) which can cause issues if your CallbackHandler is not thread-safe.
I tried adding `opentelemetry-instrumentation-threading` to the test to see if it solves the context propagation, but it didn't help.
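To make the suspected failure mode concrete, here is a self-contained sketch (not the PR's code; the key name is hypothetical, the real one lives in the OpenAI instrumentation). A value attached to the OTel context in the event loop is invisible in a callback dispatched via a plain `run_in_executor`, since the executor thread does not automatically inherit the caller's contextvars:

```python
import asyncio
from opentelemetry import context

# Hypothetical key for illustration only.
SUPPRESS_KEY = "suppress_language_model_instrumentation"

def sync_callback():
    # Runs in a worker thread: prints None, because the thread did not
    # inherit the context attached in the event loop.
    print("seen in thread:", context.get_value(SUPPRESS_KEY))

async def main():
    token = context.attach(context.set_value(SUPPRESS_KEY, True))
    try:
        print("seen in loop:  ", context.get_value(SUPPRESS_KEY))
        await asyncio.get_running_loop().run_in_executor(None, sync_callback)
    finally:
        context.detach(token)

asyncio.run(main())
```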
Suggestions would be really helpful here, so I'm curious to hear your thoughts.
@nirga Please review the PR at your convenience. CLA is signed.
Added test based on the sample code, but noticing some unexpected behaviour that I'm still debugging.
- If I run the sample code in isolation, I'm able to successfully validate this PR
- If I run the same code through the test, it does not produce `openai.chat` spans at all, as well as for some other operations. For what it's worth, I tried to link our own LLM app to this PR and saw the `openai.chat` spans produced correctly.
My analysis so far:
Somehow during the test, `SUPPRESS_LANGUAGE_MODEL_INSTRUMENTATION_KEY` is set to `True`, so the openai instrumentor ends up not creating a span at all. However, going by this PR, this context key is only set as part of `_create_llm_span` in the langchain instrumentor, which isn't even called through the test code, so I'm not sure where else or how it's getting set.
Found the reason
This happens because all the (langchain) tests run under a shared trace context (and hence all spans nest under the same trace). So test A may set a context key and test B ends up accidentally reading it in its OpenAI instrumentor. Adding a pytest fixture that starts a root span before each test solves this. However, some cases now fail and need fixing.
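A minimal sketch of such a fixture (names are illustrative, not the actual test-suite code):

```python
import pytest
from opentelemetry import context, trace

@pytest.fixture(autouse=True)
def isolated_trace_context():
    # Attach a fresh, empty Context so nothing leaked by a previous test
    # (e.g. a suppression key) is visible, then start a root span so each
    # test gets its own trace.
    token = context.attach(context.Context())
    tracer = trace.get_tracer("tests")
    with tracer.start_as_current_span("test_root"):
        yield
    context.detach(token)
```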
Thanks @obs-gh-abhishekrao, but can you help me understand what this changes in terms of functionality? I'm not sure it resolves the context propagation issue, right?
Thanks for reviewing again @nirga. Apologies for the flurry of commits.
Initially, I thought I could address the missing LLM span as well as fix the parent span scope from the langchain task/workflow that triggers it. However, based on @ronensc's feedback and further discovery, `context.attach` and `context.detach` don't work predictably within the scope of langchain callbacks.
To sum it up, the PR addresses:
- The main concern, i.e., missing LLM spans
- The newly created LLM spans were missing many span attributes; I addressed that as well.

Unfortunately, context will not be propagated from langchain to downstream instrumentations, which means the new LLM spans get disconnected from the trace.
@obs-gh-abhishekrao Building on one of your earlier suggestions, would it be possible to implement `context.attach()` and `context.detach()` around the non-async callbacks to handle context propagation at least in those cases, and leave the async cases for future work?
@ronensc Ack. I think this is doable. I'll update the PR once I have it.
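For the record, a rough sketch of the attach/detach approach around the sync callbacks (class and method bodies are illustrative, not the final PR code):

```python
from opentelemetry import context, trace

tracer = trace.get_tracer(__name__)

class SyncLLMSpanHandler:
    """Sketch: attach/detach OTel context around sync callbacks only."""

    def __init__(self):
        self._spans = {}   # run_id -> span
        self._tokens = {}  # run_id -> context token

    def on_llm_start(self, serialized, prompts, *, run_id, **kwargs):
        span = tracer.start_span("llm")
        self._spans[run_id] = span
        # Attach the span's context so downstream instrumentation
        # (e.g. OpenAI) parents its spans here and sees any keys we set.
        self._tokens[run_id] = context.attach(trace.set_span_in_context(span))

    def on_llm_end(self, response, *, run_id, **kwargs):
        token = self._tokens.pop(run_id, None)
        if token is not None:
            context.detach(token)
        span = self._spans.pop(run_id, None)
        if span is not None:
            span.end()
```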
LGTM @obs-gh-abhishekrao Thank you for the hard work!
Should we add a TODO comment for the async case, or document that it currently only supports the non-async path?
@nirga Could you please review the PR as well?
Thanks for reviewing and approving this PR @ronensc! Just made the changes you last proposed.
@obs-gh-abhishekrao can you rebase from main? I don't have access to modify the PR :/
Done.
@obs-gh-abhishekrao can you give me write permissions to your PR? It'll be impossible to merge if I can't rebase myself
@obs-gh-abhishekrao To allow edits by maintainers on this PR, you can mark the **Allow edits from maintainers** checkbox as described here: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests
Sorry for the hassle. I don't see any such option. Apparently this is a limitation when the fork was made from an org :(. https://github.com/orgs/community/discussions/5634. I'm checking to see what my options are. Worst case, I'll have to re-create this PR off a personal fork.
@nirga I've submitted a different PR here - https://github.com/traceloop/openllmetry/pull/3094. I see the option to allow maintainer edits now, so it should work.
Since #3201 has been merged, I believe we can close this PR.