
🚀 Feature: re-write Langchain instrumentation to use Langchain Callbacks

Open nirga opened this issue 1 year ago • 7 comments

Which component is this feature for?

Langchain Instrumentation

🔖 Feature description

Right now, we monkey-patch classes and methods in Langchain, which requires endless work and constant maintenance. Langchain has a callback system that can potentially be used to create/end spans without being too coupled with the framework's inner structure.

🎤 Why is this feature needed?

Support Langchain entirely and be future-proof to internal API changes

✌️ How do you aim to achieve this?

Look into Langchain callbacks and how other frameworks are using it.

🔄️ Additional Information

No response

👀 Have you spent some time to check if this feature request has been raised before?

  • [X] I checked and didn't find a similar issue

Are you willing to submit PR?

None

nirga avatar Feb 27 '24 14:02 nirga

Chainlit uses langchain's callbacks for their instrumentation, and it seems to work well. They inherit from langchain's BaseTracer class and do their observability through callbacks, along with some front-end functions (e.g. updating the user-facing message on each token). Langchain allows multiple independent callbacks to be registered, so doing it this way doesn't exclude any user-created callbacks.
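To make that point concrete, here is a schematic sketch of the pattern (not Chainlit's or langchain's actual code; the class and method names mirror langchain's documented BaseCallbackHandler interface, but every class below is a stand-in): a manager fans each event out to all registered handlers, so an instrumentation handler coexists with user-created ones.

```python
class BaseCallbackHandler:
    """Stand-in for langchain's callback interface."""
    def on_chain_start(self, serialized, inputs): ...
    def on_chain_end(self, outputs): ...

class TracingHandler(BaseCallbackHandler):
    """An observability handler: records events instead of printing them."""
    def __init__(self):
        self.events = []
    def on_chain_start(self, serialized, inputs):
        self.events.append(("start", serialized.get("name")))
    def on_chain_end(self, outputs):
        self.events.append(("end", outputs))

class CallbackManager:
    """Dispatches each event to every registered handler, which is why
    adding a tracing handler doesn't exclude user callbacks."""
    def __init__(self, handlers):
        self.handlers = list(handlers)
    def on_chain_start(self, serialized, inputs):
        for h in self.handlers:
            h.on_chain_start(serialized, inputs)
    def on_chain_end(self, outputs):
        for h in self.handlers:
            h.on_chain_end(outputs)

tracing = TracingHandler()
manager = CallbackManager([tracing])
manager.on_chain_start({"name": "LLMChain"}, {"question": "hi"})
manager.on_chain_end({"text": "hello"})
```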

maciejwie avatar Mar 13 '24 18:03 maciejwie

Hi @nirga, I have a rough idea of the implementation now and I'm willing to pick up this issue. After reading your comments in https://github.com/traceloop/openllmetry-js/issues/133#issuecomment-1999619802 here is my understanding:

  1. Create a callback handler, similar to StdOutCallbackHandler, in the same directory as opentelemetry-instrumentation-langchain.
  2. Modify task_wrapper, atask_wrapper, workflow_wrapper, and aworkflow_wrapper to inject the callback handler into the instance and use the callbacks to start/end spans. One question: do we still need task_wrapper.py and workflow_wrapper.py, or can I add the injection logic to __init__.py itself? Which is the right approach?
  3. Inject the callback handler into instances of the Langchain classes that support callbacks.
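The handler in step 1 could be sketched roughly like this (a minimal stand-in sketch, not the final implementation; Span here is a placeholder for a real OpenTelemetry span, and the method signatures are assumptions modeled on langchain's callback interface): it opens a span when a chain starts and closes it when the chain ends, keyed by the run's UUID.

```python
import uuid

class Span:
    """Placeholder for an OpenTelemetry span."""
    def __init__(self, name):
        self.name, self.ended = name, False
    def end(self):
        self.ended = True

class SpanCallbackHandler:
    """Opens a span on chain start and closes it on chain end."""
    def __init__(self):
        self._span_dict = {}  # run_id -> open Span
    def on_chain_start(self, serialized, inputs, run_id):
        self._span_dict[run_id] = Span(serialized.get("name", "chain"))
    def on_chain_end(self, outputs, run_id):
        span = self._span_dict.pop(run_id, None)
        if span is not None:
            span.end()

handler = SpanCallbackHandler()
rid = uuid.uuid4()
handler.on_chain_start({"name": "SequentialChain"}, {}, run_id=rid)
handler.on_chain_end({"text": "done"}, run_id=rid)
```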

Please correct me if my understanding is incorrect. Thanks, Midhun

midhun1998 avatar Mar 17 '24 17:03 midhun1998

Right @midhun1998! And indeed we probably don't need them all

nirga avatar Mar 17 '24 17:03 nirga

Thanks for the confirmation, Nir! I will keep the issue updated with the progress. 🙂

midhun1998 avatar Mar 17 '24 17:03 midhun1998

Hi @nirga ,

I tried the suggested approach and hit some roadblocks; I need your input. Details below:

Observation and Notes:

  • I have added the initial set of changes to my fork here: https://github.com/traceloop/openllmetry/compare/main...midhun1998:openllmetry:feat/langchain-callback (Please ignore the _span_dict key for now; my plan is to use a UUID or similar as the key instead of the name. It's just a placeholder.)
  • We inject the callback handler into the class constructor as expected, but instead of start_as_current_span we now have to call start_span and end each span manually.
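To illustrate why that switch matters, here is a schematic model (not the real OpenTelemetry API; start_span, start_as_current_span, and Span below are stand-ins): start_as_current_span both creates a span and makes it the "current" span via context, so nested spans pick it up as their parent, while a bare start_span creates the span but leaves the current-span context untouched.

```python
import contextlib
import contextvars

# Stand-in for OpenTelemetry's current-span context.
_current_span = contextvars.ContextVar("current_span", default=None)

class Span:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent

def start_span(name):
    # The parent is whatever span is current, but the context is NOT updated.
    return Span(name, parent=_current_span.get())

@contextlib.contextmanager
def start_as_current_span(name):
    span = start_span(name)
    token = _current_span.set(span)  # make it current for nested work
    try:
        yield span
    finally:
        _current_span.reset(token)

with start_as_current_span("SequentialChain") as outer:
    inner = start_span("LLMChain")  # picks up outer as parent

orphan = start_span("LLMChain")     # context was reset: no parent
```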

Challenges:

  1. Since we no longer call start_as_current_span(), we need some way to attach each span to its parent span (if any), and this is where I'm facing an issue and would appreciate your input. There seems to be no parent span at the time spans are created in the callback handler. For example, when SequentialChain was calling LLMChain, I expected the SequentialChain span to persist and be picked up as the parent span, but that was not the case: the Traceloop UI shows the spans, but as detached individual spans without any parent. I believe this has to do with the constructor initialization we are doing. (screenshot omitted)

  2. The callback handler only applies to a few classes such as Chain, Agent, and Tool, and lacks support for other classes that were instrumented earlier, such as Template, BasePromptTemplate, BaseOutputParser, RunnableSequence, etc. How are we looking to support these? Do we keep the method monkey-patching for the others?
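For what it's worth, one possible direction for challenge 1 (a sketch only, under the assumption that langchain passes both run_id and parent_run_id to callback methods, as its BaseCallbackHandler signatures suggest; Span is again a stand-in for a real OpenTelemetry span): instead of relying on the implicit current-span context, the handler keeps a run_id-to-span map and wires each parent explicitly.

```python
import uuid

class Span:
    """Stand-in for an OpenTelemetry span with an explicit parent link."""
    def __init__(self, name, parent=None):
        self.name, self.parent, self.ended = name, parent, False
    def end(self):
        self.ended = True

class TreeAwareHandler:
    """Builds the span tree from run_id/parent_run_id instead of context."""
    def __init__(self):
        self._spans = {}  # run_id -> Span
    def on_chain_start(self, serialized, inputs, *, run_id, parent_run_id=None):
        parent = self._spans.get(parent_run_id)  # None for root spans
        self._spans[run_id] = Span(serialized.get("name", "chain"), parent=parent)
    def on_chain_end(self, outputs, *, run_id, parent_run_id=None):
        span = self._spans.pop(run_id, None)
        if span is not None:
            span.end()

handler = TreeAwareHandler()
outer_id, inner_id = uuid.uuid4(), uuid.uuid4()
handler.on_chain_start({"name": "SequentialChain"}, {}, run_id=outer_id)
handler.on_chain_start({"name": "LLMChain"}, {},
                       run_id=inner_id, parent_run_id=outer_id)
inner = handler._spans[inner_id]
handler.on_chain_end({}, run_id=inner_id, parent_run_id=outer_id)
handler.on_chain_end({}, run_id=outer_id)
```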

midhun1998 avatar Mar 26 '24 19:03 midhun1998

Linking our Slack conversation here so I'll remember that we've already discussed and answered these 😅

nirga avatar Apr 06 '24 12:04 nirga