Feature request: Instrumentation for popular GenAI frameworks
Use case
With the rise in popularity of GenAI, many developers and businesses are building new GenAI workloads on AWS Lambda. Frameworks like LangChain and LlamaIndex make it easy to build GenAI applications. When using these frameworks, it is useful to know what's happening inside the framework in order to debug latency issues, quickly identify the inputs and outputs of LLM calls, and create alarms when issues arise. To implement this level of observability, the frameworks expose callback mechanisms (e.g. LangChain callbacks, LlamaIndex callbacks).
I am proposing a new module/feature for Powertools that enables Powertools users to simply import a library and have Traces generated automatically from code that runs within these frameworks. By providing this feature in Powertools, developers can get Traces directly in AWS X-Ray without implementing their own custom callback handlers, and without writing and maintaining additional code.
I would suggest implementing this feature for LangChain initially, and covering the Traces aspect to begin with. Later on it may also be useful to add features such as creating relevant metrics and automatic instrumentation.
Solution/User Experience
The experience could look like this:
```typescript
// Import the library
import { GenAIObservabilityCallbackHandler } from '@aws-lambda-powertools/genai-observability-provider';

// Create an instance of the handler
const callbackHandler = new GenAIObservabilityCallbackHandler();

// ...

// Create a sample chain using LangChain
const chain = RunnableSequence.from([
  {
    language: (input) => input.language,
  },
  prompt,
  model,
  new StringOutputParser(),
]).withConfig({ callbacks: [callbackHandler] }); // configure the chain with the callback handler instance
```
Alternative solutions
Today, there are two possible solutions:
- build a custom callback handler, and leverage Powertools for AWS Lambda to send Traces to AWS X-Ray
- use OTEL-compatible libraries like OpenLLMetry and OpenInference to collect the execution data, and configure the AWS Distro for OpenTelemetry
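To illustrate what the first alternative involves today, here is a minimal, self-contained sketch of a hand-rolled callback handler that records the timing of LLM calls. The class and method names are only illustrative of LangChain's callback shape (a real implementation would extend `BaseCallbackHandler` from `@langchain/core` and open/close X-Ray subsegments via the Powertools `Tracer`); this sketch just shows the bookkeeping the proposed module would take off developers' hands.

```typescript
// Sketch of alternative 1: a hand-rolled callback handler that tracks
// per-run timing. In a real handler, start/end would open and close an
// X-Ray subsegment annotated with the prompts and outputs.
type LlmRun = { runId: string; start: number; prompts: string[]; durationMs?: number };

class SketchCallbackHandler {
  private readonly runs = new Map<string, LlmRun>();

  // Mirrors the role of LangChain's handleLLMStart: record when a call begins
  handleLLMStart(runId: string, prompts: string[]): void {
    this.runs.set(runId, { runId, start: Date.now(), prompts });
  }

  // Mirrors handleLLMEnd: compute the duration; this is where a trace
  // subsegment would be closed and sent to AWS X-Ray
  handleLLMEnd(runId: string): LlmRun | undefined {
    const run = this.runs.get(runId);
    if (run !== undefined) {
      run.durationMs = Date.now() - run.start;
    }
    return run;
  }
}

// Usage: simulate one LLM invocation
const handler = new SketchCallbackHandler();
handler.handleLLMStart('run-1', ['Translate to French: hello']);
const result = handler.handleLLMEnd('run-1');
console.log(typeof result?.durationMs); // "number"
```

Writing and maintaining this kind of glue code for every project is exactly the burden the proposed `GenAIObservabilityCallbackHandler` would remove.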
Acknowledgment
- [X] This feature request meets Powertools for AWS Lambda (TypeScript) Tenets
- [X] Should this be considered in other Powertools for AWS Lambda languages? i.e. Python, Java, and .NET
Future readers
Please react with 👍 and your use case to help us understand customer demand.
Thanks for opening your first issue here! We'll come back to you as soon as we can. In the meantime, check out the #typescript channel on our Powertools for AWS Lambda Discord: Invite link
Hi @mriccia - thank you for taking the time to open an issue.
Following our roadmap process I am marking this as idea and labelling it with need-customer-feedback to signify that we'd like to gauge customer demand before making a call.
At least for now, as reported in our roadmap, we are focused on other key areas that have shown significant customer demand, both via reactions on their respective issues (sorted list of most requested issues) and via private conversations.
Similar to most AWS products, we continuously re-evaluate our roadmap based on demand, so I would encourage everyone who comes across this issue to leave a 👍 on the OP, leave a comment, or connect with us via our public and private channels to express interest in this feature. Depending on the demand, we might adjust our prioritization accordingly.
Finally, I'm also tagging @aws-powertools/lambda-python-core and @aws-powertools/lambda-dotnet-core to surface the issue, since it was marked as cross-runtime. I have also done the same internally. And since you're an Amazonian, if you have existing customer demand for this, please feel free to reach out via the usual channels.
Thank you!
The team has been very active in this area during the first half of the year and has contributed to several projects like CrewAI, smolagents, Pydantic-AI, agno, CopilotKit, and the Vercel AI SDK. With this experience it has become clear that the GenAI space is continuously evolving, and there's no single "winner" when it comes to popular frameworks.
Projects like LangChain, which seemed to be the go-to framework less than a year ago, haven't become the dominant GenAI framework. Instead, new frameworks appear and gain significant traction as new patterns emerge and new models are released.
Because of this, building on top of any specific framework is not a future-proof choice for Powertools for AWS. We would rather invest in improving the DX for GenAI use cases as a whole, with features like Bedrock Agents Function (#3710) and potentially MCP Servers sometime soon.
With this in mind I'm closing the issue as not planned for now.
⚠️ COMMENT VISIBILITY WARNING ⚠️
This issue is now closed. Please be mindful that future comments are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.