Bug: Significant Cold Start Initialization Delay in Orchestration Framework
Expected Behaviour
The orchestration framework should initialize with minimal delay during cold starts, ensuring consistent performance across all Lambda invocations, including the first execution.
Current Behaviour
When a new Lambda instance is created, the orchestration framework incurs a significant initialization delay of 5-7 seconds. This delay occurs before any classification or other processing steps, resulting in a total latency of 7-9 seconds for the first invocation.
However, subsequent invocations on the same Lambda instance execute within 1-2 seconds, indicating the issue is specific to cold start initialization. This behavior impacts the overall performance and user experience during the first execution.
Note: Testing with provisioned concurrency (set to 5) does keep 5 instances of the Lambda function warm; however, the issue persists whenever execution shifts to a new Lambda instance outside of these provisioned instances, because the orchestration framework initialization itself takes 5-7 seconds.
Code snippet
Logs for the initial invocation:
```
INIT_START Runtime Version: python:3.12.v38 Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:7515e00d6763496e7a147ffa395ef5b0f0c1ffd6064130abb5ecde5a6d630e86
START RequestId: 29a54971-892b-4748-affc-2e4e9a6139db Version: 96
[INFO] 2024-12-02T16:55:56.989Z 29a54971-892b-4748-affc-2e4e9a6139db Received event in Lambda
[INFO] 2024-12-02T16:55:56.990Z 29a54971-892b-4748-affc-2e4e9a6139db LLM Classification
[INFO] 2024-12-02T16:55:57.513Z 29a54971-892b-4748-affc-2e4e9a6139db Found credentials in environment variables.
[INFO] 2024-12-02T16:56:04.180Z 29a54971-892b-4748-affc-2e4e9a6139db
** CLASSIFIED INTENT **
```

(Note the jump from 16:55:57 to 16:56:04 — roughly 6.7 seconds spent before classification completes.)
Logs for the other invocations in same lambda instance:
```
[INFO] 2024-12-02T16:57:00.674Z b6b59439-3a0c-48b5-b763-1e0a2c30ca51 Received event:
[INFO] 2024-12-02T16:57:00.674Z b6b59439-3a0c-48b5-b763-1e0a2c30ca51 LLM Classification
[INFO] 2024-12-02T16:57:01.311Z b6b59439-3a0c-48b5-b763-1e0a2c30ca51
** CLASSIFIED INTENT **
```

(Here the gap is only about 0.6 seconds, confirming the delay is specific to cold starts.)
Possible Solution
No response
Steps to Reproduce
- Deploy the orchestration framework on AWS Lambda.
- Trigger a request that leads to the creation of a new Lambda instance.
- Measure the total latency for the first invocation (observe 7-9 seconds).
- Measure latency for subsequent invocations on the same instance (observe 1-2 seconds).
- Test with provisioned concurrency and observe behavior when execution moves to new instances.
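When reproducing, the most direct way to attribute the 5-7 seconds to initialization rather than handler work is the `Init Duration` field that Lambda emits on each cold start's `REPORT` log line. A minimal sketch of a parser for those lines (the helper name and the sample line are illustrative, not taken from the attached logs):

```python
import re

def parse_report_line(line: str) -> dict:
    """Pull Duration and Init Duration (ms) out of a Lambda REPORT log line.

    Init Duration only appears on cold starts, so its presence alone
    tells you which invocations paid the initialization cost.
    """
    duration = re.search(r"(?<!Init )(?<!Billed )Duration:\s*([\d.]+)\s*ms", line)
    init = re.search(r"Init Duration:\s*([\d.]+)\s*ms", line)
    return {
        "duration_ms": float(duration.group(1)) if duration else None,
        "init_duration_ms": float(init.group(1)) if init else None,
    }

sample = ("REPORT RequestId: 29a54971 Duration: 1800.55 ms "
          "Billed Duration: 1801 ms Init Duration: 5900.21 ms")
print(parse_report_line(sample))  # both fields present => this was a cold start
```

Running this over the CloudWatch export separates framework init time from per-request latency without any code changes in the function itself.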
Hi @karanakatle, thanks for submitting this issue. There are a couple of things we need to understand first:
- Can you share the code you used for this lambda?
- Have you seen that you can now install multi-agent-orchestrator with only the minimum required dependencies? For instance, if you do not use Anthropic or OpenAI, you can simply install it with:

```shell
pip install multi-agent-orchestrator
```

This is available from version 0.1.1 and will save you a bit of init time.
- You can also check out SnapStart for Python, which was released last week.
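For reference, SnapStart for Python functions is enabled through function configuration and only applies to published versions. A sketch in AWS SAM syntax, assuming a Python 3.12 runtime (the resource and handler names are placeholders):

```yaml
# template.yaml (fragment) - enable SnapStart on a Python function
Resources:
  OrchestratorFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.12
      Handler: app.handler
      SnapStart:
        ApplyOn: PublishedVersions  # snapshot is taken when a version is published
```

Invocations must target the published version (or an alias pointing to it), not `$LATEST`, for the snapshot to be used.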
Let us know if you need further assistance. Regards, Anthony
@karanakatle any updates on this?
@karanakatle , I'm about to close this since I didn't hear anything from you. Let me know if you need further assistance.
Hello @brnaba-aws, sorry for the delayed response. I tried enabling SnapStart and installing only the required dependencies, but the issue persists. On detailed analysis, this is what I have found:
- Every time, the library import takes 2 seconds.
- We have used a custom classifier, where we use our fine-tuned model to get a response; it responds in 0.6 to 1 second.
- If no agent is selected, it goes to the fallback agent, which is a Bedrock agent and takes another 2-3 seconds.
Is there any way to reduce the time consumed in the 1st and 3rd points?
Code is attached for reference: multi_agent_orchestrator.zip
CloudWatch logs are also attached for analysis: Bot Orchestration Logs.txt
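The 2-second import cost mentioned in the first point can be confirmed in isolation by timing the import itself. A minimal sketch (the timed module below is a stand-in for the real orchestrator import):

```python
import time

def timed_import(module_name: str) -> float:
    """Import a module and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    __import__(module_name)  # e.g. "multi_agent_orchestrator" in the real Lambda
    return time.perf_counter() - start

elapsed = timed_import("json")  # stand-in module for illustration
print(f"import took {elapsed:.3f}s")
```

Run inside the Lambda init phase (module scope), this pinpoints how much of the Init Duration is pure import cost versus client or model setup.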
For your points:
- Can you provide logs for more than a single invocation, to see if this time is always there or only on the very first invocation?
- You can't really improve that, unless you disable the default agent by setting USE_DEFAULT_AGENT_IF_NONE_IDENTIFIED=False, or use a fast model like Claude 3 Haiku for the default agent. I see that your bedrock_llm_agent is using Claude 3.5 Sonnet, which is slow.
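The effect of that flag can be sketched as plain routing logic. This is an illustrative model of the behavior, not the library's actual code, and the agent names are placeholders:

```python
from typing import Optional

def select_agent(classified: Optional[str],
                 use_default_if_none: bool = True,
                 default_agent: str = "bedrock_fallback") -> Optional[str]:
    """Return the classified agent, or fall back to the default agent.

    With use_default_if_none=False, an unclassified request returns None
    instead of paying the 2-3 s Bedrock fallback call.
    """
    if classified is not None:
        return classified
    return default_agent if use_default_if_none else None

print(select_agent(None))                             # falls back: bedrock_fallback
print(select_agent(None, use_default_if_none=False))  # skips the fallback: None
```

Disabling the fallback trades the extra latency for having to handle the "no agent matched" case yourself.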
@karanakatle, one more thing: I don't think you are using the latest version of the package, since I don't see this try/except with Anthropic: new import method
One thing that I just noticed is the fact that each Lex bot will instantiate a boto3.client('lexv2-runtime', region_name=self.region). We haven't provided a way to reuse a client passed as a parameter. This would help I believe. I'll create an issue, and provide you with a file to test? ok?
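Until such a change lands, the same effect can be approximated by memoizing client creation so every agent shares one instance. A generic sketch with a stand-in factory (boto3 is deliberately not imported here; in the real Lambda the body would call `boto3.client`):

```python
import functools

@functools.lru_cache(maxsize=None)
def get_shared_client(service: str, region: str) -> dict:
    """Create at most one client per (service, region) pair.

    Stand-in for boto3.client(service, region_name=region);
    lru_cache guarantees the expensive construction runs only once.
    """
    return {"service": service, "region": region}  # placeholder for a real client

a = get_shared_client("lexv2-runtime", "us-east-1")
b = get_shared_client("lexv2-runtime", "us-east-1")
print(a is b)  # True: both agents hold the same object
```

Because Lambda reuses module state across warm invocations, a memoized or module-scope client is built once per instance rather than once per agent per request.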
Sure @brnaba-aws , thnx for the help.
@karanakatle,
could you please try this LexBotAgent? It can accept a client as an option, so you can create a single client and use it across all your Lex bots.
example:

```python
lex_client = boto3.client('lexv2-runtime', region_name=os.getenv('AWS_REGION', 'us-east-1'))

my_agent = LexBotAgent(LexBotAgentOptions(client=lex_client, bot_id='', bot_alias_id='', locale_id=''))
my_agent_2 = LexBotAgent(LexBotAgentOptions(client=lex_client, bot_id='', bot_alias_id='', locale_id=''))
```
```python
from typing import Any, Dict, List, Optional
from dataclasses import dataclass
import os

import boto3
from botocore.exceptions import BotoCoreError, ClientError

from multi_agent_orchestrator.agents import Agent, AgentOptions
from multi_agent_orchestrator.types import ConversationMessage, ParticipantRole
from multi_agent_orchestrator.utils import Logger


@dataclass
class LexBotAgentOptions(AgentOptions):
    bot_id: Optional[str] = None
    bot_alias_id: Optional[str] = None
    locale_id: Optional[str] = None
    client: Optional[Any] = None


class LexBotAgent(Agent):
    def __init__(self, options: LexBotAgentOptions):
        super().__init__(options)
        if options.region is None:
            self.region = os.environ.get("AWS_REGION", "us-east-1")
        else:
            self.region = options.region

        # Reuse a caller-supplied client when available; otherwise create one.
        if options.client:
            self.lex_client = options.client
        else:
            self.lex_client = boto3.client('lexv2-runtime', region_name=self.region)

        self.bot_id = options.bot_id
        self.bot_alias_id = options.bot_alias_id
        self.locale_id = options.locale_id

        if not all([self.bot_id, self.bot_alias_id, self.locale_id]):
            raise ValueError("bot_id, bot_alias_id, and locale_id are required for LexBotAgent")

    async def process_request(self, input_text: str, user_id: str, session_id: str,
                              chat_history: List[ConversationMessage],
                              additional_params: Optional[Dict[str, str]] = None) -> ConversationMessage:
        try:
            params = {
                'botId': self.bot_id,
                'botAliasId': self.bot_alias_id,
                'localeId': self.locale_id,
                'sessionId': session_id,
                'text': input_text,
                'sessionState': {}  # You might want to maintain session state if needed
            }

            response = self.lex_client.recognize_text(**params)

            concatenated_content = ' '.join(
                message.get('content', '') for message in response.get('messages', [])
                if message.get('content')
            )

            return ConversationMessage(
                role=ParticipantRole.ASSISTANT.value,
                content=[{"text": concatenated_content or "No response from Lex bot."}]
            )

        except (BotoCoreError, ClientError) as error:
            Logger.error(f"Error processing request: {str(error)}")
            raise error
```
@karanakatle , We haven't heard from you in a while. Can we close this?
Closing this issue due to inactivity. Please feel free to reopen if you'd like to continue the discussion.