Bug: Significant Cold Start Initialization Delay in Orchestration Framework

Open karanakatle opened this issue 1 year ago • 8 comments

Expected Behaviour

The orchestration framework should initialize with minimal delay during cold starts, ensuring consistent performance across all Lambda invocations, including the first execution.

Current Behaviour

When a new Lambda instance is created, the orchestration framework incurs a significant initialization delay of 5-7 seconds. This delay occurs before any classification or other processing steps, resulting in a total latency of 7-9 seconds for the first invocation.

However, subsequent invocations on the same Lambda instance execute within 1-2 seconds, indicating the issue is specific to cold start initialization. This behavior impacts the overall performance and user experience during the first execution.

Note: Provisioned concurrency (set to 5) does keep five instances of the Lambda function warm, but the issue persists whenever execution shifts to a new instance outside the provisioned set, where the framework again spends 5-7 seconds initializing.

Code snippet

Logs for the initial invocation:

INIT_START Runtime Version: python:3.12.v38	Runtime Version ARN: arn:aws:lambda:us-east-1::runtime:7515e00d6763496e7a147ffa395ef5b0f0c1ffd6064130abb5ecde5a6d630e86
START RequestId: 29a54971-892b-4748-affc-2e4e9a6139db Version: 96
[INFO]	2024-12-02T16:55:56.989Z	29a54971-892b-4748-affc-2e4e9a6139db	Received event in Lambda
[INFO]	2024-12-02T16:55:56.990Z	29a54971-892b-4748-affc-2e4e9a6139db	LLM Classification
[INFO]	2024-12-02T16:55:57.513Z	29a54971-892b-4748-affc-2e4e9a6139db	Found credentials in environment variables.
[INFO]	2024-12-02T16:56:04.180Z	29a54971-892b-4748-affc-2e4e9a6139db	
** CLASSIFIED INTENT **


Logs for subsequent invocations on the same Lambda instance:

[INFO]	2024-12-02T16:57:00.674Z	b6b59439-3a0c-48b5-b763-1e0a2c30ca51	Received event: 
[INFO]	2024-12-02T16:57:00.674Z	b6b59439-3a0c-48b5-b763-1e0a2c30ca51	LLM Classification
[INFO]	2024-12-02T16:57:01.311Z	b6b59439-3a0c-48b5-b763-1e0a2c30ca51	
** CLASSIFIED INTENT **

Possible Solution

No response

Steps to Reproduce

  1. Deploy the orchestration framework on AWS Lambda.
  2. Trigger a request that leads to the creation of a new Lambda instance.
  3. Measure the total latency for the first invocation (observe 7-9 seconds).
  4. Measure latency for subsequent invocations on the same instance (observe 1-2 seconds).
  5. Test with provisioned concurrency and observe behavior when execution moves to new instances.
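
One way to separate cold from warm invocations in steps 3 and 4 is to have the handler report it directly. The following is a minimal sketch, not the code from this report: the handler name, the returned fields, and the placeholder comments are assumptions for illustration. It relies only on the fact that module-level code runs once per Lambda execution environment.

```python
import time

# Module scope runs once per execution environment, i.e. on a cold start.
_init_started = time.perf_counter()
# ... heavy imports and orchestrator/client construction would go here ...
_INIT_SECONDS = time.perf_counter() - _init_started
_is_cold = True


def lambda_handler(event, context):
    """Report whether this invocation ran on a freshly created instance."""
    global _is_cold
    cold, _is_cold = _is_cold, False

    started = time.perf_counter()
    # ... classification and agent dispatch would go here ...
    handler_seconds = time.perf_counter() - started

    return {
        "cold_start": cold,
        "init_seconds": round(_INIT_SECONDS, 3) if cold else 0.0,
        "handler_seconds": round(handler_seconds, 3),
    }
```

Logging these fields on every invocation makes it easier to tell how much of the first-request latency comes from module initialization and how much from work done lazily inside the handler.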

karanakatle avatar Dec 03 '24 03:12 karanakatle

Hi @karanakatle, thanks for submitting this issue. There are a couple of things we need to understand first:

  • Can you share the code you used for this Lambda?
  • Did you know that you can now install multi-agent-orchestrator with only the minimum required dependencies? For instance, if you do not use Anthropic or OpenAI, you can simply install it with: pip install multi-agent-orchestrator. This is available from version 0.1.1 and will save you a bit of init time.
  • You can also look at SnapStart for Python, which was released last week.

Let us know if you need further assistance. Regards, Anthony

brnaba-aws avatar Dec 03 '24 15:12 brnaba-aws

@karanakatle any updates on this?

brnaba-aws avatar Dec 09 '24 08:12 brnaba-aws

@karanakatle , I'm about to close this since I didn't hear anything from you. Let me know if you need further assistance.

brnaba-aws avatar Dec 12 '24 17:12 brnaba-aws

Hello @brnaba-aws, sorry for the delayed response. I tried enabling SnapStart and installing only the required dependencies, but the issue persists. After a detailed analysis, this is what I found:

  1. The library import takes 2 seconds every time.
  2. We use a custom classifier backed by our fine-tuned model; it returns a response in 0.6 to 1 second.
  3. If no agent is selected, the request goes to the fallback agent, which is a Bedrock agent and takes another 2-3 seconds.

Is there any way to reduce the time spent in points 1 and 3?

Code is attached for reference: multi_agent_orchestrator.zip

CloudWatch logs are also attached for analysis: Bot Orchestration Logs.txt

karanashokraokatle avatar Dec 13 '24 09:12 karanashokraokatle

For:

  1. Can you provide logs for more than a single invocation, so we can see whether this time is always there or only on the very first invocation?
  2. You can't really improve that, unless you either disable the default agent by setting USE_DEFAULT_AGENT_IF_NONE_IDENTIFIED=False, or switch the default agent to a faster model such as Claude 3 Haiku. I see that your bedrock_llm_agent is using Claude 3.5 Sonnet, which is slow.
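
For the question in point 1, a small timing helper can show whether the import cost recurs. This is only a sketch: `timed_import` is a hypothetical helper, and `json` stands in for the package under investigation. Within one Python process, a second import of the same module is served from `sys.modules`, so a cost that shows up on every invocation would point at something other than the import itself.

```python
import importlib
import sys
import time


def timed_import(module_name):
    """Import a module and return (module, seconds, was_cached)."""
    was_cached = module_name in sys.modules
    started = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - started, was_cached


if __name__ == "__main__":
    # "json" is a stand-in; substitute the package being measured.
    for attempt in (1, 2):
        _, seconds, cached = timed_import("json")
        print(f"attempt {attempt}: {seconds:.6f}s (already cached: {cached})")
```

For a per-module breakdown, running the interpreter with `python -X importtime` prints the import time of every module it loads.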

brnaba-aws avatar Dec 13 '24 10:12 brnaba-aws

@karanakatle, one more thing. I don't think you are using the latest version of the Python package, since I don't see this try/except with Anthropic: new import method

One thing I just noticed is that each Lex bot instantiates its own boto3.client('lexv2-runtime', region_name=self.region). We haven't provided a way to reuse a client passed as a parameter, which I believe would help. I'll create an issue and provide you with a file to test, OK?
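
The reuse idea can be sketched independently of the library: memoize client construction so that every agent asking for the same service and region gets the same object. The helper name `make_client_cache` and the fake factory below are assumptions for illustration only; in real code the factory would wrap `boto3.client(service, region_name=region)`.

```python
from functools import lru_cache
from typing import Any, Callable


def make_client_cache(factory: Callable[[str, str], Any]) -> Callable[[str, str], Any]:
    """Wrap a client factory so each (service, region) pair is built once."""

    @lru_cache(maxsize=None)
    def get_client(service: str, region: str) -> Any:
        return factory(service, region)

    return get_client


# Stand-in factory that records how often a client is actually constructed.
construction_calls = []


def fake_factory(service: str, region: str) -> object:
    construction_calls.append((service, region))
    return object()


get_client = make_client_cache(fake_factory)
client_a = get_client("lexv2-runtime", "us-east-1")
client_b = get_client("lexv2-runtime", "us-east-1")
```

Both lookups return the same object and the factory runs once, so the per-agent construction cost is paid a single time per execution environment.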

brnaba-aws avatar Dec 13 '24 10:12 brnaba-aws

Sure @brnaba-aws , thnx for the help.

karanashokraokatle avatar Dec 13 '24 11:12 karanashokraokatle

@karanashokraokatle,

Could you please try this LexBotAgent? It accepts a client as an option, so you can create a single client and reuse it across all your Lex bots.

example:

lex_client = boto3.client('lexv2-runtime', region_name=os.getenv('AWS_REGION','us-east-1'))

my_agent = LexBotAgent(LexBotAgentOptions(client=lex_client, bot_id = '', bot_alias_id='', locale_id=''))
my_agent_2 = LexBotAgent(LexBotAgentOptions(client=lex_client, bot_id = '', bot_alias_id='', locale_id=''))

import os
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

import boto3
from botocore.exceptions import BotoCoreError, ClientError
from multi_agent_orchestrator.agents import Agent, AgentOptions
from multi_agent_orchestrator.types import ConversationMessage, ParticipantRole
from multi_agent_orchestrator.utils import Logger

@dataclass
class LexBotAgentOptions(AgentOptions):
    bot_id: Optional[str] = None
    bot_alias_id: Optional[str] = None
    locale_id: Optional[str] = None
    # Optional pre-built lexv2-runtime client to share across agents.
    client: Optional[Any] = None

class LexBotAgent(Agent):
    def __init__(self, options: LexBotAgentOptions):
        super().__init__(options)
        # Fall back to the AWS_REGION environment variable when no region is configured.
        if options.region is None:
            self.region = os.environ.get("AWS_REGION", 'us-east-1')
        else:
            self.region = options.region

        # Reuse a caller-supplied client if one is given; otherwise create one.
        if options.client:
            self.lex_client = options.client
        else:
            self.lex_client = boto3.client('lexv2-runtime', region_name=self.region)

        self.bot_id = options.bot_id
        self.bot_alias_id = options.bot_alias_id
        self.locale_id = options.locale_id

        if not all([self.bot_id, self.bot_alias_id, self.locale_id]):
            raise ValueError("bot_id, bot_alias_id, and locale_id are required for LexBotAgent")

    async def process_request(self, input_text: str, user_id: str, session_id: str,
                        chat_history: List[ConversationMessage],
                        additional_params: Optional[Dict[str, str]] = None) -> ConversationMessage:
        try:
            params = {
                'botId': self.bot_id,
                'botAliasId': self.bot_alias_id,
                'localeId': self.locale_id,
                'sessionId': session_id,
                'text': input_text,
                'sessionState': {}  # You might want to maintain session state if needed
            }

            response = self.lex_client.recognize_text(**params)

            concatenated_content = ' '.join(
                message.get('content', '') for message in response.get('messages', [])
                if message.get('content')
            )

            return ConversationMessage(
                role=ParticipantRole.ASSISTANT.value,
                content=[{"text": concatenated_content or "No response from Lex bot."}]
            )

        except (BotoCoreError, ClientError) as error:
            Logger.error(f"Error processing request: {error}")
            raise

brnaba-aws avatar Dec 13 '24 13:12 brnaba-aws

@karanakatle , We haven't heard from you in a while. Can we close this?

brnaba-aws avatar Jan 16 '25 09:01 brnaba-aws

Closing this issue due to inactivity. Please feel free to reopen if you'd like to continue the discussion.

cornelcroi avatar Jan 18 '25 15:01 cornelcroi