(bedrock): New alias created every deployment

Open jkhask opened this issue 10 months ago • 6 comments

Describe the bug

When using the CDK constructs to deploy an alias, the existing alias is deleted and a new one is created on every deployment. This means anything downstream that needs the alias ID also needs to be updated.

Expected Behavior

The alias ID remains the same and the new agent version is associated with it.

Current Behavior

The old alias is completely deleted and a new one with a different ID is created and associated with the latest agent version.

Reproduction Steps

  1. Deploy an agent and an alias for that agent. Note the alias ID.
  2. Redeploy the stack. See that the alias ID has changed.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.178.2 (build 89c49cc)

Framework Version

0.1.293

Node.js Version

v22.9.0

OS

linux

Language

TypeScript

Language Version

No response

Region experiencing the issue

us-east-1

Code modification

no

Other information

No response

Service quota

  • [x] I have reviewed the service quotas for this construct

jkhask avatar Feb 18 '25 16:02 jkhask

Hi @jkhask, thank you for reporting this issue. I am able to reproduce this behavior.

I used the following code:

const agent = new bedrock.Agent(this, 'Agent', {
  foundationModel: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V1_0,
  instruction: 'You are a helpful and friendly agent that answers questions about literature.',
  userInputEnabled: true,
  shouldPrepareAgent: true,
});

const agentAlias = new bedrock.AgentAlias(this, 'myalias', {
  agent: agent,
  description: 'alias for my agent',
});

new cdk.CfnOutput(this, 'AliasId', { value: agentAlias.aliasId });

If the alias name property is not provided, the construct creates the name as follows:

const hash = md5hash(props.agent.agentId + props.agentVersion + props.agent.lastUpdated);
this.aliasName = `latest-${hash}`;

When deploying an agent with an alias, the new agent is prepared and 'lastUpdated' is updated, which produces a new name for the alias, which in turn generates a new ID.
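To make the effect concrete, here is a minimal sketch of the default-name logic quoted above, using plain strings in place of the construct's props. `md5hash` is re-implemented here with Node's `crypto` module as a stand-in, and `defaultAliasName`, the agent ID and the timestamps are made up for illustration; none of this is the construct's actual code.

```typescript
import { createHash } from 'crypto';

// Stand-in for the construct's md5hash helper (assumption: a plain hex MD5).
const md5hash = (input: string): string =>
  createHash('md5').update(input).digest('hex');

// Hypothetical helper mirroring the quoted naming logic:
// the name depends on agentId, agentVersion and lastUpdated.
const defaultAliasName = (
  agentId: string,
  agentVersion: string | undefined,
  lastUpdated: string,
): string => `latest-${md5hash(`${agentId}${agentVersion}${lastUpdated}`)}`;

// Same agent, two deployments: only lastUpdated differs.
const firstDeploy = defaultAliasName('AGENT123', undefined, '2025-02-18T16:02:00Z');
const secondDeploy = defaultAliasName('AGENT123', undefined, '2025-02-18T22:02:00Z');

console.log(firstDeploy === secondDeploy); // false
```

Because the alias name changes, CloudFormation has to replace the alias resource, which is why the alias ID differs after each deployment.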

I think the lastUpdated prop is also used incorrectly in the id of the agent alias construct (see here): if you provide a name, deploy, then re-deploy after some time, the lastUpdated value is different, which changes the id of the construct and leads to a deployment failure.

krokoko avatar Feb 18 '25 22:02 krokoko

I am experiencing the same issue, would be great if this can be fixed!

maxritter avatar Mar 05 '25 09:03 maxritter

Same here. This is particularly frustrating given it used to work before the recent breaking changes to Agent and AgentAlias constructs.

le-dudu avatar Mar 07 '25 04:03 le-dudu

I have now ended up removing the alias from CDK and running a script in my pipeline that automates this process. In case anybody else needs it, here it is:

#!/bin/bash

# This script updates all agent aliases to point to their latest versions

# Get the environment from parameter or use 'poc' as default
ENV=${1:-poc}
REGION="us-east-1"

# Agent configurations - add new agents here in the format <AGENT_ID>|<AGENT_ALIAS_ID>|<ALIAS_NAME>
declare -A AGENTS=(
    ["AGENT_1"]="<AGENT_1_ID>|<AGENT_1_ALIAS_ID>|<ALIAS_NAME_1>"
    ["AGENT_2"]="<AGENT_2_ID>|<AGENT_2_ALIAS_ID>|<ALIAS_NAME_2>"
    ["AGENT_3"]="<AGENT_3_ID>|<AGENT_3_ALIAS_ID>|<ALIAS_NAME_3>"
)

echo "Updating agent aliases for environment: $ENV in region: $REGION..."

create_agent_version() {
    local agent_id=$1
    local agent_name=$2

    echo "Creating new version for $agent_name..." >&2

    versions_json=$(aws bedrock-agent list-agent-versions \
        --agent-id "$agent_id" \
        --region "$REGION" \
        --output json)

    highest_version=$(echo "$versions_json" | grep -o '"agentVersion": "[0-9]*"' | grep -o '[0-9]*' | sort -nr | head -1)
    if [ -z "$highest_version" ]; then
        highest_version=0
    fi
    next_version=$((highest_version + 1))
    echo "Next version will be: $next_version" >&2

    temp_alias_name="temp-alias-$(date +%s)"

    echo "Creating temporary alias $temp_alias_name to create version $next_version..." >&2
    response=$(aws bedrock-agent create-agent-alias \
        --agent-id "$agent_id" \
        --agent-alias-name "$temp_alias_name" \
        --region "$REGION" \
        --output json 2>&1)

    if [ $? -ne 0 ]; then
        echo "Error creating new version for $agent_name: $response" >&2
        return 1
    fi

    version="$next_version"

    echo "Successfully created version $version for $agent_name" >&2

    alias_id=$(echo "$response" | grep -o '"agentAliasId": "[^"]*"' | cut -d'"' -f4)
    if [ -n "$alias_id" ]; then
        echo "Cleaning up temporary alias $alias_id..." >&2
        aws bedrock-agent delete-agent-alias \
            --agent-id "$agent_id" \
            --agent-alias-id "$alias_id" \
            --region "$REGION" \
            --output json >/dev/null 2>&1
    else
        echo "Warning: Could not extract alias ID for cleanup, but continuing anyway" >&2
    fi

    echo "$version"
}

update_agent_alias() {
    local agent_id=$1
    local agent_alias_id=$2
    local agent_alias_name=$3
    local version=$4
    local agent_name=$5

    echo "Updating $agent_name alias..."
    aws bedrock-agent update-agent-alias \
        --agent-id "$agent_id" \
        --agent-alias-id "$agent_alias_id" \
        --agent-alias-name "$agent_alias_name" \
        --routing-configuration "[{\"agentVersion\":\"$version\"}]" \
        --region "$REGION" \
        --output json >/dev/null

    if [ $? -ne 0 ]; then
        echo "Failed to update $agent_name alias"
        return 1
    fi
    return 0
}

declare -A VERSIONS=()
for agent_name in "${!AGENTS[@]}"; do
    IFS="|" read -r agent_id agent_alias_id agent_alias_name <<<"${AGENTS[$agent_name]}"

    version=$(create_agent_version "$agent_id" "$agent_name")
    if [ -z "$version" ]; then
        echo "Failed to create version for $agent_name. Exiting."
        exit 1
    fi

    VERSIONS["$agent_name"]="$version"
    echo "New $agent_name version: $version"
done

for agent_name in "${!AGENTS[@]}"; do
    IFS="|" read -r agent_id agent_alias_id agent_alias_name <<<"${AGENTS[$agent_name]}"
    version="${VERSIONS[$agent_name]}"

    if ! update_agent_alias "$agent_id" "$agent_alias_id" "$agent_alias_name" "$version" "$agent_name"; then
        exit 1
    fi
done

echo "All agent aliases updated successfully!"
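For anyone porting this to a TypeScript pipeline step, the "next version" calculation from the script could be sketched roughly as below. The `AgentVersionSummary` shape is an assumption that only models the `agentVersion` field the script greps for, and `nextAgentVersion` is a hypothetical helper name; fetching the list from the API is left out.

```typescript
// Assumed shape: only the agentVersion field used by the script above.
interface AgentVersionSummary {
  agentVersion: string;
}

// Hypothetical helper: highest numeric version + 1, or 1 if none exist yet.
const nextAgentVersion = (summaries: AgentVersionSummary[]): number => {
  const highest = summaries
    .map((s) => s.agentVersion)
    .filter((v) => /^[0-9]+$/.test(v)) // ignore non-numeric entries such as "DRAFT"
    .map((v) => parseInt(v, 10))
    .reduce((max, v) => Math.max(max, v), 0);
  return highest + 1;
};

console.log(
  nextAgentVersion([
    { agentVersion: 'DRAFT' },
    { agentVersion: '1' },
    { agentVersion: '2' },
  ]),
); // 3
```

Like the grep in the script, this only considers purely numeric versions and falls back to 1 when none are found.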

maxritter avatar Mar 09 '25 16:03 maxritter

There seems to be an issue in the hash calculation since it uses values which are potentially tokens:

const hash = md5hash(props.agent.agentId + props.agentVersion + props.agent.lastUpdated);

Printing out these values, e.g. when running synth / deploy,

console.log(props.agent.agentId, props.agentVersion, props.agent.lastUpdated);

outputs

${Token[TOKEN.1166]} undefined ${Token[TOKEN.1173]}

If these values are used for the hash calculation, the hash won't be stable, since token IDs change. I noticed this while doing snapshot tests: the name / id changed on each test run even though the implementation stayed the same.

To fix this, I think the hash should be computed over the concrete property values that define an agent. Core aws-cdk-lib has plenty of examples of this kind of hashing, e.g. https://github.com/aws/aws-cdk/blob/main/packages/aws-cdk-lib/aws-apigateway/lib/deployment.ts#L187-L188. A similar approach for Agent may get quite complex because of the nested configuration, but in any case most reliable hashes work by resolving the values in the context of the Stack and calculating the hash over the resolved values.
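As a rough illustration of hashing concrete values, here is a sketch that hashes a plain object standing in for already-resolved agent properties. `canonicalize`, `stableConfigHash` and the sample config are hypothetical, not part of aws-cdk-lib or the constructs library. Keys are sorted before hashing so that property order doesn't change the result.

```typescript
import { createHash } from 'crypto';

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so property order does not affect the output.
const canonicalize = (value: Json): string => {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(',')}]`;
  }
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(',')}}`;
};

// Hypothetical helper: MD5 over the canonical form of the resolved config.
const stableConfigHash = (config: Json): string =>
  createHash('md5').update(canonicalize(config)).digest('hex');

// Stand-in for resolved agent properties (values are made up).
const configA = { instruction: 'Answer questions about literature.', userInputEnabled: true };
const configB = { userInputEnabled: true, instruction: 'Answer questions about literature.' };

console.log(stableConfigHash(configA) === stableConfigHash(configB)); // true
```

With this kind of hash the alias name only changes when one of the hashed values changes, not on every synth.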

I ended up fixing this in my project by using the L1 construct and calculating the hash over the whole resolved CloudFormation Agent definition (below). However, this may include properties that are irrelevant to the alias, and conversely it misses any properties outside the Agent that should affect the hash.

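// utils.ts
// Import paths assumed from the public packages; adjust to your setup.
import { createHash } from 'crypto'
import { Stack } from 'aws-cdk-lib'
import { CfnAgent } from 'aws-cdk-lib/aws-bedrock'
import { bedrock } from '@cdklabs/generative-ai-cdk-constructs'

// Hashes the fully resolved CloudFormation definition of the agent's L1 resource.
export const getAgentHash = (agent: bedrock.Agent): string => {
  const resource = agent.node.findAll().find((construct): construct is CfnAgent => construct instanceof CfnAgent)
  if (!resource) {
    throw new Error('Could not find L1 Bedrock Agent construct')
  }
  const rendered = JSON.stringify(Stack.of(agent).resolve(resource._toCloudFormation()))
  return createHash('md5').update(rendered).digest('hex')
}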

// AgentStack.ts
//...
const agentHash = getAgentHash(agent)

const alias = new CfnAgentAlias(this, 'AgentAlias', {
  agentAliasName: `latest-${agentHash}`,
  agentId: agent.agentId,
})
//...

Happy to make a PR if this is something that could be considered as a solution.

Unrelated to this specific issue: including the hash in the logical ID seemed a bit useless based on quick tests. Even without it, the alias was updated to the new version successfully. Not sure if there's some other use case for it.

leevilehtonen avatar Mar 18 '25 19:03 leevilehtonen

I wanted to add some more info on this. When leveraging AgentCollaborations this becomes very problematic, since the deployment will recreate the agent alias. However, if the old agent alias is still associated with an agent collaboration, it cannot be deleted. I'm seeing this reproduced 100% of the time when the alias is used in a collaboration.

shawnaws avatar Mar 21 '25 19:03 shawnaws