amplify-category-api

ai models support cross-region inference profile

Open zxkane opened this issue 1 year ago • 6 comments

Environment information

System:
  OS: Linux 6.8 Ubuntu 22.04.5 LTS 22.04.5 LTS (Jammy Jellyfish)
  CPU: (8) x64 Intel(R) Xeon(R) Platinum 8488C
  Memory: 23.05 GB / 30.82 GB
  Shell: /usr/bin/zsh
Binaries:
  Node: 20.18.0 - ~/.nvm/versions/node/v20.18.0/bin/node
  Yarn: 1.22.19 - ~/.linuxbrew/homebrew/bin/yarn
  npm: 10.8.2 - ~/.nvm/versions/node/v20.18.0/bin/npm
  pnpm: 9.6.0 - ~/.nvm/versions/node/v20.18.0/bin/pnpm
NPM Packages:
  @aws-amplify/auth-construct: 1.5.0
  @aws-amplify/backend: 1.8.0
  @aws-amplify/backend-auth: 1.4.1
  @aws-amplify/backend-cli: 1.4.2
  @aws-amplify/backend-data: 1.2.1
  @aws-amplify/backend-deployer: 1.1.9
  @aws-amplify/backend-function: 1.8.0
  @aws-amplify/backend-output-schemas: 1.4.0
  @aws-amplify/backend-output-storage: 1.1.3
  @aws-amplify/backend-secret: 1.1.5
  @aws-amplify/backend-storage: 1.2.3
  @aws-amplify/cli-core: 1.2.0
  @aws-amplify/client-config: 1.5.2
  @aws-amplify/deployed-backend-client: 1.4.2
  @aws-amplify/form-generator: 1.0.3
  @aws-amplify/model-generator: 1.0.9
  @aws-amplify/platform-core: 1.2.2
  @aws-amplify/plugin-types: 1.5.0
  @aws-amplify/sandbox: 1.2.5
  @aws-amplify/schema-generator: 1.2.5
  aws-amplify: 6.10.2
  aws-cdk: 2.167.1
  aws-cdk-lib: 2.167.1
  typescript: 5.6.3
No AWS environment variables
No CDK environment variables

Describe the feature

Cross-region inference enhances the resilience of Bedrock invocation.

https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html

Use case

Invoke Bedrock LLMs that support cross-region inference.

zxkane avatar Dec 15 '24 07:12 zxkane

Hey 👋, thanks for raising this! I'm going to transfer this over to our API repository for better assistance 🙂. This is related to https://github.com/aws-amplify/docs/issues/8121#issuecomment-2494375015, which provides an example.

ykethan avatar Dec 16 '24 15:12 ykethan

Hey @zxkane, thanks for raising the issue! Regarding the cross-region inference profile, please take a look at the follow-up to a similar issue: AI kit does not support Cross-region inference #8121. It provides an example of how to implement this within an Amplify backend. Feel free to ask if you have more concerns. Thanks.

Siqi-Shan avatar Dec 16 '24 22:12 Siqi-Shan

Thanks for sharing the workaround. However, it's not trivial to grant additional IAM permissions to the role. It also requires additional logic to support environment-agnostic deployment in multiple regions.

So it would be useful as a built-in feature to mitigate the capacity limitation of those models.

zxkane avatar Dec 17 '24 02:12 zxkane

Hey @zxkane. Thanks for sharing your feedback and concerns. We will discuss and review them, and evaluate whether this should be categorized as a feature request. We will update this issue once the next step is confirmed.

Siqi-Shan avatar Dec 17 '24 04:12 Siqi-Shan

it's not trivial to grant additional IAM permissions to the role.

Agreed, it also requires figuring out which IAM permissions are needed in the Amazon Bedrock documentation. We want to offer a more seamless experience for setting up cross-region inference with AI kit.

Thanks for the feature request! We'll update this issue with any news.

atierian avatar Dec 19 '24 19:12 atierian

Below is a code snippet showing how to hack both generation and conversation to use cross-region inference. Note that the stack names and resource names are specific to my app.


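import { IFunction } from 'aws-cdk-lib/aws-lambda';
import { IRole, Policy, PolicyStatement } from 'aws-cdk-lib/aws-iam';

// Allows invoking the foundation model in any region the profile may route to,
// plus the cross-region inference profile in the current region and account.
function createBedrockPolicyStatement(currentRegion: string, accountId: string, modelId: string, crossRegionModel: string) {
  return new PolicyStatement({
    resources: [
      `arn:aws:bedrock:*::foundation-model/${modelId}`,
      `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegionModel}`,
    ],
    actions: ['bedrock:InvokeModel*'],
  });
}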

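// CROSS_REGION_INFERENCE, CUSTOM_MODEL_ID, getCurrentRegion and
// getCrossRegionModelId are defined elsewhere in my app (not shown here).
if (CROSS_REGION_INFERENCE && CUSTOM_MODEL_ID) {
  const currentRegion = getCurrentRegion(backend.stack);
  const crossRegionModel = getCrossRegionModelId(currentRegion, CUSTOM_MODEL_ID);

  // [chat conversation]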
  const chatStack = backend.data.resources.nestedStacks?.['ChatConversationDirectiveLambdaStack'];
  if (chatStack) {
    const conversationFunc = chatStack.node.findAll()
      .find(child => child.node.id === 'conversationHandlerFunction') as IFunction;

    if (conversationFunc) {
      conversationFunc.addToRolePolicy(
        createBedrockPolicyStatement(currentRegion, backend.stack.account, CUSTOM_MODEL_ID, crossRegionModel)
      );
    }
  }

  // [insights generation]
  const insightsStack = backend.data.resources.nestedStacks?.['GenerationBedrockDataSourceGenerateInsightsStack'];
  if (insightsStack) {
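    // tryFindChild returns undefined when the child is missing, whereas
    // findChild throws, which would make the check below unreachable.
    const dataSourceRole = insightsStack.node.tryFindChild('GenerationBedrockDataSourceGenerateInsightsIAMRole') as IRole | undefined;
    if (dataSourceRole) {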
      dataSourceRole.attachInlinePolicy(
        new Policy(insightsStack, 'CrossRegionInferencePolicy', {
          statements: [
            createBedrockPolicyStatement(currentRegion, backend.stack.account, CUSTOM_MODEL_ID, crossRegionModel)
          ],
        }),
      );
    }
  }
}
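The snippet above calls `getCrossRegionModelId` without showing it. A minimal sketch of what such a helper might look like follows, assuming the inference profile ID is the model ID prefixed with a geography code (`us`, `eu`, `apac`), as in the `apac.amazon.nova-micro-v1:0` style of ID. The function body and the region-to-geography table are hypothetical and not taken from the original app; check the Bedrock documentation for the profiles actually available in your region.

```typescript
// Hypothetical helper: not part of Amplify, the CDK, or the AWS SDK.
// Assumption: a region's geography can be derived from its name prefix.
const GEO_BY_REGION_PREFIX: ReadonlyArray<readonly [string, string]> = [
  ["us-", "us"],
  ["eu-", "eu"],
  ["ap-", "apac"],
];

function getCrossRegionModelId(region: string, modelId: string): string {
  // GovCloud regions also start with "us-", so reject them explicitly
  // rather than guess which profile applies there.
  if (region.startsWith("us-gov-")) {
    throw new Error(`No geography mapping defined for region "${region}"`);
  }
  const match = GEO_BY_REGION_PREFIX.find(([prefix]) => region.startsWith(prefix));
  if (!match) {
    throw new Error(`No geography mapping defined for region "${region}"`);
  }
  // Inference profile ID = "<geo>.<modelId>"
  return `${match[1]}.${modelId}`;
}
```

Failing loudly on an unmapped region keeps a deployment from silently granting access to a profile that does not exist there.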

I don't think this is easy for full-stack developers without strong CDK knowledge.

zxkane avatar Dec 20 '24 03:12 zxkane

Based on @zxkane's answer and the https://docs.amplify.aws/react/build-a-backend/functions/examples/dynamo-db-stream/ example, I have come up with a very simple approach. It would be nice if you could add something about this to the docs, @Siqi-Shan or @chrisbonifacio, as cross-region support isn't clearly documented and I had no idea what was going on.

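import { IFunction } from "aws-cdk-lib/aws-lambda";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";

const model = "amazon.nova-micro-v1:0"
// Must match model on aiModel resourcePath
const crossRegionModel = "apac.amazon.nova-micro-v1:0"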

const backend = defineBackend({
    auth,
    data,
    myDynamoDBFunction,
});

const conversationStack =
     backend.data.resources.nestedStacks?.[
     "ChatConversationDirectiveLambdaStack"
];

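const conversationFunc = conversationStack?.node
    .findAll()
    .find(
      (child) => child.node.id === "conversationHandlerFunction"
    ) as IFunction | undefined;

if (!conversationFunc) {
    throw new Error("conversationHandlerFunction not found; check the nested stack name");
}

// Assumption: currentRegion and accountId were left undefined in this snippet;
// here they are read from the backend's root stack, as in the earlier comment.
const currentRegion = backend.stack.region;
const accountId = backend.stack.account;

conversationFunc.addToRolePolicy(
    new PolicyStatement({
          resources: [
            `arn:aws:bedrock:*::foundation-model/${model}`,
            `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegionModel}`,
          ],
          actions: ["bedrock:InvokeModel*"],
    }),
);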

I also made a helper function to make this easier in the future.

import { IFunction } from "aws-cdk-lib/aws-lambda";
import { CrossRegion, Model } from "../../types/model";
import { Stack } from "aws-cdk-lib";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";

type AddBedrockPolicyToFunctionProps = {
  stack: Stack;
  handlerFunctionId: string;
  model: Model;
  crossRegion: CrossRegion;
  accountId: string;
  currentRegion: string;
};

export const addCrossRegionBedrockPolicyToFunction = ({
  stack,
  handlerFunctionId,
  model,
  crossRegion,
  accountId,
  currentRegion,
}: AddBedrockPolicyToFunctionProps) => {
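  // findAll() walks the whole construct tree under the stack.
  const handlerFunc = stack.node
    .findAll()
    .find(
      (child) => child.node.id === handlerFunctionId
    ) as IFunction | undefined;

  if (!handlerFunc) {
    throw new Error(`Function "${handlerFunctionId}" not found in stack`);
  }

  // Grant access to the foundation model in any region and to the
  // cross-region inference profile in the current region and account.
  handlerFunc.addToRolePolicy(
    new PolicyStatement({
      resources: [
        `arn:aws:bedrock:*::foundation-model/${model}`,
        `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegion}.${model}`,
      ],
      actions: ["bedrock:InvokeModel*"],
    })
  );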
};
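The helper above interpolates two resource ARNs, and a typo in either one only shows up at invocation time as an access-denied error. One way to check the strings without deploying is to pull the ARN construction into a dependency-free function, roughly as sketched below. The name `buildCrossRegionBedrockArns` and its input type are hypothetical and only illustrate the idea; the ARN formats are copied from the snippets in this thread.

```typescript
// Hypothetical, dependency-free sketch for checking the ARN strings in isolation.
type BedrockArnInput = {
  region: string;      // region where the inference profile is invoked
  accountId: string;   // AWS account that owns the inference profile ARN
  model: string;       // foundation model ID, e.g. "amazon.nova-micro-v1:0"
  crossRegion: string; // geography prefix of the profile, e.g. "apac"
};

function buildCrossRegionBedrockArns(input: BedrockArnInput): string[] {
  const { region, accountId, model, crossRegion } = input;
  return [
    // Wildcard region: the profile may route the request to another region.
    // The account field is empty because foundation models are not account-owned.
    `arn:aws:bedrock:*::foundation-model/${model}`,
    // The inference profile itself lives in the caller's region and account.
    `arn:aws:bedrock:${region}:${accountId}:inference-profile/${crossRegion}.${model}`,
  ];
}

const exampleArns = buildCrossRegionBedrockArns({
  region: "ap-southeast-2",
  accountId: "123456789012", // placeholder account ID
  model: "amazon.nova-micro-v1:0",
  crossRegion: "apac",
});
console.log(exampleArns.join("\n"));
```

The returned array could then be passed as the `resources` of the `PolicyStatement`, keeping the CDK wiring and the string building separately testable.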

Alexanderdunlop avatar Jun 28 '25 12:06 Alexanderdunlop