ai models support cross-region inference profile
Environment information
System:
OS: Linux 6.8 Ubuntu 22.04.5 LTS 22.04.5 LTS (Jammy Jellyfish)
CPU: (8) x64 Intel(R) Xeon(R) Platinum 8488C
Memory: 23.05 GB / 30.82 GB
Shell: /usr/bin/zsh
Binaries:
Node: 20.18.0 - ~/.nvm/versions/node/v20.18.0/bin/node
Yarn: 1.22.19 - ~/.linuxbrew/homebrew/bin/yarn
npm: 10.8.2 - ~/.nvm/versions/node/v20.18.0/bin/npm
pnpm: 9.6.0 - ~/.nvm/versions/node/v20.18.0/bin/pnpm
NPM Packages:
@aws-amplify/auth-construct: 1.5.0
@aws-amplify/backend: 1.8.0
@aws-amplify/backend-auth: 1.4.1
@aws-amplify/backend-cli: 1.4.2
@aws-amplify/backend-data: 1.2.1
@aws-amplify/backend-deployer: 1.1.9
@aws-amplify/backend-function: 1.8.0
@aws-amplify/backend-output-schemas: 1.4.0
@aws-amplify/backend-output-storage: 1.1.3
@aws-amplify/backend-secret: 1.1.5
@aws-amplify/backend-storage: 1.2.3
@aws-amplify/cli-core: 1.2.0
@aws-amplify/client-config: 1.5.2
@aws-amplify/deployed-backend-client: 1.4.2
@aws-amplify/form-generator: 1.0.3
@aws-amplify/model-generator: 1.0.9
@aws-amplify/platform-core: 1.2.2
@aws-amplify/plugin-types: 1.5.0
@aws-amplify/sandbox: 1.2.5
@aws-amplify/schema-generator: 1.2.5
aws-amplify: 6.10.2
aws-cdk: 2.167.1
aws-cdk-lib: 2.167.1
typescript: 5.6.3
No AWS environment variables
No CDK environment variables
Describe the feature
Cross-region inference enhances the resilience of Bedrock invocation.
https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html
Use case
Invoke the Bedrock LLM models that support cross-region inference.
Hey,👋 thanks for raising this! I'm going to transfer this over to our API repository for better assistance 🙂 related to https://github.com/aws-amplify/docs/issues/8121#issuecomment-2494375015 providing an example
Hey @zxkane, thanks for raising the issue! Regarding the cross-region inference profile, please take a look at the follow-up on a similar issue: AI kit does not support Cross-region inference #8121. It provides an example of how to implement this within an Amplify backend. Feel free to ask if you have more concerns. Thanks.
Thanks for sharing the workaround, but it's not trivial to grant additional IAM permissions to the role. It also requires additional logic to support environment-agnostic deployment in multiple regions.
So it would be useful as a built-in feature to mitigate the capacity limitation of those models.
Hey @zxkane, thanks for sharing your feedback and concerns. We will discuss and review them, and evaluate whether this should be categorized as a feature request. We will update this issue once the next step is confirmed.
> it's not trivial to grant additional IAM permissions to the role.
Agreed, it also requires figuring out which IAM permissions are needed in the Amazon Bedrock documentation. We want to offer a more seamless experience for setting up cross-region inference with AI kit.
Thanks for the feature request! We'll update this issue with any news.
Below is a code snippet showing how to hack both generation and conversation to work with cross-region inference. Note that the stack names and resource names are specific to my app.
import { IFunction } from 'aws-cdk-lib/aws-lambda';
import { IRole, Policy, PolicyStatement } from 'aws-cdk-lib/aws-iam';

// CROSS_REGION_INFERENCE, CUSTOM_MODEL_ID, getCurrentRegion and getCrossRegionModelId
// are defined elsewhere in my app and are not shown here.

// Grants invoke access to the foundation model in any region and to the
// inference profile in the current region.
function createBedrockPolicyStatement(currentRegion: string, accountId: string, modelId: string, crossRegionModel: string) {
  return new PolicyStatement({
    resources: [
      `arn:aws:bedrock:*::foundation-model/${modelId}`,
      `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegionModel}`,
    ],
    actions: ['bedrock:InvokeModel*'],
  });
}

if (CROSS_REGION_INFERENCE && CUSTOM_MODEL_ID) {
  const currentRegion = getCurrentRegion(backend.stack);
  const crossRegionModel = getCrossRegionModelId(currentRegion, CUSTOM_MODEL_ID);

  // [chat conversation]
  const chatStack = backend.data.resources.nestedStacks?.['ChatConversationDirectiveLambdaStack'];
  if (chatStack) {
    const conversationFunc = chatStack.node.findAll()
      .find(child => child.node.id === 'conversationHandlerFunction') as IFunction;
    if (conversationFunc) {
      conversationFunc.addToRolePolicy(
        createBedrockPolicyStatement(currentRegion, backend.stack.account, CUSTOM_MODEL_ID, crossRegionModel)
      );
    }
  }

  // [insights generation]
  const insightsStack = backend.data.resources.nestedStacks?.['GenerationBedrockDataSourceGenerateInsightsStack'];
  if (insightsStack) {
    const dataSourceRole = insightsStack.node.findChild('GenerationBedrockDataSourceGenerateInsightsIAMRole') as IRole;
    if (dataSourceRole) {
      dataSourceRole.attachInlinePolicy(
        new Policy(insightsStack, 'CrossRegionInferencePolicy', {
          statements: [
            createBedrockPolicyStatement(currentRegion, backend.stack.account, CUSTOM_MODEL_ID, crossRegionModel)
          ],
        }),
      );
    }
  }
}
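For anyone trying to reproduce this: getCrossRegionModelId is not shown in the snippet above, so here is a rough sketch of what such a helper might look like. It is a guess, not the original implementation. The region-to-prefix table is an assumption based on the "us.", "eu." and "apac." inference profile prefixes; it is not exhaustive, not every model has a profile in every geography, and GovCloud is deliberately left out. It also assumes the region is a concrete string at synth time rather than an unresolved CDK token.

```typescript
// Hypothetical sketch of a getCrossRegionModelId helper (not the original code).
// Maps an AWS region to the geography prefix used in Bedrock cross-region
// inference profile IDs, e.g. "apac.amazon.nova-micro-v1:0".
// Assumption: only these three geographies are handled.
const GEO_BY_REGION_PREFIX: Record<string, string> = {
  "us-": "us",
  "eu-": "eu",
  "ap-": "apac",
};

function getCrossRegionModelId(region: string, modelId: string): string {
  // GovCloud regions also start with "us-" but are not covered by this sketch.
  if (region.startsWith("us-gov-")) {
    throw new Error(`Region "${region}" is not handled by this sketch`);
  }
  const match = Object.entries(GEO_BY_REGION_PREFIX).find(([regionPrefix]) =>
    region.startsWith(regionPrefix)
  );
  if (!match) {
    throw new Error(`No cross-region inference prefix known for region "${region}"`);
  }
  const [, geo] = match;
  return `${geo}.${modelId}`;
}
```

Failing loudly on an unknown region seems safer than falling back to the plain model ID, since a wrong profile ID only shows up later as an access error at invoke time.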
I don't think it's an easy thing for the full-stack developers without strong CDK knowledge.
Based on @zxkane's answer and the https://docs.amplify.aws/react/build-a-backend/functions/examples/dynamo-db-stream/ example, I came up with a very simple approach. It would be nice if you could add something along these lines to the docs, @Siqi-Shan or @chrisbonifacio, as the cross-region support isn't clearly documented and I had no idea what was going on.
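import { IFunction } from "aws-cdk-lib/aws-lambda";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";

const model = "amazon.nova-micro-v1:0";
// Must match the model on the aiModel resourcePath
const crossRegionModel = "apac.amazon.nova-micro-v1:0";

const backend = defineBackend({
  auth,
  data,
  myDynamoDBFunction,
});

// Assumption: the region and account for the ARN come from the backend's root stack.
const currentRegion = backend.stack.region;
const accountId = backend.stack.account;

const conversationStack =
  backend.data.resources.nestedStacks?.[
    "ChatConversationDirectiveLambdaStack"
  ];

// Find the Lambda function that handles the conversation route.
const conversationFunc = conversationStack?.node
  .findAll()
  .find(
    (child) => child.node.id === "conversationHandlerFunction"
  ) as IFunction | undefined;

// Allow the handler to invoke the model through the cross-region inference profile.
conversationFunc?.addToRolePolicy(
  new PolicyStatement({
    resources: [
      `arn:aws:bedrock:*::foundation-model/${model}`,
      `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegionModel}`,
    ],
    actions: ["bedrock:InvokeModel*"],
  }),
);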
I also made a helper function to make this easier to reuse in the future.
import { IFunction } from "aws-cdk-lib/aws-lambda";
import { CrossRegion, Model } from "../../types/model";
import { Stack } from "aws-cdk-lib";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";

type AddBedrockPolicyToFunctionProps = {
  stack: Stack;
  handlerFunctionId: string;
  model: Model;
  crossRegion: CrossRegion;
  accountId: string;
  currentRegion: string;
};

export const addCrossRegionBedrockPolicyToFunction = ({
  stack,
  handlerFunctionId,
  model,
  crossRegion,
  accountId,
  currentRegion,
}: AddBedrockPolicyToFunctionProps) => {
  // Look up the handler function by its construct ID anywhere in the stack.
  const handlerFunc = stack.node
    .findAll()
    .find(
      (child) => child.node.id === handlerFunctionId
    ) as IFunction | undefined;

  if (!handlerFunc) {
    throw new Error(`Function "${handlerFunctionId}" not found in stack`);
  }

  // Grant access to the foundation model and to the cross-region inference profile.
  handlerFunc.addToRolePolicy(
    new PolicyStatement({
      resources: [
        `arn:aws:bedrock:*::foundation-model/${model}`,
        `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegion}.${model}`,
      ],
      actions: ["bedrock:InvokeModel*"],
    })
  );
};
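If you want to sanity-check the ARNs without synthesizing a stack, one option is to pull the ARN construction into a small pure function. The sketch below is only an illustration: buildCrossRegionBedrockResources and its input type are hypothetical names, and plain strings stand in for the Model and CrossRegion types from the helper above.

```typescript
// Hypothetical helper: builds the two resource ARNs used in the policy above.
// Plain strings are used instead of the app-specific Model/CrossRegion types.
type BedrockResourceInput = {
  currentRegion: string; // e.g. "ap-southeast-2"
  accountId: string;     // 12-digit AWS account ID
  model: string;         // e.g. "amazon.nova-micro-v1:0"
  crossRegion: string;   // geography prefix, e.g. "apac"
};

function buildCrossRegionBedrockResources({
  currentRegion,
  accountId,
  model,
  crossRegion,
}: BedrockResourceInput): string[] {
  return [
    // Foundation model in any region the profile may route to.
    `arn:aws:bedrock:*::foundation-model/${model}`,
    // Inference profile owned by this account in the current region.
    `arn:aws:bedrock:${currentRegion}:${accountId}:inference-profile/${crossRegion}.${model}`,
  ];
}
```

The returned array can be passed directly as the resources of the PolicyStatement, which keeps the CDK lookup and the string building separate.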