AI kit does not support Cross-region inference
Environment information
System:
  OS: macOS 14.6.1
  CPU: (16) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
  Memory: 94.01 MB / 16.00 GB
  Shell: /bin/zsh
Binaries:
  Node: 22.9.0 - /usr/local/bin/node
  Yarn: undefined - undefined
  npm: 10.8.3 - /usr/local/bin/npm
  pnpm: undefined - undefined
NPM Packages:
  @aws-amplify/auth-construct: 1.5.0
  @aws-amplify/backend: 1.8.0
  @aws-amplify/backend-auth: 1.4.1
  @aws-amplify/backend-cli: 1.4.2
  @aws-amplify/backend-data: 1.2.1
  @aws-amplify/backend-deployer: 1.1.9
  @aws-amplify/backend-function: 1.8.0
  @aws-amplify/backend-output-schemas: 1.4.0
  @aws-amplify/backend-output-storage: 1.1.3
  @aws-amplify/backend-secret: 1.1.5
  @aws-amplify/backend-storage: 1.2.3
  @aws-amplify/cli-core: 1.2.0
  @aws-amplify/client-config: 1.5.2
  @aws-amplify/deployed-backend-client: 1.4.2
  @aws-amplify/form-generator: 1.0.3
  @aws-amplify/model-generator: 1.0.9
  @aws-amplify/platform-core: 1.2.1
  @aws-amplify/plugin-types: 1.5.0
  @aws-amplify/sandbox: 1.2.6
  @aws-amplify/schema-generator: 1.2.5
  aws-amplify: 6.9.0
  aws-cdk: 2.169.0
  aws-cdk-lib: 2.169.0
  typescript: 5.6.3
No AWS environment variables
No CDK environment variables
Describe the bug
In the schema, I can only define a model like this:
const schema = a.schema({
  chat: a
    .conversation({
      aiModel: a.ai.model("Claude 3.5 Sonnet"),
      systemPrompt: `You are a very helpful assistant`,
    })
    .authorization((allow) => allow.owner()),
});
But I get an error in my region because it only allows access to Claude 3.5 via an inference profile. This is the error in the Lambda:
{
  "timestamp": "2024-11-21T22:03:13.695Z",
  "level": "ERROR",
  "requestId": "37377ab9-f496-4cc3-b5a1-70115062ea0f",
  "message": "Failed to handle conversation turn event, currentMessageId=2ef0c357-e7ab-45b6-92be-71b855c65597, conversationId=411d1032-0704-4384-8952-5db49f27e5b1 ValidationException: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.\n at de_ValidationExceptionRes (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1195:21)\n at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1028:19)\n at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:165:18\n at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38\n at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22\n at async BedrockConverseAdapter.askBedrockStreaming (/var/task/index.js:813:29)\n at async ConversationTurnExecutor.execute (/var/task/index.js:1009:32)\n at async Runtime.handleConversationTurnEvent [as handler] (/var/task/index.js:1043:7) {\n '$fault': 'client',\n '$metadata': {\n httpStatusCode: 400,\n requestId: 'dbb16273-798b-4bad-946b-ac30835b2c0f',\n extendedRequestId: undefined,\n cfId: undefined,\n attempts: 1,\n totalRetryDelay: 0\n }\n}",
  "errorType": "ValidationException",
  "errorMessage": "Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.",
  "stackTrace": [
    "ValidationException: Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with on-demand throughput isn’t supported. Retry your request with the ID or ARN of an inference profile that contains this model.",
    "    at de_ValidationExceptionRes (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1195:21)",
    "    at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-bedrock-runtime/dist-cjs/index.js:1028:19)",
    "    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)",
    "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20",
    "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:165:18",
    "    at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38",
    "    at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22",
    "    at async BedrockConverseAdapter.askBedrockStreaming (/var/task/index.js:813:29)",
    "    at async ConversationTurnExecutor.execute (/var/task/index.js:1009:32)",
    "    at async Runtime.handleConversationTurnEvent [as handler] (/var/task/index.js:1043:7)"
  ]
}
Reproduction steps
Watched the CloudWatch logs for errors, because I didn't get a response in the front end.
It's unclear in the docs, but you can solve it by putting the inference profile ID in the resourcePath:
const schema = a.schema({
  chat: a
    .conversation({
      aiModel: {
        resourcePath: "eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
      },
      systemPrompt: `You are a very helpful assistant`,
    })
    .authorization((allow) => allow.owner()),
});
But you also need to update the IAM policy of the Lambda that invokes Bedrock:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0",
        "arn:aws:bedrock:*:114244416074:inference-profile/anthropic.claude-3-5-sonnet-20240620-v1:0",
        "arn:aws:bedrock:*:114244416074:inference-profile/eu.anthropic.claude-3-5-sonnet-20240620-v1:0"
      ],
      "Effect": "Allow"
    }
  ]
}
It would be great if this could be adjusted.
Hey, thanks for the feedback and information on how this can be solved. Transferring the issue to the documentation repository for updates.
Thanks for opening this @rpostulart. We'll add an example to the docs. In the meantime, here's an example of how you can do this directly within your Amplify backend.
Add a custom conversation handler
Add the @aws-amplify/backend-ai package
npm install @aws-amplify/backend-ai
In amplify/data/resource.ts
import { defineConversationHandlerFunction } from "@aws-amplify/backend-ai/conversation";

// Declare the base model ID first, then derive the cross-region profile ID from it.
export const model = "anthropic.claude-3-5-sonnet-20240620-v1:0";
export const crossRegionModel = `eu.${model}`;

export const conversationHandler = defineConversationHandlerFunction({
  entry: "./conversationHandler.ts",
  name: "conversationHandler",
  models: [{ modelId: crossRegionModel }],
});
const schema = a.schema({
  chat: a
    .conversation({
      aiModel: {
        resourcePath: crossRegionModel,
      },
      systemPrompt: 'You are a helpful assistant.',
      handler: conversationHandler,
    })
    .authorization((allow) => allow.owner()),
});
Create a new file amplify/data/conversationHandler.ts
import { handleConversationTurnEvent } from '@aws-amplify/backend-ai/conversation/runtime';
export const handler = handleConversationTurnEvent;
In amplify/backend.ts
import { defineBackend } from "@aws-amplify/backend";
import { auth } from "./auth/resource";
import { data, conversationHandler, crossRegionModel, model } from "./data/resource";
import { PolicyStatement } from "aws-cdk-lib/aws-iam";
const backend = defineBackend({
  auth,
  data,
  conversationHandler,
});

// This policy statement assumes that you're deploying in `eu-west-1`.
// If that's not the case, adjust the resources block in the policy statements accordingly.
backend.conversationHandler.resources.lambda.addToRolePolicy(
  new PolicyStatement({
    resources: [
      `arn:aws:bedrock:eu-west-1:[account-number]:inference-profile/${crossRegionModel}`,
      `arn:aws:bedrock:eu-west-1::foundation-model/${model}`,
      `arn:aws:bedrock:eu-west-3::foundation-model/${model}`,
      `arn:aws:bedrock:eu-central-1::foundation-model/${model}`,
    ],
    actions: ['bedrock:InvokeModelWithResponseStream'],
  })
);
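The ARN list above can also be built with a small helper, which makes it easier to keep the profile and foundation-model ARNs in sync across regions. This is just a sketch; `bedrockArns` is a hypothetical helper name, not part of any Amplify API.

```typescript
// Hypothetical helper: builds the Bedrock ARNs used in the policy above.
// The "eu." prefix matches the EU cross-region inference profile convention;
// foundation-model ARNs deliberately have no account ID segment.
function bedrockArns(
  region: string,
  accountId: string,
  model: string,
  fallbackRegions: string[] = []
): string[] {
  const crossRegionModel = `eu.${model}`;
  return [
    `arn:aws:bedrock:${region}:${accountId}:inference-profile/${crossRegionModel}`,
    ...[region, ...fallbackRegions].map(
      (r) => `arn:aws:bedrock:${r}::foundation-model/${model}`
    ),
  ];
}
```

You could then pass the result directly to the `resources` array of the `PolicyStatement`.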
OK, this is great. I will close the issue with your commitment that you will update the docs :)
Is there a similar workaround for generations and summarization, as opposed to conversations? It seems to be working for conversations.
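I haven't verified this for generation routes, but the same `resourcePath` shape from the conversation fix above may apply. A sketch, assuming `a.generation` accepts the same `aiModel` object and that the executing Lambda is granted the inference-profile permissions shown earlier (the route name `summarize` and prompt are illustrative):

```typescript
import { a } from "@aws-amplify/backend";

// Same cross-region inference profile ID as the conversation fix;
// unverified assumption: a.generation accepts a raw resourcePath here.
const crossRegionModel = "eu.anthropic.claude-3-5-sonnet-20240620-v1:0";

const schema = a.schema({
  summarize: a
    .generation({
      aiModel: {
        resourcePath: crossRegionModel,
      },
      systemPrompt: "Summarize the provided text in one paragraph.",
    })
    .arguments({ input: a.string() })
    .returns(a.string())
    .authorization((allow) => allow.authenticated()),
});
```

If `a.generation` rejects the object form, the IAM adjustment alone may still be the blocking piece, since the same Bedrock `ValidationException` applies to any on-demand invocation of this model.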