generative-ai-cdk-constructs
(bedrock): add inference profiles / cross-region inference
Describe the feature
An inference profile in the context of Amazon Bedrock is a configuration that allows you to route model inference traffic to multiple AWS Regions for increased availability and throughput.
https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html
Use Case
Add inference profiles to improve availability and to reach models that can only be invoked in certain Regions through an inference profile.
Proposed Solution
Define BedrockInferenceProfile as follows:
import { Arn, ArnFormat, Resource, Stack } from 'aws-cdk-lib';
import { IConstruct } from 'constructs';
// BedrockFoundationModel is the existing class from this library.

export enum InferenceProfileRegion {
  /**
   * EU: Frankfurt (eu-central-1), Ireland (eu-west-1), Paris (eu-west-3)
   */
  EU = 'eu',
  /**
   * US: N. Virginia (us-east-1), Oregon (us-west-2)
   */
  US = 'us',
}

export interface InferenceProfileProps {
  /**
   * The geographic area across which the traffic is distributed.
   */
  readonly region: InferenceProfileRegion;
  /**
   * A model supporting cross-region inference.
   * @see https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html
   */
  readonly model: BedrockFoundationModel;
}

export class BedrockInferenceProfile extends Resource {
  /**
   * @example 'us.anthropic.claude-3-5-sonnet-20240620-v1:0'
   */
  public readonly profileId: string;
  /**
   * @example 'arn:aws:bedrock:us-east-1:123456789012:inference-profile/us.anthropic.claude-3-5-sonnet-20240620-v1:0'
   */
  public readonly profileArn: string;

  constructor(scope: IConstruct, id: string, props: InferenceProfileProps) {
    super(scope, id);
    // Profile ID is the geo prefix joined to the model ID, e.g. 'eu.<modelId>'.
    this.profileId = `${props.region}.${props.model.modelId}`;
    // Partition, region and account are taken from the enclosing stack.
    this.profileArn = Arn.format({
      service: 'bedrock',
      resource: 'inference-profile',
      resourceName: this.profileId,
      arnFormat: ArnFormat.SLASH_RESOURCE_NAME,
    }, Stack.of(scope));
  }
}
With this in place, an inference profile would be defined as follows:
new BedrockInferenceProfile(this, 'InferenceProfile', {
  region: InferenceProfileRegion.EU,
  model: BedrockFoundationModel.ANTHROPIC_CLAUDE_SONNET_V1_0,
});
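To make the resulting identifiers easier to check, here is a minimal sketch of the same string composition without any CDK dependency. The helper names `inferenceProfileId` and `inferenceProfileArn` are hypothetical and not part of the library, and the `aws` partition default is an assumption; in the proposed construct these values would come from the stack through `Arn.format`.

```typescript
// Hypothetical helpers that mirror the string composition in the proposal.
// They do not call AWS or the CDK; they only build the expected strings.
type GeoRegion = 'eu' | 'us';

interface ProfileArnOptions {
  partition?: string; // assumed to default to 'aws'
  region: string;     // Region of the stack, e.g. 'us-east-1'
  account: string;    // 12-digit account ID of the stack
  profileId: string;
}

// Profile ID: geo prefix, a dot, then the foundation model ID.
function inferenceProfileId(geo: GeoRegion, modelId: string): string {
  return `${geo}.${modelId}`;
}

// ARN in slash-resource-name form: arn:partition:service:region:account:resource/name
function inferenceProfileArn(opts: ProfileArnOptions): string {
  const partition = opts.partition ?? 'aws';
  return `arn:${partition}:bedrock:${opts.region}:${opts.account}:inference-profile/${opts.profileId}`;
}

const profileId = inferenceProfileId('us', 'anthropic.claude-3-5-sonnet-20240620-v1:0');
const profileArn = inferenceProfileArn({
  region: 'us-east-1',
  account: '123456789012',
  profileId,
});

console.log(profileId);
console.log(profileArn);
```

With the inputs above, the two values match the `@example` strings in the doc comments of the proposed class.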
Other Information
No response
Acknowledgements
- [X] I may be able to implement this feature request
- [ ] This feature might incur a breaking change