jupyter-ai
[v3.x.x] Better cross-region inference for Amazon Bedrock
Description
Cross-region inference (CRI) allows requests to be automatically routed across a set of regions, which mitigates restrictions imposed by service quotas or peak usage times.
CRI is also required to use some models on Amazon Bedrock, notably Llama 3.2. A previous attempt at implementing Llama 3.2 support in Amazon Bedrock was stalled due to lack of existing support for CRI: #1014
Proposed solution
Jupyter AI needs to provide some user interface for supporting CRI. Tentatively, our proposal is to:
- Implement a new dropdown field feature that allows for one option to be selected out of multiple.
- Use this dropdown field in the Amazon Bedrock provider to allow users to specify a region area. Region areas include: us, us-gov, eu, apac.
- Ideally, this field should only appear on models that support CRI.
- List of supported regions & models for inference profiles: https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html
- Prepend the region area to the model ID to produce an inference profile ID in the format <region-area>.<model-id>. When passed to Bedrock APIs, this ID enables CRI and allows Llama 3.2 models to be used on Amazon Bedrock.
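
As a rough sketch of the prepending step, the helper below builds an inference profile ID from a region area and a model ID. The function name `build_inference_profile_id` and the `REGION_AREAS` constant are hypothetical and not part of Jupyter AI; the region areas are the ones listed in this proposal, and the Llama 3.2 model ID is used only as an illustrative input.

```python
from typing import Optional

# Region areas listed in this proposal (assumed to be the full set of options
# the dropdown would offer).
REGION_AREAS = ("us", "us-gov", "eu", "apac")


def build_inference_profile_id(model_id: str, region_area: Optional[str] = None) -> str:
    """Return `<region-area>.<model-id>`, or the bare model ID if no area is set.

    Hypothetical helper for illustration; not an existing Jupyter AI API.
    """
    if region_area is None:
        # No region area selected: fall back to the plain model ID (no CRI).
        return model_id
    if region_area not in REGION_AREAS:
        raise ValueError(
            f"Unknown region area {region_area!r}; expected one of {REGION_AREAS}"
        )
    return f"{region_area}.{model_id}"


# Illustrative usage with a Llama 3.2 model ID.
profile_id = build_inference_profile_id("meta.llama3-2-3b-instruct-v1:0", "us")
print(profile_id)
```

Returning the bare model ID when no region area is selected would keep models that do not support CRI working unchanged, which fits the idea of only showing the dropdown on models that support it.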