Rate limit for ListInferenceProfiles API
Describe the bug
When using a function URL (streamResponse) with a Lambda deployment, I encounter this error:

```
Unable to list models: An error occurred (ThrottlingException) when calling the ListInferenceProfiles operation (reached max retries: 1): Too many requests, please wait before trying again. You have sent too many requests. Wait before trying again.
```
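The throttled call can be exercised directly with boto3. Below is a minimal sketch of one mitigation, raising the retry budget and enabling adaptive client-side rate limiting; the client construction, settings, and caching helper here are my assumptions for illustration, not the gateway's actual code:

```python
import boto3
from botocore.config import Config

# Assumption: the gateway builds its Bedrock control-plane client roughly
# like this. "reached max retries: 1" in the error suggests a very small
# retry budget; adaptive mode adds client-side backoff on ThrottlingException.
retry_config = Config(retries={"max_attempts": 8, "mode": "adaptive"})
bedrock = boto3.client("bedrock", config=retry_config)

# The throttled operation from the report. Caching the result at module
# scope means warm Lambda invocations never repeat the call.
_profiles_cache = None

def list_profiles():
    global _profiles_cache
    if _profiles_cache is None:
        resp = bedrock.list_inference_profiles()
        _profiles_cache = resp["inferenceProfileSummaries"]
    return _profiles_cache
```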
Please complete the following information:
- [x] Which API you used: /chat/completions
- [x] Which model you used: any
To Reproduce
Build with aws-lambda-adapter:
```dockerfile
FROM public.ecr.aws/docker/library/python:3.12.0-slim
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.9.0 /lambda-adapter /opt/extensions/lambda-adapter
WORKDIR /app
COPY ./requirements.txt /app/requirements.txt
RUN pip install -i https://mirrors.aliyun.com/pypi/simple/ --no-cache-dir --upgrade -r /app/requirements.txt
COPY ./api /app/api
CMD ["uvicorn", "api.app:app", "--port", "8080", "--reload"]
```
Screenshots
Lambda Web Adapter will repeatedly send HTTP GET requests to your web app during cold start to check whether the app is ready. By default, the GET request is sent to the '/' path. You can change it to Bedrock Access Gateway's health check path '/health' by adding an environment variable to your function:

AWS_LWA_READINESS_CHECK_PATH: /health
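For the container-image deployment in the repro above, one way to set it is directly in the Dockerfile (it can equally be set in the Lambda function's environment configuration):

```dockerfile
# Point the adapter's cold-start readiness probe at the gateway's
# health check path instead of the default '/'.
ENV AWS_LWA_READINESS_CHECK_PATH=/health
```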
Before adopting aws-lambda-adapter, I had already implemented the health check the adapter requires, so that shouldn't be the cause of this issue.
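For reference, the kind of endpoint that satisfies the readiness probe without touching Bedrock looks like the sketch below, assuming a FastAPI app like the one served by the repro's uvicorn command; the reporter's actual handler is not shown:

```python
from fastapi import FastAPI

app = FastAPI()

# Returns immediately and makes no AWS calls, so repeated readiness
# probes from the Lambda Web Adapter cannot trigger Bedrock throttling.
@app.get("/health")
async def health():
    return {"status": "OK"}
```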