bedrock-access-gateway icon indicating copy to clipboard operation
bedrock-access-gateway copied to clipboard

Issue with concurrent requests on AWS Fargate

Open eliran89c opened this issue 1 year ago • 8 comments

Describe the Bug I am encountering an issue where concurrent requests are being processed sequentially rather than simultaneously when deployed on AWS Fargate. I suspect the problem is that boto3 runs synchronously, and its calls are blocking.

API Details

  • API Used: /chat/completions
  • Model Used: all of them

To Reproduce Steps to reproduce the behavior:

  1. Deploy the service on AWS Fargate following the standard setup procedures.
  2. Send multiple concurrent requests (e.g., 10 concurrent requests) to the API.
  3. Observe that the requests are processed sequentially instead of concurrently.

Expected Behavior I expected that when sending multiple concurrent requests to the API, all requests would be handled simultaneously or at least as many as the server can handle

eliran89c avatar Jun 20 '24 18:06 eliran89c

Concurrency and asynchronous call is natively supportted by FastAPI, I did a quick test with 2 concurrency requests (with long response) and I can see both are streaming in parallel, I didn't test via code though.

You can probably try below:

  1. Try fewer requests (like 2 requests) first and see if the issue still exists.
  2. Try to test in local (The code can run locally)
  3. Try to increase the capacity of Fargate (By default, it has only 1 core, I would expect it may not support larger concurrent requests) and retest

daixba avatar Jun 21 '24 02:06 daixba

Hi @daixba, I forgot to mention that I'm not streaming the response With streaming, it works better, but it is still not perfect (I monitor the health-check endpoint, and it times out from time to time)

But without streaming, the API is waiting for each request to finish before being able to handle other requests

Concurrency and asynchronous call is natively supported by FastAPI

I agree; This is why I think the problem with boto3

eliran89c avatar Jun 23 '24 09:06 eliran89c

@daixba when I run boto3 with asyncio it's working as expected https://github.com/aws-samples/bedrock-access-gateway/pull/23

eliran89c avatar Jun 23 '24 11:06 eliran89c

所以这个能解决吗,我的大并发请求一遇到非流式就没办法

QingyeSC avatar Sep 13 '24 08:09 QingyeSC

This is not a problem with Fargate's capacity, it's due to the fact that we're using block code in the loop.

Let me explain this: in fastapi, an async handler runs in a loop, a sync handler is wrapped to a thread and then runs in a loop

Therefore, when there is blocking code in an async handler, it will block the whole server.

Usually, we all understand the following code:

import time
import asyncio
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    await asyncio.sleep(1000) # Won't block
    time.sleep(1000) # Will block
    return {"message": "Hello World"}

Yet

import time
import asyncio
from fastapi import FastAPI

app = FastAPI()


@app.get("/") 
def root():
    time.sleep(1000) # Won't block!
    return {"message": "Hello World"}

Wh1isper avatar Jan 09 '25 09:01 Wh1isper

@QingyeSC If you're in a hurry, you can build the image yourself from https://github.com/aws-samples/bedrock-access-gateway/pull/23. If you want selfhost, I forked my version https://github.com/Wh1isper/bedway

Wh1isper avatar Jan 09 '25 09:01 Wh1isper

Also ran into this issue, fixed it by subclassing the BedrockModel class as AsyncBedrockModel in a separate module and adding aioboto3 support to keep the syntax similar and not touch the main code to allow pulling from upstream easier when needed.

Hopefully pull #23 gets approved though 👍

m-navarro93 avatar Mar 13 '25 01:03 m-navarro93

Sorry, it's been a long time to address this issue.

Now the performance is improved based on my test. Now this project make async call to converse api. We don't need to use aioboto3 here. Check the 0ead770069a47a3342e68096a29e815f08567687 for more details

Simply redeploy or update the container image to have a try!

Please let me know if any feedbacks.

daixba avatar Mar 13 '25 10:03 daixba