sample-python-helper-aws-appconfig

Getting timeouts when running on AppRunner

Open TreyWW opened this issue 1 year ago • 7 comments

I'm using the same config locally and on AppRunner, with the same variables, and yet Gunicorn times out as soon as AppConfig gets called.

E.g. I have this code:

print(f"""Using these variables:
    APPLICATION: {self.APPLICATION}
    ENVIRONMENT: {self.ENVIRONMENT}
    PROFILE: {self.PROFILE}
    UPDATE_CHECK_INTERVAL: {self.UPDATE_CHECK_INTERVAL}
""", flush=True)
self.appconfig: AppConfigHelper = AppConfigHelper(
    self.APPLICATION,
    self.ENVIRONMENT,
    self.PROFILE,
    self.UPDATE_CHECK_INTERVAL,  # minimum interval between update checks (SECONDS)
    fetch_on_init=True,
    fetch_on_read=True
)
print("[BACKEND] Got past app config init... Attempting test", flush=True)

Then I get this in my logs:

02-17-2024 11:34:44 PM Using these variables:
02-17-2024 11:34:44 PM                 APPLICATION: our-app
02-17-2024 11:34:44 PM                 ENVIRONMENT: Staging
02-17-2024 11:34:44 PM                 PROFILE: our-app
02-17-2024 11:34:44 PM                 UPDATE_CHECK_INTERVAL: 45
02-17-2024 11:34:44 PM             
02-17-2024 11:35:15 PM [2024-02-17 23:35:15 +0000] [9] [CRITICAL] WORKER TIMEOUT (pid:11)
02-17-2024 11:35:15 PM [2024-02-17 23:35:15 +0000] [11] [INFO] Worker exiting (pid: 11)
02-17-2024 11:35:16 PM [2024-02-17 23:35:16 +0000] [9] [ERROR] Worker (pid:11) exited with code 1
02-17-2024 11:35:16 PM [2024-02-17 23:35:16 +0000] [9] [ERROR] Worker (pid:11) exited with code 1.
02-17-2024 11:35:16 PM [2024-02-17 23:35:16 +0000] [12] [INFO] Booting worker with pid: 12

And this runs a few times. (This also gets accepted into AppRunner bypassing the "Health Check" which is quite concerning for production, but there you go)

Locally it finishes within a second, so I'm not sure why it hangs on AppRunner.

Any help would be appreciated :)

TreyWW avatar Feb 17 '24 23:02 TreyWW

I haven't tried it in AppRunner myself, but can you confirm if your environment has outbound Internet access? It'll need that in order to reach the AppConfig API endpoint. There's some documentation here around it: https://docs.aws.amazon.com/apprunner/latest/dg/network-vpc.html (I see there's a note about one-time latency for a custom VPC connector for outbound traffic too.)

(Edit: If your VPC doesn't have internet access and you don't want it to, you could look at VPC Endpoints for AppConfig instead - on the same page as linked above.)
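One quick way to confirm outbound access from inside the container is a plain TCP check against the AppConfig data endpoint, run before AppConfigHelper is constructed. This is only a minimal sketch: the `can_reach` and `appconfig_data_endpoint` helpers are hypothetical names, and the hostname pattern is an assumption based on the usual regional endpoint format.

```python
import socket


def appconfig_data_endpoint(region: str) -> str:
    """Hostname of the AppConfig data plane (assumed standard regional pattern)."""
    return f"appconfigdata.{region}.amazonaws.com"


def can_reach(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The timeout applies to the connection attempt, not to the DNS lookup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_reach(appconfig_data_endpoint("eu-west-1"))` at startup (the region here is just a placeholder) and printing the result with `flush=True` should show in the logs whether the request can leave the VPC at all, well before the Gunicorn worker timeout kicks in.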

jamesoff avatar Feb 19 '24 14:02 jamesoff

Hi @jamesoff, yeah, it has all traffic outbound and the VPC does have public access.

For now I'm moving to Redis and RDS and I'm just caching some custom flags. Kind of a pain.

TreyWW avatar Feb 19 '24 15:02 TreyWW

Ok, thanks for the confirmation. I'll give it a go myself to see if I can reproduce it. If you just need some small values or flags, Systems Manager Parameter Store may be a suitable alternative in the meanwhile?

jamesoff avatar Feb 19 '24 17:02 jamesoff

Systems Manager Parameter Store may be a suitable alternative in the meanwhile?

Yeah in a sense, we use SSM for our environment variables. It's just the integration for real time feature flags doesn't seem as good. I just wanted that "live" feel. Running locally was really cool, instant responses, didn't cause any delay to our app, and just worked. But yeah if we can't sort this out I'm fine to kind of keep a DB/Cache alternative. But would be nice to get this working, thanks for the help!

TreyWW avatar Feb 19 '24 17:02 TreyWW

I had a go at reproducing this and it worked for me. I created an app using the sample code from the README of the repo, and created a container with it which I set up with AppRunner using all the defaults (I did add a region env variable; not sure if that's required or not, this is my first hands-on with AppRunner). The service launches fine and returns the configuration data as the response.

I've put the files I used for testing, along with the output of AppRunner's DescribeService call, in a gist in case you want to compare: https://gist.github.com/jamesoff/c6024b79d1cef88769020bd4cfcefc82

jamesoff avatar Feb 20 '24 07:02 jamesoff

Apologies for the late response, I never got the notification.

Ah yeah, you were initially correct. The timeout is due to the HTTP requests not making it out, as the outgoing network traffic is set to custom VPC. It needs to be in the VPC because of my RDS DB, cache, etc., but it also needs public access to reach the AWS API to manage services. The VPC has outgoing * and inbound *, but obviously that doesn't do anything because the AWS REST API is reached through *.amazonaws.com. Any ideas on the cheapest way to do this? I've seen people suggest NAT gateways or PrivateLink, but both of them are really expensive; I want a free solution to connect to the AWS API.

TreyWW avatar Mar 10 '24 16:03 TreyWW

No problem on the delay :)

I'm afraid NAT Gateways or a PrivateLink endpoint for AppConfig would be the only ways to make the API reachable from your environment.
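For the PrivateLink route, a rough sketch of creating the interface endpoint with boto3 might look like the following. The helper names are hypothetical, the IDs you pass in are your own, and the service name is an assumption based on the usual `com.amazonaws.<region>.<service>` pattern, so check it against what the VPC console lists for your region.

```python
from typing import Any, Dict, List


def appconfig_endpoint_params(
    region: str,
    vpc_id: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for EC2 create_vpc_endpoint (interface type)."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Assumed service name for the AppConfig data plane in this region.
        "ServiceName": f"com.amazonaws.{region}.appconfigdata",
        "SubnetIds": subnet_ids,
        "SecurityGroupIds": security_group_ids,
        # Private DNS lets the default endpoint hostname resolve inside the VPC.
        "PrivateDnsEnabled": True,
    }


def create_appconfig_endpoint(params: Dict[str, Any]) -> str:
    """Create the endpoint and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the parameters needs no AWS SDK

    ec2 = boto3.client("ec2")
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]
```

The security groups attached to the endpoint need to allow HTTPS (port 443) from the subnets the AppRunner VPC connector uses.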

jamesoff avatar Mar 11 '24 16:03 jamesoff