
Dealing with deployments

Open WhyNotHugo opened this issue 5 years ago • 5 comments

This is honestly a question, and not so much a bug. I appreciate the effort you've put into whitenoise.

Whenever I deploy my app, a new instance of version N starts up. Traffic is then routed to it, and finally, the instance of version N-1 is terminated.

During this extremely small timeframe, it's happened several times that:

  • A request comes in for a page, and is rendered by version N.
  • This page has a static file.
  • The client browser requests the static file. This goes through CloudFront, which then requests the file from my app.
  • The CloudFront request hits my N-1 instance, so it gets a 404.

Not only does the client end up getting a 404 (sometimes for the CSS file), but that 404 also ends up being cached on CloudFront. All subsequent visitors get a completely broken website until I invalidate the cache on CloudFront or deploy a new version.

I can't quite grasp how to work around this, and I'm wondering whether you [the devs] or other users have managed to. I simply cannot think of a simple way to fix it.

WhyNotHugo avatar Nov 09 '20 16:11 WhyNotHugo

There's been some discussion of this problem already in #245 but no really good solutions yet.

I'd say the most important thing is to return no-cache with 404 responses so that the broken responses don't get cached. That still doesn't solve the problem of some users getting a broken experience if their requests land during a deploy though.

If I ever get some free time I'll try to think about alternative approaches :smiley:

evansd avatar Nov 09 '20 17:11 evansd
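
The suggestion above (return no-cache with 404 responses) could be sketched as a small WSGI wrapper. This is only an illustration, not something whitenoise ships: the class name `NoCache404Middleware`, the `demo_app` function and the `call` helper are made up for the example, and whether a given CDN honours `Cache-Control` on error responses depends on how it is configured.

```python
from wsgiref.util import setup_testing_defaults


class NoCache404Middleware:
    """Hypothetical WSGI wrapper that marks 404 responses as uncacheable."""

    def __init__(self, app, cache_control="no-cache"):
        self.app = app
        self.cache_control = cache_control

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            if status.startswith("404"):
                # Drop any Cache-Control the app set, then add our own.
                headers = [(k, v) for k, v in headers if k.lower() != "cache-control"]
                headers.append(("Cache-Control", self.cache_control))
            if exc_info is not None:
                return start_response(status, headers, exc_info)
            return start_response(status, headers)

        return self.app(environ, patched_start_response)


def demo_app(environ, start_response):
    """Stand-in for an app instance that only knows one static file."""
    long_cache = ("Cache-Control", "max-age=3600")
    if environ["PATH_INFO"] == "/static/app.css":
        start_response("200 OK", [("Content-Type", "text/css"), long_cache])
        return [b"body{}"]
    start_response("404 Not Found", [("Content-Type", "text/plain"), long_cache])
    return [b"not found"]


def call(app, path):
    """Issue a fake GET against a WSGI app; return (status, headers dict)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app(environ, start_response))
    return captured["status"], captured["headers"]


wrapped = NoCache404Middleware(demo_app)
hit = call(wrapped, "/static/app.css")
miss = call(wrapped, "/static/missing.css")
```

In a Django project, a wrapper like this would go around the object returned by `get_wsgi_application()` in `wsgi.py`, so that it also sees 404s for static paths. As noted above, this only stops the broken response from being cached; it does not help the request that landed on the wrong instance.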

Thanks. I'll give that a shot, since it should reduce the impact of the issue a lot.

If you have any crazy ideas but no time to implement them, let me know and I might give it a shot.

WhyNotHugo avatar Nov 09 '20 18:11 WhyNotHugo

If you have any crazy ideas

https://github.com/evansd/whitenoise/issues/245#issuecomment-822720671

rafikdraoui avatar Apr 19 '21 19:04 rafikdraoui

@WhyNotHugo Is this really something that should be accounted for by application code/libraries? This seems more of an issue with how and where you're deploying to.

Don't get me wrong: I'm in a similar situation, with deployments taking between 60 and 100 seconds, which has standard health checks failing, and I can't put a health check grace period in place because of a desire for rapid scaling. But I'm in this situation because I've left container orchestration to ECS; I'm not managing when traffic is directed to a new deployment on my own.

socketbox avatar Nov 02 '22 19:11 socketbox

There is no obvious way for an application or library to deal with this. During deployments, a client might load a page which is served by the old version of the app. The page points to an image which is part of staticfiles. The image is requested, but the request may land on the newer instance of the application.

I've worked around this by changing how I handle staticfiles entirely; I push the static files into an S3 bucket before deployments. Because filenames are hashed, files from different app versions can co-exist. So no matter which version of the application renders the page, the static file will always be served properly.

I don't think whitenoise is suitable for websites that want to deploy during periods of client traffic. Mind you, I'm not saying it's bad; it's perfect if you have only internal users who don't mind refreshing the page when you're deploying, if you deploy at off-peak times, or in quite a few other scenarios. But if you need to push updates continuously, clients will hit this issue from time to time.

WhyNotHugo avatar Nov 03 '22 09:11 WhyNotHugo
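
A rough sketch of why the hashed-filename workaround above holds up during a deploy. Everything here is an assumption made for illustration: the bucket is a plain `dict` standing in for S3, `hashed_name` and `upload` are hypothetical helpers, and the SHA-256 prefix is not necessarily the hashing scheme any particular staticfiles storage uses.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(name: str, content: bytes) -> str:
    """Turn 'css/app.css' into 'css/app.<hash>.css' based on the file's bytes."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    path = PurePosixPath(name)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def upload(bucket: dict, files: dict) -> dict:
    """Copy files into the bucket under hashed keys.

    Returns a manifest mapping each logical name to the key it was stored
    under. Existing keys are never overwritten or deleted, so files from
    older releases stay available.
    """
    manifest = {}
    for name, content in files.items():
        key = hashed_name(name, content)
        bucket.setdefault(key, content)
        manifest[name] = key
    return manifest


bucket = {}  # stand-in for the S3 bucket

# Release N-1 uploads its files, then release N uploads before it starts.
manifest_old = upload(bucket, {"css/app.css": b"body { color: black; }"})
manifest_new = upload(bucket, {"css/app.css": b"body { color: navy; }"})

# Pages rendered by either version reference a key that exists in the bucket.
old_key = manifest_old["css/app.css"]
new_key = manifest_new["css/app.css"]
```

Because the upload happens before the new instances take traffic and old keys are left in place, it no longer matters which instance rendered the page: both `old_key` and `new_key` resolve.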