wellcomecollection.org
Fix catalogue-api request errors during deployment
Whenever the catalogue-api is deployed, we consistently see request errors that surface as alerts in wc-platform-alerts. These appear to be real errors that are exposed to users, albeit for a very short period during deployment.
The underlying problem appears to be a failure to properly check if the catalogue-api service is running before allowing it to serve requests to users.
See this thread for related discussion: https://wellcome.slack.com/archives/CQ720BG02/p1702567453749849
A potential fix is to move from TCP health checks to HTTP health checks that hit a path served by the Scala app, so that the load balancer can better tell whether the API service is healthy and ready to serve requests.
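To illustrate why an HTTP check helps: a TCP check passes as soon as the port accepts connections, whereas an HTTP check can return a failure status until the app itself reports that it is ready. The following is a minimal, language-agnostic sketch in Python, not the catalogue-api's actual code (which is a Scala app). The path `/management/healthcheck`, the `ready` flag, and the helper names are all assumptions made for illustration.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical path; the real service may expose a different one.
HEALTHCHECK_PATH = "/management/healthcheck"

# Set once the app has finished starting up (assumed readiness signal).
ready = threading.Event()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == HEALTHCHECK_PATH:
            # 200 only when ready; 503 tells the load balancer to hold traffic.
            self._respond(200 if ready.is_set() else 503)
        else:
            self._respond(404)

    def _respond(self, status):
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet


def start_server():
    # Port 0 asks the OS for any free port; the socket is open immediately,
    # so a TCP-only check would already pass at this point.
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(port):
    """Return the HTTP status an HTTP health check would see."""
    url = f"http://127.0.0.1:{port}{HEALTHCHECK_PATH}"
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

In this sketch the port is open from the moment `start_server()` returns, but `probe` only sees a 200 after `ready.set()` is called. That gap is the window in which a TCP check could route user requests to a service that cannot yet answer them.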
[!NOTE] This may be a more general problem with our API-based services and may require modifying some of the underlying infrastructure templates. It's worth investigating whether this happens elsewhere in our estate.
This isn't really in progress, but it needs to be triaged in the next sprint. I'm not sure how we handle that, so I've marked it as in progress to make sure we discuss it and deal with it properly.
@kenoir I've added it to our planning doc. Just let me know if there's anything to talk about and I'll add it so we don't forget.
Closing as done from: https://github.com/wellcomecollection/wellcomecollection.org/issues/10545