Evaluate updating homepage repositories outside of the view
Currently, opening the homepage view sometimes fetches new repository metadata from GitHub and writes it to the database. Requests that trigger a fetch are pretty slow. The question is: do we want to run the refresh separately from the view?
Benefits of running it separately:
- the view will always run quickly (as long as the cronjob to refresh data runs)
Drawbacks of running it separately:
- more components that can break, more things to monitor
- might complicate dev setup if we don't do it properly. The endpoint should still be able to refresh on its own when no data is available, so contributors can open the homepage as they are used to, without any extra setup
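A minimal sketch of that fallback, assuming the stored data can be modelled as a plain dict. The names `get_repositories`, `store`, and `fetch` are hypothetical stand-ins for the real view code, database, and GitHub call:

```python
def get_repositories(store, fetch):
    """Return repository data, fetching inline only when none is stored.

    `store` stands in for the database and `fetch` for the GitHub call;
    both are assumptions for illustration.
    """
    if not store.get("repos"):
        # Empty database (e.g. a fresh dev setup): refresh synchronously
        # so the homepage still works without the external job.
        store["repos"] = fetch()
    return store["repos"]
```

With this shape, the scheduled refresh keeps production fast, while a contributor's first request pays the fetch cost once.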
One alternative that might be a bit hacky but would likely be the easiest to implement: start a background thread with the threading module that does the updating after we've returned the response.
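A rough sketch of the thread idea, using only the standard library. Everything here is an assumption for illustration: `store` stands in for the database, `refresh_repositories` for the GitHub fetch, and `homepage_view` for the real view. Note that the thread is started just before the view returns, so the refresh runs concurrently with sending the response rather than strictly after it.

```python
import threading

# Ensures at most one refresh runs at a time within this process.
_refresh_lock = threading.Lock()


def refresh_repositories(store):
    # Hypothetical stand-in for "fetch from GitHub and update the database".
    store["repos"] = ["example/repo"]


def refresh_in_background(store):
    """Start a refresh thread unless one is already running.

    Returns the started thread, or None if a refresh was in progress.
    """
    if not _refresh_lock.acquire(blocking=False):
        return None

    def worker():
        try:
            refresh_repositories(store)
        finally:
            # A plain Lock may be released from a different thread.
            _refresh_lock.release()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def homepage_view(store):
    # Respond with whatever is stored now; the refresh only benefits
    # later requests.
    response = list(store.get("repos", []))
    refresh_in_background(store)
    return response
```

The lock only guards a single process. With several worker processes, each could still start its own refresh, which is one reason this feels hacky.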
I would strongly favor not using a job queue such as Celery: I've had very bad experiences with it, and for this single use case it would be overkill. It would also require a lot of additional monitoring and deployment work, and introduce many more components that could fail. So I'm really not in favor of it.
Another option, maybe a bit hacky as well, would be to use Django's asynchronous support to create an async task that runs after the request. I think to make that fully work we would have to run under an ASGI server though, and convert all the views to run properly with it.
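To illustrate the general idea only, here is a sketch with plain asyncio rather than Django itself. The names `homepage_view`, `refresh_repositories`, and `store` are hypothetical, and whether a task scheduled this way keeps running after the response depends on the server keeping the event loop alive, which is the ASGI concern mentioned above.

```python
import asyncio

# Keep strong references so pending tasks are not garbage collected.
_background_tasks = set()


async def refresh_repositories(store):
    await asyncio.sleep(0)  # stands in for the network call to GitHub
    store["repos"] = ["example/repo"]


async def homepage_view(store):
    # Build the response from current data, then schedule the refresh
    # on the running event loop without awaiting it.
    response = list(store.get("repos", []))
    task = asyncio.create_task(refresh_repositories(store))
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return response


async def demo():
    store = {}
    first = await homepage_view(store)        # no data stored yet
    await asyncio.gather(*_background_tasks)  # let the refresh finish
    second = await homepage_view(store)       # sees refreshed data
    await asyncio.gather(*_background_tasks)
    return first, second
```

Running `asyncio.run(demo())` shows the first response being built from an empty store and the second from refreshed data.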
Any solution here is a pain and we can probably just accept the annoyance.
If we do revisit this, we should just add an endpoint that a Kubernetes CronJob calls.
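A possible shape for such an endpoint's core logic, sketched without any framework. The shared-token check, the status codes, and the name `handle_refresh_request` are all assumptions for illustration and not something decided in this issue:

```python
import hmac


def handle_refresh_request(method, presented_token, expected_token, refresh):
    """Return an HTTP status code for a refresh request.

    `refresh` is a hypothetical callable that performs the GitHub fetch
    and database update.
    """
    if method != "POST":
        return 405  # the refresh has side effects, so only accept POST
    # Constant-time comparison of the shared secret sent by the job.
    if not hmac.compare_digest(presented_token, expected_token):
        return 403
    refresh()
    return 204  # success, nothing to return
```

The scheduled job would then only need to send one authenticated POST request to this endpoint.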