readthedocs.org
readthedocs.org copied to clipboard
Build: reducing query count, prefetching, or intermediate tables
After a few different attempts, it seems like the query complexity added with prefetching causes queries to take much longer than without prefetching (however query count is much much lower, 30 vs 120 queries).
For more background on the removal, see:
- https://github.com/readthedocs/readthedocs.org/pull/11621
The most expensive queries here are getting the latest build (which I think can be external) and getting the latest, non-external, successful build.
Some options here are:
- Use caching after database query. Saves some lookups but initial is still expensive and savings inconsistent.
- Use materialized views to cache at the database level. Managing these views is harder and outside ORM
- Use an intermediate table for latest/latest successful build on Project. This requires updating the relationship after each build.
Use an intermediate table for latest/latest successful build on Project. This requires updating the relationship after each build.
Do we need an intermediate table for this? Why not just OneToOne fields on the Project object?
Yup, it could be any relation, not necessarily an intermediate table.
I thought a little more about this and I realized that:
- We need to use an API call from the builder to update the DB object
- We can use
on_successandon_failureCelery handlers to updatelastest_successful_buildandlatest_buildsfields respectively - It's not clear if we need these latest builds to be linked to the
Projector to theVersionobjects 🤔
It's not clear if we need these latest builds to be linked to the Project or to the Version objects
Good point, I think both if we want to speed up both pages. Project and version listing both show the latest build and use has_good_build.