`cache_invalidation_stream_by_instance` grows without bounds and causes slow startup
cf https://github.com/matrix-org/synapse/issues/7968#issuecomment-688307145
it seems like we ought to be able to clear these out, but I don't know how we can tell?
Yup, we can clean things out. It only ever gets read sequentially by the workers, so once a row has been read by all running processes then it can be deleted. Also note that when starting up a worker will read from the most recent row, so it is also safe to delete all the rows if all Synapse processes have stopped.
The hard part is to figure out which rows all processes have seen. As a bit of a bodge I suggest we just delete all rows with an invalidation_ts over a month old, which should definitely be safe. We might also want to add some logic to the read part to ensure that it exits if the rows of the cache its reading is more than a week old, or something.
The cache_invalidation_stream_by_instance doesn't have an index on invalidation_ts, so I think the best way of deleting the rows is pulling out the 100th row (ordered by stream_id), check if its old enough, and then delete all rows before it if so.
this is a subset of https://github.com/matrix-org/synapse/issues/5888.
see also https://github.com/matrix-org/synapse/issues/13456 which discusses ways of getting rid of this table altogether.
Dupe of #3665 which will be closed by #15868.