Add background job to clear unreferenced state groups
After fixing https://github.com/element-hq/synapse/issues/9406 and https://github.com/element-hq/synapse/issues/17937, we still have a bunch of unreferenced state groups in the DB which point to the full state, causing lots of unnecessary DB usage. We should add a background job to go and delete unreferenced state groups.
Note that we can still have new unreferenced state groups, just that when we purge history we won't de-delta the state group entries and instead delete them.
A one-off background job to delete unreferenced state groups would:
- Record the current max state group
- Each run would look at the next N state groups (in ascending order) via the state_groups table, and check if there are any unreferenced ones. Note that we need to check both the event_to_state_groups and state_group_edges tables.
- Call _find_unreferenced_groups on any unreferenced state groups to get others that can be deleted if the given state group is deleted.
- Call mark_state_groups_as_pending_deletion to schedule them for deletion
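The scanning part of the steps above could be sketched roughly as follows. This is only an illustration against an in-memory SQLite database with a heavily simplified schema, not Synapse's actual code: the `scan_batch` helper, the batch size and the sample data are all made up for the example, and the calls to _find_unreferenced_groups and mark_state_groups_as_pending_deletion are left out.

```python
import sqlite3


def scan_batch(conn, last_checked, max_state_group, batch_size):
    """Check the next `batch_size` state groups above `last_checked`.

    Hypothetical helper. Returns (unreferenced_ids, last_id_seen);
    last_id_seen is None once there is nothing left to scan.
    """
    rows = conn.execute(
        """
        SELECT sg.id,
               NOT EXISTS (SELECT 1 FROM event_to_state_groups AS e
                           WHERE e.state_group = sg.id)
               AND NOT EXISTS (SELECT 1 FROM state_group_edges AS ed
                               WHERE ed.prev_state_group = sg.id)
        FROM state_groups AS sg
        WHERE sg.id > ? AND sg.id <= ?
        ORDER BY sg.id ASC
        LIMIT ?
        """,
        (last_checked, max_state_group, batch_size),
    ).fetchall()
    if not rows:
        return [], None
    unreferenced = [sg_id for sg_id, is_unreferenced in rows if is_unreferenced]
    return unreferenced, rows[-1][0]


# Simplified stand-ins for the real tables (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE state_groups (id INTEGER PRIMARY KEY);
    CREATE TABLE event_to_state_groups (event_id TEXT, state_group INTEGER);
    CREATE TABLE state_group_edges (state_group INTEGER, prev_state_group INTEGER);
    """
)
conn.executemany("INSERT INTO state_groups VALUES (?)", [(i,) for i in range(1, 7)])
conn.executemany(
    "INSERT INTO event_to_state_groups VALUES (?, ?)", [("$a", 1), ("$b", 3)]
)
# (state_group, prev_state_group): 2 is a delta on top of 1, 5 on top of 4.
conn.executemany("INSERT INTO state_group_edges VALUES (?, ?)", [(2, 1), (5, 4)])

# Step 1: record the current max state group so the one-off job terminates.
max_state_group = conn.execute("SELECT MAX(id) FROM state_groups").fetchone()[0]
# A state group created after the job started is out of scope for this run.
conn.execute("INSERT INTO state_groups VALUES (7)")

# Step 2: walk the table in ascending batches until the recorded max is reached.
found, last_checked = [], 0
while True:
    batch, last_seen = scan_batch(conn, last_checked, max_state_group, batch_size=4)
    if last_seen is None:
        break
    found.extend(batch)
    last_checked = last_seen

print(found)  # [2, 5, 6]
```

In this toy data, state group 4 is only kept alive by the edge from 5, so it is presumably the kind of group the _find_unreferenced_groups step would pick up once 5 is scheduled for deletion.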
I'm also wondering if instead of having this as a one off job we do this periodically to catch new unreferenced state groups.
While implementing this I noticed that when the deletion task runs, it doesn't clean up the state_groups_pending_deletion table. It leaves the entries in there.
We'll need to address that as well, otherwise that table will grow endlessly.
Also - should we be cleaning up any state_group_edges that are dangling after the deletion as well?
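To make the cleanup question concrete, here is a rough sketch of a deletion step that also clears the bookkeeping rows. It runs against SQLite with a simplified, assumed schema; `delete_state_groups` is a hypothetical helper and not the function Synapse uses.

```python
import sqlite3


def delete_state_groups(conn, state_groups):
    """Delete the given state groups and every row that refers to them.

    Hypothetical helper: everything happens in one transaction so the
    tables cannot drift apart if the job is interrupted.
    """
    ids = [(sg,) for sg in state_groups]
    with conn:  # commits on success, rolls back on error
        conn.executemany("DELETE FROM state_groups_state WHERE state_group = ?", ids)
        # Edges whose child group is gone would otherwise be left dangling.
        conn.executemany("DELETE FROM state_group_edges WHERE state_group = ?", ids)
        conn.executemany("DELETE FROM state_groups WHERE id = ?", ids)
        # Without this, the pending-deletion table only ever grows.
        conn.executemany(
            "DELETE FROM state_groups_pending_deletion WHERE state_group = ?", ids
        )


# Simplified stand-ins for the real tables (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE state_groups (id INTEGER PRIMARY KEY);
    CREATE TABLE state_groups_state (state_group INTEGER, type TEXT, state_key TEXT);
    CREATE TABLE state_group_edges (state_group INTEGER, prev_state_group INTEGER);
    CREATE TABLE state_groups_pending_deletion (state_group INTEGER, insertion_ts INTEGER);
    INSERT INTO state_groups VALUES (1), (2);
    INSERT INTO state_groups_state VALUES (1, 'm.room.create', ''), (2, 'm.room.name', '');
    INSERT INTO state_group_edges VALUES (2, 1);
    INSERT INTO state_groups_pending_deletion VALUES (2, 0);
    """
)

delete_state_groups(conn, [2])

remaining_groups = [r[0] for r in conn.execute("SELECT id FROM state_groups")]
pending = conn.execute("SELECT COUNT(*) FROM state_groups_pending_deletion").fetchone()[0]
edges = conn.execute("SELECT COUNT(*) FROM state_group_edges").fetchone()[0]
```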
@erikjohnston thoughts on cleaning up state_groups_pending_deletion & state_group_edges tables? ^
Any idea which Synapse release we would be targeting for this? (re https://github.com/element-hq/backend-internal/issues/75#issuecomment-2653987772)
@erikjohnston thoughts on cleaning up state_groups_pending_deletion & state_group_edges tables? ^
Err, yes they should be cleaned up too
State Group table cleanup PR: https://github.com/element-hq/synapse/pull/18165
Question, will this change make the periodic runs of (rust-)synapse-compress-state unnecessary?
Reopened by https://github.com/element-hq/synapse/commit/745cfe0055f67effd0deece074115e607c586495 due to #18217.
Re-fixed by #18254. We should wait to confirm the fix on https://github.com/element-hq/synapse/issues/18217 before closing this issue.
Thank you very much for implementing this 🎉, on our server, the unreferenced state groups were accounting for almost 3/4 of the total DB.
@erikjohnston:
I'm also wondering if instead of having this as a one off job we do this periodically to catch new unreferenced state groups.
It seems that this will indeed be needed, as the number of unreferenced state groups is continuing to increase on our server. Running this job regularly, e.g. on a weekly basis, would at least mitigate the problem.
- #18322
Hi there, on my server it does not look like any unreferenced state groups have been deleted since this was released.
Recent output of the rust tool:
Total state groups: 130631149
Found 18637899 unreferenced groups
Further queries:
select count(*) from state_groups_pending_deletion;
count
-------
0
(1 row)
select count(*) from state_groups_pending_deletion_sequence_number_seq;
count
-------
1
(1 row)
I do not have a sharded database config.
I have a background tasks worker:
# relevant worker config
worker_app: synapse.app.generic_worker
worker_name: background_tasks
# synapse config
run_background_tasks_on: background_tasks
should be solved by #19181