Add background job to clear unreferenced state groups

Open erikjohnston opened this issue 10 months ago • 13 comments

After fixing https://github.com/element-hq/synapse/issues/9406 and https://github.com/element-hq/synapse/issues/17937, we still have a bunch of unreferenced state groups in the DB which point to the full state, causing lots of unnecessary DB usage. We should add a background job to go and delete unreferenced state groups.

Note that new unreferenced state groups can still appear; the difference is that when we purge history we now delete the state group entries instead of de-deltaing them.

A one-off background job to delete unreferenced state groups would:

  1. Record the current max state group
  2. On each run, look at the next N state groups (in ascending order) via the state_groups table and check whether any are unreferenced. Note that we need to check both the event_to_state_groups and state_group_edges tables.
  3. Call _find_unreferenced_groups on any unreferenced state groups to get others that can be deleted if the given state group is deleted.
  4. Call mark_state_groups_as_pending_deletion to schedule them for deletion
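
For illustration only, steps 1 and 2 could be sketched roughly as below. This is not Synapse's implementation: it runs against an in-memory SQLite database with a heavily simplified schema (the three table names come from the steps above, the column layout is an assumption), and the helpers `build_demo_db`, `find_unreferenced_batch` and `scan_all` are hypothetical names. Steps 3 and 4 (`_find_unreferenced_groups` and `mark_state_groups_as_pending_deletion`) are deliberately left out.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE state_groups (id INTEGER PRIMARY KEY);
        CREATE TABLE event_to_state_groups (event_id TEXT, state_group INTEGER);
        CREATE TABLE state_group_edges (state_group INTEGER, prev_state_group INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO state_groups VALUES (?)", [(i,) for i in range(1, 6)]
    )
    # Events reference state groups 1 and 4.
    conn.executemany(
        "INSERT INTO event_to_state_groups VALUES (?, ?)", [("$a", 1), ("$b", 4)]
    )
    # Each edge says: state_group is stored as a delta on top of prev_state_group.
    conn.executemany(
        "INSERT INTO state_group_edges VALUES (?, ?)", [(2, 1), (3, 2), (5, 4)]
    )
    return conn


def find_unreferenced_batch(conn, last_id, max_id, batch_size):
    """Step 2: inspect the next batch of state groups in ascending order.

    Returns (unreferenced_ids, new_last_id).
    """
    rows = conn.execute(
        "SELECT id FROM state_groups WHERE id > ? AND id <= ? "
        "ORDER BY id ASC LIMIT ?",
        (last_id, max_id, batch_size),
    ).fetchall()
    batch = [row[0] for row in rows]
    if not batch:
        return [], last_id

    placeholders = ",".join("?" * len(batch))
    # A group is referenced if an event points at it...
    referenced = {
        row[0]
        for row in conn.execute(
            "SELECT state_group FROM event_to_state_groups "
            f"WHERE state_group IN ({placeholders})",
            batch,
        )
    }
    # ...or if another state group uses it as its previous group.
    referenced |= {
        row[0]
        for row in conn.execute(
            "SELECT prev_state_group FROM state_group_edges "
            f"WHERE prev_state_group IN ({placeholders})",
            batch,
        )
    }
    return [sg for sg in batch if sg not in referenced], batch[-1]


def scan_all(conn, batch_size):
    """Step 1 plus repeated runs of step 2, up to the recorded max state group."""
    max_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM state_groups"
    ).fetchone()[0]
    last_id, found = 0, []
    while last_id < max_id:
        unreferenced, new_last = find_unreferenced_batch(
            conn, last_id, max_id, batch_size
        )
        if new_last == last_id:
            break  # nothing left to scan
        found.extend(unreferenced)
        last_id = new_last
    return found


if __name__ == "__main__":
    print(scan_all(build_demo_db(), batch_size=2))
```

On the demo data this reports state groups 3 and 5. Group 2 is not reported because group 3 still builds on it; it only becomes deletable once 3 is gone, which is the case step 3 is meant to catch.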

I'm also wondering if instead of having this as a one off job we do this periodically to catch new unreferenced state groups.

erikjohnston, Feb 10 '25 13:02

While implementing this I noticed that when the deletion task runs, it doesn't clean up the state_groups_pending_deletion table. It leaves the entries in there.

We'll need to address that as well, otherwise that table will grow endlessly.

devonh, Feb 12 '25 01:02

Also - should we be cleaning up any state_group_edges that are dangling after the deletion as well?
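
As a rough illustration of what "dangling" could mean here, the sketch below removes state_group_edges rows whose state_group no longer exists in state_groups. It is an assumption-laden example, not the change that was merged: it uses an in-memory SQLite database, a simplified two-column edges table, and a hypothetical helper name `delete_dangling_edges`. It leaves alone edges whose prev_state_group is missing, since removing those would affect a state group that still exists.

```python
import sqlite3


def delete_dangling_edges(conn):
    """Delete edges whose own state group has already been removed.

    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM state_group_edges
        WHERE NOT EXISTS (
            SELECT 1 FROM state_groups
            WHERE state_groups.id = state_group_edges.state_group
        )
        """
    )
    conn.commit()
    return cur.rowcount


def build_edges_demo():
    """Demo data: state group 3 was deleted but its edge row was left behind."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE state_groups (id INTEGER PRIMARY KEY);
        CREATE TABLE state_group_edges (state_group INTEGER, prev_state_group INTEGER);
        INSERT INTO state_groups VALUES (1), (2);
        INSERT INTO state_group_edges VALUES (2, 1), (3, 2);
        """
    )
    return conn


if __name__ == "__main__":
    demo = build_edges_demo()
    print(delete_dangling_edges(demo))
    print(demo.execute("SELECT * FROM state_group_edges").fetchall())
```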

devonh, Feb 12 '25 21:02

@erikjohnston thoughts on cleaning up state_groups_pending_deletion & state_group_edges tables? ^

devonh, Feb 12 '25 23:02

Any idea which Synapse release we would be targeting for this? (re https://github.com/element-hq/backend-internal/issues/75#issuecomment-2653987772)

pmaier1, Feb 13 '25 07:02

> @erikjohnston thoughts on cleaning up state_groups_pending_deletion & state_group_edges tables? ^

Err, yes, they should be cleaned up too.

erikjohnston, Feb 13 '25 12:02

State Group table cleanup PR: https://github.com/element-hq/synapse/pull/18165

devonh, Feb 14 '25 21:02

Question: will this change make the periodic runs of (rust-)synapse-compress-state unnecessary?

schildbach, Mar 07 '25 20:03

Reopened by https://github.com/element-hq/synapse/commit/745cfe0055f67effd0deece074115e607c586495 due to #18217.

reivilibre, Mar 17 '25 13:03

Re-fixed by #18254. We should wait to confirm the fix on https://github.com/element-hq/synapse/issues/18217 before closing this issue.

devonh, Mar 24 '25 14:03

Thank you very much for implementing this 🎉. On our server, the unreferenced state groups accounted for almost 3/4 of the total DB.

@erikjohnston:

> I'm also wondering if instead of having this as a one off job we do this periodically to catch new unreferenced state groups.

It seems this will indeed be needed, as the number of unreferenced state groups is continuing to increase on our server. Running this job regularly, e.g. on a weekly basis, would at least mitigate the problem.

n-peugnet, Apr 20 '25 14:04

  • #18322

3nprob, May 07 '25 00:05

Hi there. On my server, it does not look like any unreferenced state groups have been deleted since this was released.

Recent output of the rust tool:

Total state groups: 130631149
Found 18637899 unreferenced groups

Further queries:

select count(*) from state_groups_pending_deletion;
 count
-------
     0
(1 row)

select count(*) from state_groups_pending_deletion_sequence_number_seq;
 count
-------
     1
(1 row)

I do not have a sharded database config.

I have a background tasks worker:

# relevant worker config
worker_app: synapse.app.generic_worker
worker_name: background_tasks
# synapse config
run_background_tasks_on: background_tasks

verymilan, May 16 '25 17:05

This should be solved by #19181.

Gredin67, Nov 16 '25 10:11