
Investigate slow performance of event cache

Open bnjbvr opened this issue 4 months ago • 2 comments

  • The event cache store uses a cross-process lock to make sure that there's at most one process writing to its DB.
  • At another layer, the sqlite backend uses a write lock to prevent database busy errors, which has proven very effective.

Unfortunately, it seems that the interaction of the two can dramatically slow down the SDK: taking the cross-process lock reads from and writes to a "leases" database in the DB, and each of those accesses goes through the sqlite write lock.

This is made even worse by the fact that the event cache store is also used for media at the moment; ideally media would be split off, and there's https://github.com/matrix-org/matrix-rust-sdk/issues/5410 for that.

In the meantime, we should also explore ways to reduce contention around the cross-process lock: hold it for as long as possible, maybe for a full sync update, and pass it down to the methods that need it, instead of taking it ad hoc wherever we see fit. We might need to be careful about the dining philosophers problem, though; if our locks are more fine-grained, there are more chances that we hold several of them at the same time.
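To make the "take it once and pass it down" idea concrete, here is a minimal sketch using only the standard library. Everything in it is hypothetical and is not the SDK's API: `StoreLock` stands in for the cross-process lock, `StoreGuard` for the proof that the lock is held, and `handle_sync_update` / `save_events` for the methods that would receive it. The acquisition counter only exists to show that the lock is taken once per sync update rather than once per room.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical stand-in for the cross-process lock. The `Vec<String>`
/// plays the role of the database; `acquisitions` counts lock takes.
struct StoreLock {
    inner: Mutex<Vec<String>>,
    acquisitions: AtomicUsize,
}

/// Hypothetical guard: holding one proves the lock is currently held.
struct StoreGuard<'a> {
    db: MutexGuard<'a, Vec<String>>,
}

impl StoreLock {
    fn new() -> Self {
        Self { inner: Mutex::new(Vec::new()), acquisitions: AtomicUsize::new(0) }
    }

    /// Take the lock. In the real store this is the expensive step.
    fn lock(&self) -> StoreGuard<'_> {
        self.acquisitions.fetch_add(1, Ordering::Relaxed);
        StoreGuard { db: self.inner.lock().unwrap() }
    }

    fn acquisitions(&self) -> usize {
        self.acquisitions.load(Ordering::Relaxed)
    }
}

/// Requires an already-held guard instead of locking on its own.
fn save_events(guard: &mut StoreGuard<'_>, room: &str, events: &[&str]) {
    for event in events {
        guard.db.push(format!("{room}:{event}"));
    }
}

/// Takes the lock once for the whole sync update and passes it down.
/// Returns the number of events stored so far.
fn handle_sync_update(lock: &StoreLock, rooms: &[(&str, Vec<&str>)]) -> usize {
    let mut guard = lock.lock();
    for (room, events) in rooms {
        save_events(&mut guard, room, events);
    }
    guard.db.len()
}

fn main() {
    let lock = StoreLock::new();
    let rooms = vec![("!a", vec!["e1", "e2"]), ("!b", vec!["e3"])];
    let stored = handle_sync_update(&lock, &rooms);
    println!("stored {stored} events with {} lock acquisition(s)", lock.acquisitions());
}
```

Because `save_events` takes a `&mut StoreGuard`, it cannot be called without the lock, and it has no way to take the lock a second time. With a single coarse guard there is also no lock ordering to get wrong; the dining philosophers concern would only come back if this were split into several finer-grained guards held together.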

See also

  • https://github.com/matrix-org/matrix-rust-sdk/pull/5426 tried to process updates in parallel in the event cache, which led to a slowdown; we never had time to investigate why.
  • https://github.com/matrix-org/matrix-rust-sdk/issues/5410 would help to separate responsibilities in the event cache, and likely lower contention around the cross-process / sqlite write locks.
  • https://github.com/matrix-org/matrix-rust-sdk/issues/5500 would block the sync updates on the event cache updates, which might make the problem more prominent (i.e. easier to identify, but also may slow down everyone).

bnjbvr, Aug 19 '25 14:08

This issue was originally named "Bad contention of the cross-process lock / sqlite write locks", but I'm less and less sure it is only about that, so I'm broadening it a bit here.

bnjbvr, Aug 21 '25 09:08

I've found https://github.com/matrix-org/matrix-rust-sdk/issues/5572; maybe it's the cause of the slowness we were looking for.

Hywan, Aug 25 '25 07:08