matrix-rust-sdk
[meta] Event Cache API 🕸
Context
After working for a bit on the Rust SDK, I've observed that a layer seems to be missing between the high-level Timeline in the UI crate and the Rust SDK itself. In addition to rendering the events themselves, the Timeline is involved in a lot of functionality:
- it computes the read receipts displayed on each event in a room,
- it runs backpagination, keeping track of the `prev_batch` tokens, re-spawning requests to `/messages` until it reaches a desired amount of messages, or the end of a timeline,
- it retries decrypting events after new keys have been received, be it from the encryption sync loop, or from the backup storage
These extra responsibilities make the Timeline more of a view-and-controller bundle, as opposed to a pure view over the events it receives. This tight coupling is an advantage (the code freely does what it needs when it needs to), but it also comes with a few drawbacks: it's harder to test and fix a logic bug in isolation without rendering a timeline, and all that functionality is not reusable by other clients that don't want to render events as a timeline.
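To make the backpagination responsibility from the list above concrete, here is a minimal sketch of such a loop, written so it can be tested without rendering anything. Everything in it (`Page`, `Outcome`, `Backpagination`, `backpaginate`, `fake_server`) is a hypothetical name invented for illustration, not the SDK's API, and the "server" is an in-memory stand-in for `/messages`.

```rust
/// One page of older events, loosely shaped like a `/messages` response.
/// (Hypothetical type for this sketch.)
#[derive(Debug, Clone)]
pub struct Page {
    pub events: Vec<String>,
    /// Token for the next (older) page; `None` means the start of the
    /// room's history has been reached.
    pub prev_batch: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// At least the desired number of events was collected.
    ReachedTarget,
    /// There is nothing older left to fetch.
    ReachedStart,
}

#[derive(Debug)]
pub struct Backpagination {
    pub events: Vec<String>,
    /// Token to resume from on the next call, if any.
    pub next_token: Option<String>,
    pub outcome: Outcome,
}

/// Keeps requesting pages until `wanted` events have been collected or the
/// start of the timeline is hit, tracking the `prev_batch` token in between.
pub fn backpaginate(
    mut fetch: impl FnMut(Option<&str>) -> Page,
    mut token: Option<String>,
    wanted: usize,
) -> Backpagination {
    let mut events = Vec::new();
    loop {
        let page = fetch(token.as_deref());
        events.extend(page.events);
        token = page.prev_batch;

        if events.len() >= wanted {
            return Backpagination { events, next_token: token, outcome: Outcome::ReachedTarget };
        }
        if token.is_none() {
            return Backpagination { events, next_token: None, outcome: Outcome::ReachedStart };
        }
    }
}

/// In-memory stand-in for a homeserver: the token is simply the index of
/// the page to return, and no token means "the most recent page".
pub fn fake_server(pages: Vec<Page>) -> impl FnMut(Option<&str>) -> Page {
    move |from: Option<&str>| {
        let index = from.map_or(0, |t| t.parse::<usize>().expect("token is a page index"));
        pages[index].clone()
    }
}
```

Because the loop only depends on a fetch function, a logic bug in it could be reproduced against `fake_server` alone, which is the kind of isolation the coupled design makes hard.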
Since rendering a timeline requires subscribing to the room it belongs to, and some other features concern the entire room list rather than a specific room, some of that functionality is re-implemented elsewhere:
- renewed attempts to decrypt still-encrypted events happen when computing the `latest_event` for a room, so that we have something to display in the room list,
- the room unread markers (namely, the unread counts: number of new messages/notifications/mentions) are also needed for all the rooms, not only a room we'd be visiting (or the whole point of the unread marker would be lost),
- in addition to that, while the latest events and unread counts are technically information derived (computed) from other events, they are serialized and stored into the database as part of the `RoomInfo` struct, in the state store,
- the scattering of unread counts and read receipts has caused some impedance mismatch, exemplified by https://github.com/matrix-org/matrix-rust-sdk/pull/3054 (the unread counts computation expected read markers even for our own events, while the timeline code handling read receipts took into account implicit read receipts)
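The mismatch in the last point can be illustrated with a small sketch in which a single function decides how far the user has read, and the unread count is derived from it. This is only an illustration under simplifying assumptions (one linear list of events, oldest first, every event counting as a message); `Event`, `event`, `last_read_index` and `unread_count` are hypothetical names, not the SDK's.

```rust
/// Minimal event representation for this sketch (hypothetical type).
#[derive(Debug, Clone)]
pub struct Event {
    pub id: String,
    pub sender: String,
}

/// Convenience constructor used in the usage example.
pub fn event(id: &str, sender: &str) -> Event {
    Event { id: id.to_owned(), sender: sender.to_owned() }
}

/// Index of the latest event that `me` is considered to have read.
///
/// Two sources are combined:
/// - the explicit read receipt, if it points at a known event,
/// - the implicit receipt: sending an event implies having read
///   everything up to and including it.
pub fn last_read_index(events: &[Event], me: &str, explicit_receipt: Option<&str>) -> Option<usize> {
    let explicit = explicit_receipt.and_then(|id| events.iter().position(|e| e.id == id));
    let implicit = events.iter().rposition(|e| e.sender == me);
    // `None` compares lower than any `Some(_)`, so this keeps the later one.
    explicit.max(implicit)
}

/// Number of events after the read position. By construction none of them
/// were sent by `me`, since the implicit receipt covers our last own event.
pub fn unread_count(events: &[Event], me: &str, explicit_receipt: Option<&str>) -> usize {
    let first_unread = last_read_index(events, me, explicit_receipt).map_or(0, |i| i + 1);
    events.len() - first_unread
}
```

If both the unread counts and the rendered read receipts went through one function like `last_read_index`, they could not disagree about whether our own events count as read.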
The Proposal
Overall, my take is that all this kind of functionality is really nice to have, when writing any Matrix client. While it's possible that we could continue like we've done so far, I think that having a new abstraction to provide all this functionality, independently of a given observed room, would be super nice and useful. I call this new abstraction the Event Graph (API, if you insist).
The goals are the following:
- make the Timeline (mostly) a view over the events observed via the Event Graph, for a clearer and cleaner separation of responsibilities
- make the code more modular and easier to test in isolation, to improve maintainability
- unify some functionality that's scattered in multiple places (notably automatic re-decryption, unread markers with read receipts)
- make the Event Graph persistent (circumventing the need for a specific Timeline API cache, mentioned in #1103)
  - if a Timeline is just a view of the event graph, then a Timeline can easily be re-rendered from the cache
  - the storage of the event graph could be considered an on-disk cache, that we could remove/empty at will (the only consequence would be degrading the user experience by causing new backpagination for events we had before emptying the cache)
  - we could make it work across processes right from the beginning, allowing us to reuse events received from a notification sync into a main timeline
- pave the way for future features, like full-text search
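As a rough sketch of the first goal, the snippet below shows a Timeline that is only a view: it reads the events already known for a room and then follows updates. It is in-memory, single-process and synchronous, so it ignores persistence and cross-process concerns entirely. The names (`EventCache`, as in the issue title, `RoomEvents`, `Timeline` and their methods) are assumptions made for this illustration and do not reflect the eventual API.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Per-room storage: the events known so far, plus the live subscribers.
#[derive(Default)]
struct RoomEvents {
    events: Vec<String>,
    subscribers: Vec<Sender<String>>,
}

/// Owns the events of every room, whether or not a timeline is open.
#[derive(Default)]
pub struct EventCache {
    rooms: HashMap<String, RoomEvents>,
}

impl EventCache {
    /// Records a new event and forwards it to the room's subscribers.
    pub fn add_event(&mut self, room: &str, event: &str) {
        let room = self.rooms.entry(room.to_owned()).or_default();
        room.events.push(event.to_owned());
        // Sending fails once the receiving side is gone: drop those senders.
        room.subscribers.retain(|tx| tx.send(event.to_owned()).is_ok());
    }

    /// Returns the events known so far and a channel of future ones.
    pub fn subscribe(&mut self, room: &str) -> (Vec<String>, Receiver<String>) {
        let room = self.rooms.entry(room.to_owned()).or_default();
        let (tx, rx) = channel();
        room.subscribers.push(tx);
        (room.events.clone(), rx)
    }

    /// Empties the cache; later subscribers start from nothing and would
    /// need to backpaginate again.
    pub fn clear(&mut self) {
        for room in self.rooms.values_mut() {
            room.events.clear();
        }
    }
}

/// A pure view: it renders what the cache hands over and holds no logic
/// of its own for fetching or storing events.
pub struct Timeline {
    items: Vec<String>,
    updates: Receiver<String>,
}

impl Timeline {
    pub fn new(cache: &mut EventCache, room: &str) -> Self {
        let (items, updates) = cache.subscribe(room);
        Timeline { items, updates }
    }

    /// Pulls in the events received since the last refresh.
    pub fn refresh(&mut self) {
        self.items.extend(self.updates.try_iter());
    }

    pub fn items(&self) -> &[String] {
        &self.items
    }
}
```

In this shape a timeline can be dropped and rebuilt from the cache at any time, and the cache keeps receiving events for rooms that have no timeline open.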
Unknowns
- One thing to investigate is whether we could make the Event Graph have no in-memory caches of the on-disk data. This would remove the need for the cross-process lock altogether by considering the database as "the single source of truth", but it might have a bad impact on performance.
- #1103 contains a few interesting remarks and caveats about memory usage and having a timeline cache.
- it's a bit of work to disentangle all the code in the timeline, and it's hard to estimate the time it'll take.
Drawbacks
- it may cause a bit of churn by introducing new bugs, while we try to add new features at the same time
- some problems may be challenging to solve, e.g. reconciling the cache with new sync events / backpagination. (@Hywan and I dare to call this "fun".)
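To give a feel for the reconciliation problem, here is a deliberately naive sketch that only handles the easiest part: deduplicating by event ID when the same event arrives both from sync and from backpagination. It assumes a single linear list with no gaps, which is exactly what makes the real problem harder. `RoomEvents` and its methods are hypothetical names for this illustration.

```rust
use std::collections::HashSet;

/// Events of one room, oldest first, with a set of known IDs for deduplication.
#[derive(Default)]
pub struct RoomEvents {
    events: Vec<String>,
    known: HashSet<String>,
}

impl RoomEvents {
    /// Appends events coming from a sync (oldest first), skipping the ones
    /// already known. Returns how many were actually added.
    pub fn extend_from_sync(&mut self, incoming: &[&str]) -> usize {
        let mut added = 0;
        for id in incoming {
            // `insert` returns false when the ID was already present.
            if self.known.insert((*id).to_owned()) {
                self.events.push((*id).to_owned());
                added += 1;
            }
        }
        added
    }

    /// Prepends events coming from backpagination (oldest first), skipping
    /// the ones already known. Returns how many were actually added.
    pub fn prepend_from_backpagination(&mut self, older: &[&str]) -> usize {
        let mut fresh = Vec::new();
        for id in older {
            if self.known.insert((*id).to_owned()) {
                fresh.push((*id).to_owned());
            }
        }
        let added = fresh.len();
        // Put the newly learned older events in front of the existing ones.
        fresh.append(&mut self.events);
        self.events = fresh;
        added
    }

    pub fn events(&self) -> &[String] {
        &self.events
    }
}
```

What this sketch does not address is knowing whether the cached events and the newly received ones are contiguous, which is where most of the "fun" is likely to be.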
TL;DR
This is a proposal to move some functionality out of the timeline and the matrix-sdk-base client into a new Event Graph API, to make it simpler to add new features in the future. This will benefit all the users of the Rust SDK, including Fractal and all the Element(X) apps.
Subprojects
- [x] initial implementation to get a sense of how components would fit together
- [x] move backpagination support into the event graph (EG), so EG is aware of all events within a room
- #3280
- #3872
- #4112
- #4113
(moved to OP)
Finally read this through fully. fwiw, from my side, the overall idea is great. However (as per our chat on mon):
- Please don't call it an 'event graph' API - we use this exclusively in Matrix to refer to the federation DAG (e.g. grep https://spec.matrix.org/unstable/client-server-api/ and friends for `event graph`)
- A better name might be Event Store (or Event Cache, although cache overemphasizes transience?)
- The idea of pushing the timeline API room-pagination logic down into a general-purpose "please do stuff with events from this room" seems very sane
`Event Cache` is nice: it had been suggested to me privately by someone else, and it also makes it clear that it can be cleared at any time, "just" resulting in a lesser user experience. Also I haven't had any other suggestions, and I'd like to move on with the renaming, so let's settle on `EventCache` for the moment :grin:
Something we want to be sure to not reproduce in the Rust SDK: https://github.com/element-hq/element-web/issues/18325#issuecomment-1264038949
A few additional things to not reproduce in the Rust SDK:
- https://github.com/element-hq/element-web/issues/23393
- https://github.com/element-hq/element-web/issues/20927#issuecomment-1820465908