Record source state changes and expose them via the API
To enable more effective synchronization strategies, such as those discussed in https://github.com/freedomofpress/securedrop-client/issues/1549, we will very likely want to let clients fetch only sources whose state has changed. State changes may include:
- New source messages/files (covered by the existing `last_updated` timestamp)
- New journalist replies
- Star status changes
- Changes to seen/unseen state of any associated messages or files
- Deletion of files or messages without deletion of the source itself
- "Read receipts" from sources (triggered when a source "deletes" journalist replies)
Any such state change should cause a client to update its representation of the source.
This could be implemented as, e.g.:
- a timestamp
- a version integer
A timestamp may be appealing because it would let the client specify a cutoff in the sync query itself and receive only the sources updated after that time. However, this would require additional support to track deletions of entire sources, since a deleted source no longer has a record to return.
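As a rough client-side sketch of the two options above (not a proposed implementation): the dictionary shapes, the `uuid` field, and the function names `changed_since` and `stale_by_version` are assumptions for illustration; only `last_updated` exists today. It also assumes the version-based response lists every source, which is what would let the client notice deletions.

```python
from datetime import datetime, timezone


def changed_since(sources, cutoff):
    """Timestamp option: return UUIDs of sources updated after `cutoff`.

    `sources` is a hypothetical list of dicts with an ISO 8601
    `last_updated` string. Deleted sources never appear here, which is
    why this option needs extra support for tracking deletions.
    """
    return [
        s["uuid"]
        for s in sources
        if datetime.fromisoformat(s["last_updated"]) > cutoff
    ]


def stale_by_version(local_versions, remote_versions):
    """Version option: compare {uuid: version} maps.

    Returns (to_fetch, to_delete). Assumes the server lists every
    source, so a UUID missing from the response has been deleted.
    """
    to_fetch = sorted(
        uuid
        for uuid, version in remote_versions.items()
        if local_versions.get(uuid) != version
    )
    to_delete = sorted(
        uuid for uuid in local_versions if uuid not in remote_versions
    )
    return to_fetch, to_delete


# Example usage with made-up data.
cutoff = datetime(2024, 1, 2, tzinfo=timezone.utc)
listing = [
    {"uuid": "a", "last_updated": "2024-01-01T00:00:00+00:00"},
    {"uuid": "b", "last_updated": "2024-01-03T00:00:00+00:00"},
]
print(changed_since(listing, cutoff))
print(stale_by_version({"a": 1, "b": 2, "c": 1}, {"a": 1, "b": 3, "d": 1}))
```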
(This is an alternative to an API endpoint that would return a changes-feed. It is orthogonal to #4863, which should be considered independently if we want to show timestamps for the most recent journalist activity.)
Acceptance criteria
Given that I am maintaining a client that interacts with the server via the API
When I query the list of sources
Then I should be able to determine from the response whether I need to fetch updates for any given source
If (as under consideration elsewhere) the Client were to consume the API's JSON representations and persist them locally, another strategy for incremental updates would be:
- The bulk endpoint (like freedomofpress/securedrop-client#1549) for a given resource returns a list of `(key, hash(value))` pairs (realistically, a `{key: hash(value)}` dictionary), where `value` is the JSON object for `key`.
- Initially, the Client requests all `key`s, retrieves their `value`s, and saves their `hash(value)`s.
- Whenever a resource changes on the Server, `value` changes, so `hash(value)` changes, so the Client knows to update that `key`.
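The steps above could look roughly like this. It is a minimal sketch, assuming SHA-256 over a canonical JSON serialization as the hash; the function names `hash_value`, `build_index`, and `diff_index` are hypothetical, and nothing here is tied to the real API.

```python
import hashlib
import json


def hash_value(value):
    """Hash a JSON-serializable value (assumed scheme: SHA-256 of canonical JSON)."""
    # sort_keys makes the serialization independent of dict ordering.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_index(resources):
    """Server side: build the {key: hash(value)} dictionary."""
    return {key: hash_value(value) for key, value in resources.items()}


def diff_index(local_index, remote_index):
    """Client side: return (keys to fetch, keys to drop locally)."""
    changed = sorted(
        key
        for key, digest in remote_index.items()
        if local_index.get(key) != digest
    )
    removed = sorted(key for key in local_index if key not in remote_index)
    return changed, removed


# Example usage with made-up resources.
server = {"s1": {"starred": False}, "s2": {"starred": False}}
local_index = build_index(server)  # initial full sync

server["s1"]["starred"] = True     # a state change
server["s3"] = {"starred": False}  # a new resource
del server["s2"]                   # a deletion

print(diff_index(local_index, build_index(server)))
```

Because the index lists every key, a key that disappears from it signals a deletion without any separate tracking.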
Pros:
- No need to enforce a monotonic version counter on the Server: `hash(value)` is just `hash(instance.to_json())` for a given SQLAlchemy model `instance`.
- No need to check logic around a monotonic version counter on the Client. It's already harder than it should be to manage the source-level submission counter used to index conversation items; let's not add another.
Cons:
- Larger on the wire: If `key` is a UUID, `(key, hash(value))` will be roughly twice the size of `(key, version)` for an integer `version`. (But even this would only bump freedomofpress/securedrop-client#1549's example from 15 to 30 KiB, still much smaller than the current 1 MiB.)
- More expensive to compute. But writes are rare, and we control all of them, so we can just cache `hash(value)` at the same time as we write `value` (or the data from which it's derived).
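Caching the hash at write time might look something like the following. This is a plain-Python stand-in rather than a SQLAlchemy model: `CachedResource`, `value_hash`, and `update` are hypothetical names, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import json


class CachedResource:
    """Hypothetical stand-in for a model that caches hash(value) on write."""

    def __init__(self, uuid, **fields):
        self.uuid = uuid
        self._fields = dict(fields)
        self.value_hash = self._compute_hash()

    def to_json(self):
        # Sorted keys keep the serialization, and so the hash, stable.
        return json.dumps({"uuid": self.uuid, **self._fields}, sort_keys=True)

    def _compute_hash(self):
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()

    def update(self, **changes):
        """Single write path: change the value and refresh the cached hash."""
        self._fields.update(changes)
        self.value_hash = self._compute_hash()


# Example usage: the bulk endpoint would read value_hash without recomputing it.
resource = CachedResource("abc", starred=False)
original_hash = resource.value_hash
resource.update(starred=True)
print(resource.value_hash != original_hash)
```

The sketch relies on every write going through one path, which matches the point above that we control all writes.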