synapse icon indicating copy to clipboard operation
synapse copied to clipboard

In sync wait for worker to catch up since token

Open erikjohnston opened this issue 1 year ago • 5 comments

Otherwise things will get confused.

An alternative would be to make sure that for lagging stream we don't return anything (and make sure the returned next_batch token doesn't go backwards). But that is a faff.

erikjohnston avatar May 18 '24 11:05 erikjohnston

Oh bleurgh, the typing stream can legitimately go backwards :roll_eyes:

erikjohnston avatar May 18 '24 12:05 erikjohnston

somewhat-flakey Complement failure is because the events stream (and possibly another) is going backwards. I don't know why this happens but we've noticed it before in Complement.

Waiting for current token to reach StreamToken(room_key=RoomStreamToken(stream=9, instance_map=immutabledict({}), topological=None), presence_key=12, typing_key=0, receipt_key=MultiWriterStreamToken(stream=1, instance_map=immutabledict({})), account_data_key=1, push_rules_key=1, to_device_key=1, device_list_key=6, groups_key=0, un_partial_stated_rooms_key=4); currently at StreamToken(room_key=RoomStreamToken(stream=8, instance_map=immutabledict({}), topological=None), presence_key=16, typing_key=0, receipt_key=MultiWriterStreamToken(stream=1, instance_map=immutabledict({})), account_data_key=1, push_rules_key=1, to_device_key=1, device_list_key=7, groups_key=0, un_partial_stated_rooms_key=4)

as an example of where it gets stuck, when locally enhanced with logging

reivilibre avatar May 18 '24 13:05 reivilibre

Right, so the problem is that there are occasions where we ask for a new stream ID from the generator, but then don't persist it in the DB. If this happens before a restart then the StreamIDGenerator will try and read the largest used value in the DB, and so the stream will start with an ID less than what it was before the restart.

This doesn't affect MultiWriterIdGenerator as on Postgres that is backed by a proper DB sequence.

I'm not quite sure what the right way forward is her here, ideally we'd make it so that stream IDs must be used, but that is hard.

erikjohnston avatar May 20 '24 11:05 erikjohnston

I'm not quite sure what the right way forward is her here, ideally we'd make it so that stream IDs must be used, but that is hard.

Actually, maybe we use stream_positions table, as https://github.com/matrix-org/synapse/pull/8374 sounds basically the same problem.

erikjohnston avatar May 20 '24 12:05 erikjohnston

I'm not quite sure what the right way forward is her here, ideally we'd make it so that stream IDs must be used, but that is hard.

Actually, maybe we use stream_positions table, as matrix-org/synapse#8374 sounds basically the same problem.

https://github.com/element-hq/synapse/pull/17226

erikjohnston avatar May 22 '24 12:05 erikjohnston