background_drop_invalid_event_edges_rows failed

Open tusooa opened this issue 2 years ago • 5 comments

Description

The background update that drops invalid event_edges rows (background_drop_invalid_event_edges_rows) failed.

Steps to reproduce

  • Start Synapse

Homeserver

tusooa.xyz

Synapse Version

1.90.0

Installation Method

Docker (matrixdotorg/synapse)

Database

PostgreSQL 13, single server. Ported from SQLite with portdb: yes. Restored from a backup: yes, once.

Workers

Multiple workers

Platform

Ubuntu 22.04, Kubernetes (4-node kubeadm cluster)

Configuration

No response

Relevant log output

2023-08-27 23:49:32,646 - synapse.storage.background_updates - 302 - ERROR - background_updates-0 - Error doing update
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/background_updates.py", line 294, in run_background_updates
    result = await self.do_next_background_update(sleep)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/background_updates.py", line 424, in do_next_background_update
    await self._do_background_update(desired_duration_ms)
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/background_updates.py", line 467, in _do_background_update
    items_updated = await update_handler(progress, batch_size)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/events_bg_updates.py", line 1408, in _background_drop_invalid_event_edges_rows
    done = await self.db_pool.runInteraction(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 924, in runInteraction
    return await delay_cancellation(_runInteraction())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/internet/defer.py", line 1693, in _inlineCallbacks
    result = context.run(
             ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/failure.py", line 518, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 890, in _runInteraction
    result = await self.runWithConnection(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 1019, in runWithConnection
    return await make_deferred_yieldable(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/threadpool.py", line 244, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/context.py", line 117, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/python/context.py", line 82, in callWithContext
    return func(*args, **kw)
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
    result = func(conn, *args, **kw)
             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 1012, in inner_func
    return func(db_conn, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 752, in new_transaction
    r = func(cursor, *args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/databases/main/events_bg_updates.py", line 1403, in drop_invalid_event_edges_txn
    txn.execute(
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 417, in execute
    self._do_execute(self.txn.execute, sql, parameters)
  File "/usr/local/lib/python3.11/site-packages/synapse/storage/database.py", line 469, in _do_execute
    return func(sql, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.ForeignKeyViolation: insert or update on table "event_edges" violates foreign key constraint "event_edges_event_id_fkey"
DETAIL:  Key (event_id)=($ukE7gZZjUyJ6AWo3j-yziYIwtm5QBWh4_V-AwoChtqs) is not present in table "events".
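
The error above says that a row in `event_edges` refers to an `event_id` with no matching row in `events`. A minimal sketch of how such orphaned rows could be listed follows. It runs against an in-memory SQLite database with a toy two-table schema; the schema, the sample IDs, and the helper names are illustrative assumptions, and only the table and column names `event_edges`, `events`, and `event_id` come from the error message.

```python
import sqlite3

# Anti-join: event_edges rows whose event_id is missing from events.
ORPHAN_QUERY = """
    SELECT ee.event_id
    FROM event_edges AS ee
    LEFT JOIN events AS e ON e.event_id = ee.event_id
    WHERE e.event_id IS NULL
    ORDER BY ee.event_id
"""


def find_orphaned_edges(conn: sqlite3.Connection) -> list[str]:
    """Return event_ids referenced by event_edges but absent from events."""
    return [row[0] for row in conn.execute(ORPHAN_QUERY)]


def make_toy_db() -> sqlite3.Connection:
    """Build a tiny illustrative database (not Synapse's real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE event_edges (event_id TEXT, prev_event_id TEXT)")
    conn.executemany("INSERT INTO events VALUES (?)", [("$a",), ("$b",)])
    conn.executemany(
        "INSERT INTO event_edges VALUES (?, ?)",
        [("$b", "$a"), ("$missing", "$b")],
    )
    return conn
```

Calling `find_orphaned_edges(make_toy_db())` lists the one edge whose event is missing. On a real homeserver the same kind of read-only `SELECT` could be tried from `psql`, preferably against a backup first, since it may be slow on large tables.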

Anything else that would be useful to know?

No response

tusooa • Aug 28 '23 02:08

Relevant source:

https://github.com/matrix-org/synapse/blob/2b78981736f9004f99b1760e3e77b234f92755a7/synapse/storage/databases/main/events_bg_updates.py#L1402-L1406

> used portdb. yes

I fear this is a consequence of https://github.com/matrix-org/synapse/issues/13191 :(

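The safest option would be to purge the room:

  • find the room ID with SELECT room_id FROM events WHERE event_id = '$ukE7gZZjUyJ6AWo3j-yziYIwtm5QBWh4_V-AwoChtqs';
  • then use the admin API to purge that room (https://matrix-org.github.io/synapse/latest/admin_api/rooms.html#version-2-new-version).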

Then restart Synapse and see if the error remains. Let us know if that solves the issue.
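
As a rough illustration of the purge step, the sketch below builds (but does not send) a request for the version 2 Delete Room admin API, `DELETE /_synapse/admin/v2/rooms/<room_id>`, with `purge` set to true. The function name, base URL, room ID, and access token are placeholders assumed for the example; the token would need to belong to a server admin.

```python
import json
import urllib.parse
import urllib.request


def build_purge_request(
    base_url: str, room_id: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a Delete Room API v2 request that purges a room."""
    # Room IDs contain '!' and ':', so percent-encode them for the URL path.
    quoted_room = urllib.parse.quote(room_id, safe="")
    url = f"{base_url.rstrip('/')}/_synapse/admin/v2/rooms/{quoted_room}"
    body = json.dumps({"purge": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="DELETE",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it. According to the admin API documentation, the response should carry a `delete_id` that can be used to poll the deletion status.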

DMRobertson • Aug 29 '23 14:08

> Relevant source:
>
> https://github.com/matrix-org/synapse/blob/2b78981736f9004f99b1760e3e77b234f92755a7/synapse/storage/databases/main/events_bg_updates.py#L1402-L1406
>
> > used portdb. yes
>
> I fear this is a consequence of #13191 :(
>
> The safest option would be to purge the room, using
>
> * `SELECT room_id FROM events WHERE event_id = '$ukE7gZZjUyJ6AWo3j-yziYIwtm5QBWh4_V-AwoChtqs';`
> * then use the [admin API to purge the room](https://matrix-org.github.io/synapse/latest/admin_api/rooms.html#version-2-new-version).
>
> Then restart Synapse and see if the error remains. Let us know if that solves the issue.

When I tried to purge the room, it failed with another error:

{"status":"failed","shutdown_room":{"kicked_users":[],"failed_to_kick_users":[],"local_aliases":[],"new_room_id":null},"error":"canceling statement due to statement timeout\nCONTEXT:  SQL statement \"SELECT 1 FROM ONLY \"public\".\"room_memberships\" x WHERE $1 OPERATOR(pg_catalog.=) \"event_stream_ordering\" FOR KEY SHARE OF x\"\n"}
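
For triage, a status body like the one above can be reduced to a few fields. This is a small sketch only: the function name and the returned keys are invented for illustration, and the `status` and `error` field names are taken from the response shown above.

```python
import json


def summarize_delete_status(raw: str) -> dict:
    """Reduce a delete-status JSON body to the fields that matter for triage."""
    payload = json.loads(raw)
    error = payload.get("error") or ""
    return {
        "status": payload.get("status"),
        # Keep only the first line of the error; the rest is SQL context.
        "error_summary": error.splitlines()[0] if error else None,
        # True when PostgreSQL cancelled the statement for running too long.
        "timed_out": "statement timeout" in error,
    }
```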

tusooa • Sep 11 '23 21:09

> When I tried to purge the room, it failed with another error:

This looks like a regression in https://github.com/matrix-org/synapse/pull/15853. When there's a fix for #16322, please try purging that room again.

DMRobertson • Sep 18 '23 10:09

> When there's a fix for #16322, please try purging that room again.

A fix (https://github.com/matrix-org/synapse/pull/16455) landed in Synapse 1.95. Have you had the chance to try re-purging the room?

DMRobertson • Nov 23 '23 16:11

> > When there's a fix for #16322, please try purging that room again.
>
> A fix (#16455) landed in Synapse 1.95. Have you had the chance to try re-purging the room?

It still failed on v1.97.0:

2023-12-12 06:49:39,218 - synapse.storage.txn - 780 - WARNING - task-shutdown_and_purge_room-0-lZuEfLZoWVFejzsJ-!YTvKGNlinIzlkMTVRl:matrix.org - [TXN OPERROR] {purge_room-47c} canceling statement due to statement timeout
2023-12-12 06:49:39,245 - synapse.util.task_scheduler - 362 - ERROR - task-shutdown_and_purge_room-0-lZuEfLZoWVFejzsJ - scheduled task lZuEfLZoWVFejzsJ failed

tusooa • Dec 13 '23 00:12