bot icon indicating copy to clipboard operation
bot copied to clipboard

Antispam attempts to create duplicate DeletedMessages in the DB

Open MarkKoz opened this issue 5 years ago • 2 comments

Sentry Issue: BOT-53

ResponseCodeError: Status: 400 Response: {'deletedmessage_set': [{}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {}, {'id': ['deleted message with this id already exists.']}, {'id': ['deleted message with this id already exists.']}, {'id': ['deleted message with this id already exists.']}, {'id': ['deleted message with this id already exists.']}, {'id': ['deleted message with this id already exists.']}, {'id': ['deleted message with this id already exists.']}, {'id': ['deleted message wi...
(1 additional frame(s) were not displayed)
...
  File "bot/cogs/antispam.py", line 80, in upload_messages
    url = await modlog.upload_log(self.messages.values(), actor_id, self.attachments)
  File "bot/cogs/moderation/modlog.py", line 57, in upload_log
    response = await self.bot.api_client.post(
  File "bot/api.py", line 127, in post
    return await self.request("POST", endpoint, raise_for_status=raise_for_status, **kwargs)
  File "bot/api.py", line 114, in request
    await self.maybe_raise_for_status(resp, raise_for_status)
  File "bot/api.py", line 104, in maybe_raise_for_status
    raise ResponseCodeError(response=response, response_json=response_json)

Task exception was never retrieved
future: <Task finished name='Task-1722931' coro=<AntiSpam._process_deletion_context() done, defined at /bot/bot/cogs/antispam.py:251> exception=ResponseCodeError()>

Background

When AntiSpam receives a message, it gets the recent message history in the channel.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L156-L167

It then passes the messages to each rule to check if any rule was violated.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L169-L178

When it finds a violated rule, it creates a deletion context. This context is associated with the channel. Any subsequent rule violations in the same channel will attempt to use the existing context rather than creating a new one.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L191-L194

When a rule is found, it breaks and doesn't check any other rules for that initial message.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L212

Each rule violation adds itself along with the relevant message history to the context.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L196-L201

When adding rules to the context, any messages already in the context are ignored.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L59-L61

10 seconds after the context is created, it is popped from the dictionary and uploaded to the DB.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L254-L261

The API will create a DeletedMessage for each message given; it currently doesn't support specifying existing DeletedMessages.

Problem

I believe that this error happens because separate rules, in separate contexts, share some messages in their history. Keep in mind that the API always tries to create any messages given. This leads to the "deleted message with this id already exists" errors.

The context is only around for 10 seconds. It is possible for a rule to demand more than 10 seconds of message history. Such a rule can be violated after the context has already been uploaded. This will result in a new context being created with some message history that is also present in an uploaded context. This wouldn't be a problem if the context was the same rather than new, as it would have ignored adding duplicate messages.

Of course, the first rule violation will try to delete the messages.

https://github.com/python-discord/bot/blob/1d0cbbeb46a811b5a049d712aae1a90a1f3a7359/bot/cogs/antispam.py#L211

However, messages are added to the context regardless of whether their deletion succeeds. In fact, they're added before the deletion is even attempted.

Furthermore, messages take some time to be deleted. A second on_message can manage to get the history before a previous rule violation deletes the messages. Keep in mind retrieving the message history is basically the first thing on_message does. The second on_message may be triggered just after the 10 seconds runs out for the first context, or it may take so long to find a rule violation that the 10 second window for the context runs out. Thus, it ends up needing a new context. Remember, the context is only accessed once a rule violation has been found, not before.

Solution

I don't have one yet. This is what we need to discuss in this issue. We also need to confirm my analysis of the problem is correct.

MarkKoz avatar Jun 15 '20 19:06 MarkKoz

Just had this issue again, but with a slightly different response.

ResponseCodeError: Status: 400 Response: {'deletedmessage_set': [{'embeds': {'0': ["Tag embed must contain one of the fields {'image', 'description', 'video', 'title', 'fields'}."]}}, {'embeds': {'0': ["Tag embed must contain one of the fields {'image', 'description', 'video', 'title', 'fields'}."]}}, {'embeds': {'0': ["Tag embed must contain one of the fields {'image', 'description', 'video', 'title', 'fields'}."]}}, {'embeds': {'0': ["Tag embed must contain one of the fields {'image', 'description', 'video', 'title', 'fi...
  File "bot/cogs/antispam.py", line 260, in _process_deletion_context
    await deletion_context.upload_messages(self.bot.user.id, self.mod_log)
  File "bot/cogs/antispam.py", line 80, in upload_messages
    url = await modlog.upload_log(self.messages.values(), actor_id, self.attachments)
  File "bot/cogs/moderation/modlog.py", line 56, in upload_log
    response = await self.bot.api_client.post(
  File "bot/api.py", line 127, in post
    return await self.request("POST", endpoint, raise_for_status=raise_for_status, **kwargs)
  File "bot/api.py", line 114, in request
    await self.maybe_raise_for_status(resp, raise_for_status)
  File "bot/api.py", line 104, in maybe_raise_for_status
    raise ResponseCodeError(response=response, response_json=response_json)

Den4200 avatar Aug 20 '20 18:08 Den4200

The background for this issue is no longer relevant given the filters rewrite, but the problem still exists https://python-discord.sentry.io/issues/4251592323/

ResponseCodeError: Status: 400 Response: {'deletedmessage_set': [{'id': ['deleted message with this ID already exists.']}]}
  File "pydis_core/utils/scheduling.py", line 257, in _coro_wrapper
    await coro
  File "bot/exts/filtering/_filter_lists/antispam.py", line 131, in process_deletion_context
    await deletion_context.send_alert(self)
  File "bot/exts/filtering/_filter_lists/antispam.py", line 192, in send_alert
    embed = await build_mod_alert(new_ctx, {antispam_list: messages})
  File "bot/exts/filtering/_ui/ui.py", line 118, in build_mod_alert
    mod_alert_message += await _build_alert_message_content(ctx, len(mod_alert_message))
  File "bot/exts/filtering/_ui/ui.py", line 69, in _build_alert_message_content
    url = await upload_log(to_upload, bot.instance.user.id, ctx.uploaded_attachments)
  File "bot/utils/messages.py", line 270, in upload_log
    response = await bot.instance.api_client.post(
  File "pydis_core/site_api.py", line 133, in post
    return await self.request("POST", endpoint, raise_for_status=raise_for_status, **kwargs)
  File "pydis_core/site_api.py", line 120, in request
    await self.maybe_raise_for_status(resp, should_raise=raise_for_status)
  File "pydis_core/site_api.py", line 94, in maybe_raise_for_status
    raise ResponseCodeError(response=response, response_json=response_json)

There are two main solutions I see here:

  1. Be very strict with uploading deleted messages, they must be messages that your code deleted without error, not just attempted to delete and may have already been deleted.
  2. Change site to allow deleted messages to exist in multiple contexts.

Considering which one is actually more useful, I'd lean towards 2. For filters at least it seems more useful to have a full record of what triggered the filter, rather than excluding messages deleted with !clean, or by another filter.

wookie184 avatar Apr 18 '24 11:04 wookie184