Handle decoding errors in file inspection during filtering

Open sentry[bot] opened this issue 4 months ago • 5 comments

Sentry Issue: BOT-436

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf5 in position 14: invalid start byte
  File "bot/exts/filtering/filtering.py", line 237, in on_message
    await _extract_text_file_content(a)
  File "bot/exts/filtering/filtering.py", line 73, in _extract_text_file_content
    file_lines = file_content_bytes.decode(file_encoding).splitlines()

Unhandled exception in on_message.

It's not obvious to me what would be appropriate here. Block the file? Ignore unknown characters?

sentry[bot] avatar Aug 17 '25 17:08 sentry[bot]

@swfarnsworth your input would be appreciated

mbaruh avatar Aug 17 '25 17:08 mbaruh

@mbaruh I think this is the first time this has happened? We could just add errors='ignore'.
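A minimal sketch of what that change could look like, assuming the decode call from the traceback is the only place that needs it. The helper name `decode_lines` is hypothetical and not part of the bot's code; `errors="replace"` is shown alongside as an alternative that keeps a visible marker where bytes were dropped.

```python
def decode_lines(data: bytes, encoding: str = "utf-8", errors: str = "ignore") -> list[str]:
    """Decode bytes leniently and split into lines.

    Hypothetical helper: mirrors the failing line in the traceback,
    but passes an error handler so invalid bytes do not raise.
    """
    return data.decode(encoding, errors=errors).splitlines()


# 0xf5 is never a valid UTF-8 start byte, so strict decoding raises here.
sample = b"hello \xf5world\nsecond line"

ignored = decode_lines(sample)                      # invalid byte silently dropped
replaced = decode_lines(sample, errors="replace")   # invalid byte becomes U+FFFD
```

With `errors="ignore"` the filters still see every decodable character, but anything that was not valid text disappears without a trace.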

swfarnsworth avatar Aug 17 '25 18:08 swfarnsworth

In this case it seems like a user uploaded a zip file with a txt extension. I assume the same could happen if you tried to share an executable with a txt extension? If so, it could be an indication of somebody trying to bypass filters, though it seems rare, so not a high priority.
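If flagging such files ever became worthwhile, a rough sketch of a check might look like the following. The function name `looks_like_text` and the specific heuristics (zip signature, NUL bytes, strict decode) are assumptions for illustration, not anything the bot currently does.

```python
# Local file header signature that ordinary zip archives start with.
ZIP_MAGIC = b"PK\x03\x04"


def looks_like_text(data: bytes, encoding: str = "utf-8") -> bool:
    """Heuristically decide whether an attachment is really text.

    Hypothetical helper: returns False for content that starts with the
    zip signature, contains NUL bytes, or fails strict decoding.
    """
    if data.startswith(ZIP_MAGIC) or b"\x00" in data:
        return False
    try:
        data.decode(encoding)  # strict mode: raises on any invalid byte
    except UnicodeDecodeError:
        return False
    return True


plain = looks_like_text(b"just some notes\nsecond line")
zipped = looks_like_text(ZIP_MAGIC + b"\x14\x00\x00\x00\x08\x00")
invalid = looks_like_text(b"almost text \xf5")
```

A file that fails this check could be skipped or reported instead of being decoded leniently; which of those is appropriate is a policy decision this sketch does not make.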

wookie184 avatar Sep 15 '25 21:09 wookie184

What was the value of file_encoding in this error?

onerandomusername avatar Oct 22 '25 00:10 onerandomusername

> What was the value of file_encoding in this error?

"utf-8"

wookie184 avatar Nov 20 '25 08:11 wookie184