SmokeDetector
SmokeDetector copied to clipboard
Watching a multiline string causes problems
[!CAUTION] Do not attempt to reproduce this on the production SmokeDetector instance. It will very likely break stuff and cause problems.
Summary
Attempting to use !!/watch-force
to watch a multiline chat message breaks the watched_keywords.txt
file by improperly inserting a newline.
Background information
There was a desire to modify a watch in CHQ, but the watch was longer than the limit for a 'normal' chat message. This isn't inherently an issue if it gets modified without the use of chat.
While multiline messages can get around that restriction, that also introduces a newline character (\n
). From the regex perspective, that can be addressed (either by adding {0}
directly after the newline or by starting a regex comment before the newline and ending it afterwards, as noted by @makyen).
However, it also breaks SmokeDetector as that inserts a newline directly into the watched_keywords.txt
file, which breaks it. This was wisely foreseen by Makyen, hence not testing on production.
Details
- For reference, here is the string I tested with. The newline (created with Shift + Enter) has been replaced with
\n
for this GitHub issue.000000[trimmed for brevity]000000(?# foo\nbar)
-
SD can receive multiline chat messages, but at least in
!!/bisect
, it treats it as a\n
. It just handles them problematically for watches (and presumably blacklists). - Attempting to watch a multiline string in chat, as I tried to here, breaks stuff. Specifically, it produced this as the end of
watched_keywords.txt
.
And this console error:65274 1723875289 cocomac 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000(?# foo 65275 bar)
[06:19:34.736] watched_keywords.txt:65275:not enough values to unpack (expected 3, got 1) File "/home/ubuntu/SmokeDetector/blacklists.py", line 105, in parse when, by_whom, what = line.rstrip().split('\t') ^^^^^^^^^^^^^^^^^^^ [06:19:40.636] Global blacklists loaded
- Here is the relevant part of the console logs after attempting a bisect of some text that would match the regex.
[06:21:42.072] Command received: !!/bisect 00000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000<span>…</span> [06:21:50.007] Attempted to send status ping but metasmoke_host is undefined. Not sent. [06:22:10.883] regex._regex_core.error: missing ) at position 975 2024-08-17 06:22:10.878529 UTC File "/home/ubuntu/SmokeDetector/chatcommunicate.py", line 530, in __call__ result = self.__func__(*processed_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1691, in bisect results_bookended = bisect_regex_in_n_size_chunks(64, s, bookended_regexes, bookend=True, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1624, in bisect_regex_in_n_size_chunks results = bisect_regex(test_text, chunk, bookend=bookend, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1573, in bisect_regex compiled = regex_compile_no_cache(formatted_regex, city=findspam.city_list, ignore_unused=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/helpers.py", line 491, in regex_compile_no_cache return regex_raw_compile(regex_text, flags, ignore_unused, kwargs, False) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/dist-packages/regex/regex.py", line 542, in _compile raise error(caught_exception.msg, caught_exception.pattern, File "/home/ubuntu/SmokeDetector/chatcommunicate.py", line 530, in __call__ result = self.__func__(*processed_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1691, in bisect results_bookended = bisect_regex_in_n_size_chunks(64, s, bookended_regexes, bookend=True, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1624, in bisect_regex_in_n_size_chunks results = bisect_regex(test_text, chunk, bookend=bookend, timeout=timeout) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1573, in bisect_regex compiled = regex_compile_no_cache(formatted_regex, city=findspam.city_list, ignore_unused=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/ubuntu/SmokeDetector/helpers.py", line 491, in regex_compile_no_cache return regex_raw_compile(regex_text, flags, ignore_unused, kwargs, False) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/dist-packages/regex/regex.py", line 542, in _compile raise error(caught_exception.msg, caught_exception.pattern,
What problem has occurred? What issues has it caused?
It broke. Details are above. As this was on a testing instance, it didn't impact the production instance.
On my testing instance, it broke the watched_keywords.txt
file, which would need manual intervention to fix. When I tested locally, it failed to push the broken watched_keywords.txt
file to GitHub, but I don't know if that's due to my testing setup or this bug. Various Git-related SD commands might be able to restore it to a functional state, although I did not test that, so I cannot confirm if that would be a remedy if this happened to a production SD instance.
What would you like to happen/not happen?
"like to happen"
- Reject a multiline chat watch with a helpful message. Something like Multiline chat watches are not supported. Consider using Git (or pinging someone that can) as an alternative. might work*
- Accept the multiline watch and correctly input it into the relevant file. This would also require deciding how to treat the newline (leave it as-is, remove the newline, remove everything after & including the first newline, etc.)
*I don't know what the preference is on the best way to request a manual Git-based change, so if that path is chosen, someone should probably confirm that wording.
"like to not happen"
It should not corrupt files.