Watching a multiline string causes problems

Open • CoconutMacaroon opened this issue 6 months ago • 1 comment

[!CAUTION] Do not attempt to reproduce this on the production SmokeDetector instance. It will very likely corrupt watched_keywords.txt and require manual intervention to fix.

Summary

Attempting to use !!/watch-force to watch a multiline chat message breaks the watched_keywords.txt file by improperly inserting a newline.

Background information

There was a desire to modify a watch in CHQ, but the watch was longer than the limit for a 'normal' chat message. This isn't inherently an issue if it gets modified without the use of chat.

While multiline messages can get around that restriction, that also introduces a newline character (\n). From the regex perspective, that can be addressed (either by adding {0} directly after the newline or by starting a regex comment before the newline and ending it afterwards, as noted by @makyen).
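
Both regex-side workarounds can be checked in isolation. The following is a minimal sketch, not SmokeDetector code: it uses Python's standard-library `re` module as a stand-in (an assumption, since SmokeDetector compiles patterns with the third-party `regex` package), and `foo`/`bar` are placeholder pattern fragments.

```python
import re

# A raw newline, as a multiline chat message would introduce it.
NEWLINE = "\n"

# Workaround 1: quantify the newline with {0}, so it must match zero times.
quantified = re.compile("foo" + NEWLINE + "{0}bar")

# Workaround 2: hide the newline inside an inline (?# ...) comment.
commented = re.compile("foo(?# split here" + NEWLINE + ")bar")

# No workaround: the newline is a literal that the text must contain.
plain = re.compile("foo" + NEWLINE + "bar")

for name, pattern in [("quantified", quantified),
                      ("commented", commented),
                      ("plain", plain)]:
    print(name, bool(pattern.fullmatch("foobar")))
```

With `re`, the first two patterns match `foobar` and the third does not. This only shows that the regex itself can tolerate the newline; it says nothing about how the pattern is stored on disk.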

However, the newline is also written verbatim into the watched_keywords.txt file, which splits the entry across two lines and breaks SmokeDetector's parsing of that file. This was wisely foreseen by Makyen, hence not testing on production.

Details

  • For reference, here is the string I tested with. The newline (created with Shift + Enter) has been replaced with \n for this GitHub issue. 000000[trimmed for brevity]000000(?# foo\nbar)
  • SD can receive multiline chat messages; at least in !!/bisect, it treats the line break as a \n. It just handles such messages problematically for watches (and presumably blacklists).
  • Attempting to watch a multiline string in chat, as I tried to here, splits the new entry across two lines. Specifically, it produced this as the end of watched_keywords.txt:
     65274  1723875289      cocomac 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000(?# foo
     65275  bar)
    
    And this console error:
    [06:19:34.736] watched_keywords.txt:65275:not enough values to unpack (expected 3, got 1)
      File "/home/ubuntu/SmokeDetector/blacklists.py", line 105, in parse
        when, by_whom, what = line.rstrip().split('\t')
        ^^^^^^^^^^^^^^^^^^^
    
    [06:19:40.636] Global blacklists loaded
    
  • Here is the relevant part of the console logs after attempting a bisect of some text that would match the regex.
    [06:21:42.072] Command received: !!/bisect 00000000000000000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    000000000000000000000000000000…
    [06:21:50.007] Attempted to send status ping but metasmoke_host is undefined. Not sent.
    [06:22:10.883] regex._regex_core.error: missing ) at position 975
    2024-08-17 06:22:10.878529 UTC
      File "/home/ubuntu/SmokeDetector/chatcommunicate.py", line 530, in __call__
        result = self.__func__(*processed_args)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1691, in bisect
        results_bookended = bisect_regex_in_n_size_chunks(64, s, bookended_regexes, bookend=True, timeout=timeout)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1624, in bisect_regex_in_n_size_chunks
        results = bisect_regex(test_text, chunk, bookend=bookend, timeout=timeout)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/ubuntu/SmokeDetector/chatcommands.py", line 1573, in bisect_regex
        compiled = regex_compile_no_cache(formatted_regex, city=findspam.city_list, ignore_unused=True)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/ubuntu/SmokeDetector/helpers.py", line 491, in regex_compile_no_cache
        return regex_raw_compile(regex_text, flags, ignore_unused, kwargs, False)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/local/lib/python3.12/dist-packages/regex/regex.py", line 542, in _compile
        raise error(caught_exception.msg, caught_exception.pattern,
    
    
    
    
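
Both failures above can be reproduced without a SmokeDetector instance. The sketch below is an illustration under stated assumptions: the tab-separated layout (timestamp, user, pattern) is inferred from the `split('\t')` line in the traceback, the pattern is the trimmed test string from this issue, `parse` is a hypothetical helper that only mirrors that one line, and the standard-library `re` module stands in for the `regex` package, so the exact regex error text will differ from the log.

```python
import re

# Hypothetical watch entry: timestamp, user, pattern, joined by tabs.
# The pattern contains a raw newline, as in the multiline chat message.
pattern = "000000(?# foo\nbar)"
entry = "\t".join(["1723875289", "cocomac", pattern])

def parse(line):
    # Same unpacking as the blacklists.py line quoted in the traceback.
    when, by_whom, what = line.rstrip().split('\t')
    return when, by_whom, what

results = []
# Reading the file line by line sees the single entry as two lines.
for line in entry.split("\n"):
    try:
        _, _, what = parse(line)
    except ValueError as exc:
        results.append(("unparseable", str(exc)))
        continue
    try:
        re.compile(what)
        results.append(("ok", what))
    except re.error as exc:
        results.append(("bad regex", str(exc)))

for status, detail in results:
    print(status, "->", detail)
```

The first fragment parses as an entry but ends in an unterminated `(?#` comment, so it fails to compile. The second fragment (`bar)`) has no tab-separated fields at all, which gives the "expected 3, got 1" unpacking error.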

What problem has occurred? What issues has it caused?

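The watch command wrote a raw newline into watched_keywords.txt, leaving one line that fails to parse and one truncated regex that fails to compile. Details are above. As this was on a testing instance, it didn't impact the production instance.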

On my testing instance, it broke the watched_keywords.txt file, which would need manual intervention to fix. When I tested locally, it failed to push the broken watched_keywords.txt file to GitHub, but I don't know if that's due to my testing setup or this bug. Various Git-related SD commands might be able to restore it to a functional state, although I did not test that, so I cannot confirm if that would be a remedy if this happened to a production SD instance.

What would you like to happen/not happen?

"like to happen"

  • Reject a multiline chat watch with a helpful message. Something like "Multiline chat watches are not supported. Consider using Git (or pinging someone that can) as an alternative." might work*
  • Accept the multiline watch and correctly input it into the relevant file. This would also require deciding how to treat the newline (leave it as-is, remove the newline, remove everything after & including the first newline, etc.)

*I don't know what the preference is on the best way to request a manual Git-based change, so if that path is chosen, someone should probably confirm that wording.
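
Either option could look roughly like the sketch below. Everything here is hypothetical: `check_watch_pattern` and `escape_line_breaks` are illustrative names, not existing SmokeDetector functions, and the rejection text is just the wording suggested above.

```python
REJECT_MESSAGE = (
    "Multiline chat watches are not supported. "
    "Consider using Git (or pinging someone that can) as an alternative."
)

def check_watch_pattern(pattern):
    """Option 1: return an error message for a multiline pattern, else None."""
    if "\n" in pattern or "\r" in pattern:
        return REJECT_MESSAGE
    return None

def escape_line_breaks(pattern):
    """Option 2: keep the watch, storing raw line breaks as regex escapes.

    Assumption: the pattern is not compiled in verbose mode and no line
    break is directly preceded by a backslash; those cases would need
    separate handling.
    """
    return pattern.replace("\r", "\\r").replace("\n", "\\n")

# Usage with the trimmed test string from this issue.
candidate = "000000(?# foo\nbar)"
print(check_watch_pattern(candidate))
print(escape_line_breaks(candidate))
```

Option 1 is the smaller change, because nothing multiline ever reaches the file. Option 2 keeps the entry on a single line, but it still requires the decision noted above about what the newline should mean.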

"like to not happen"

It should not corrupt files.

CoconutMacaroon • Aug 17 '24 06:08