
Regex meaningful characters not escaped in confusable_regex

Open · matthew-robertson opened this issue 5 years ago • 1 comment

When I create a confusable regex from a string like "t[e]st", the resulting regex fails to match the string "t[e]st". I believe this is because regex metacharacters such as [ and ] are not escaped when the characters are added to the regex.
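The behavior described above can be reproduced with the standard `re` module alone. This is a minimal sketch of the general escaping problem, not the library's own code; whether `confusable_regex` builds its pattern this way is the reporter's hypothesis, not something confirmed here.

```python
import re

text = "t[e]st"

# Unescaped: "[e]" is parsed as a character class containing "e",
# so this pattern is equivalent to the pattern "test".
unescaped = re.compile(text)

# Escaped: re.escape makes "[" and "]" literal characters.
escaped = re.compile(re.escape(text))

unescaped_matches_self = unescaped.fullmatch(text) is not None
unescaped_matches_test = unescaped.fullmatch("test") is not None
escaped_matches_self = escaped.fullmatch(text) is not None

print(unescaped_matches_self, unescaped_matches_test, escaped_matches_self)
```

The unescaped pattern matches "test" but not the original string, which is the symptom reported in this issue.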

matthew-robertson, Nov 11 '19 00:11

Old post, but replying for anyone stumbling across this. One possible approach is to add more padding characters (characters that may appear in between the searched characters), so that instead of having to search for t[e]st, you can just search for 'test'. Brackets are not included in the default padding/buffer characters, though. What I did to add more was:

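from confusables import confusable_regex
# Raw string so the backslash escapes reach the regex unchanged
bufferMatch, addBuffers = "*_~|`", r"*_~|`\[\]\(\)'"
expression = confusable_regex(whatever, include_character_padding=True).replace(bufferMatch, addBuffers)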

This takes the generated regex and replaces the original set of padding characters with my own set, properly escaped. Any other characters can be added the same way.

In your example, it would allow "t[e]st" to be matched with a search for "test".
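To illustrate the padding idea without depending on the library's internals, here is a rough stand-alone sketch using only `re`. The helper `padded_pattern` and its default padding set are hypothetical names chosen for this example; it does not handle confusable characters at all, it only shows how an optional padding class between letters lets a search for "test" match "t[e]st".

```python
import re

def padded_pattern(word, padding_chars="*_~|`[]()'"):
    """Build a regex that allows padding characters around every letter.

    Hypothetical helper for illustration only; not part of confusables.
    """
    # re.escape keeps "[", "]", "(" and ")" literal inside the class.
    padding = "[" + re.escape(padding_chars) + "]*"
    # Padding may appear before the word and after each character.
    return padding + "".join(re.escape(ch) + padding for ch in word)

pattern = re.compile(padded_pattern("test"))

print(pattern.fullmatch("t[e]st") is not None)
print(pattern.fullmatch("test") is not None)
print(pattern.fullmatch("t-e-st") is not None)
```

Because "-" is not in the padding set used here, "t-e-st" is not matched; extending `padding_chars` would change that.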

ThioJoe, Dec 27 '21 20:12