confusables
Fix issues with long strings
confusables.normalize() has issues with long strings, especially long runs of digits. It creates large lists that consume a great deal of memory (froze my PC lol). These changes make it more tolerant of long strings by ignoring alphanumeric ASCII characters. I've also done some minor code cleanup.
The test cases I noticed causing this issue were Discord links, e.g.:
https://discord.com/channels/00000000000000000/00000000000000000/00000000000000000
https://cdn.discordapp.com/attachments/00000000000000000/00000000000000000/video0.mp4
The code in #36, or a version of it, might also assist with this issue.
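To illustrate why long digit runs are a problem, here is a rough toy model, not the library's actual code. It assumes normalization enumerates every combination of per-character candidates; the TOY_CANDIDATES table and the expand() and count_variants() helpers are hypothetical names made up for this sketch.

```python
import string
from itertools import product

# Hypothetical candidate table for illustration only (NOT the library's data).
TOY_CANDIDATES = {
    "0": ["0", "o"],
    "1": ["1", "l"],
    "5": ["5", "s"],
}

ASCII_ALNUM = frozenset(string.ascii_letters + string.digits)


def _pools(text, skip_ascii_alnum):
    """Build the list of candidate choices for each character."""
    pools = []
    for ch in text:
        if skip_ascii_alnum and ch in ASCII_ALNUM:
            pools.append([ch])  # keep the character as-is: one choice
        else:
            pools.append(TOY_CANDIDATES.get(ch, [ch]))
    return pools


def expand(text, skip_ascii_alnum=False):
    """Materialize every combination (only safe for short inputs)."""
    return ["".join(c) for c in product(*_pools(text, skip_ascii_alnum))]


def count_variants(text, skip_ascii_alnum=False):
    """Count the combinations without building the list."""
    total = 1
    for pool in _pools(text, skip_ascii_alnum):
        total *= len(pool)
    return total


url = "https://discord.com/channels/00000000000000000/00000000000000000/00000000000000000"

print(expand("10"))                                # ['10', '1o', 'l0', 'lo']
print(count_variants(url))                         # doubles with every '0' in the URL
print(count_variants(url, skip_ascii_alnum=True))  # 1
```

In this model the list size doubles with each ambiguous digit, so a URL with dozens of zeros cannot be expanded in memory, while skipping ASCII alphanumerics leaves a single result.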