phishing_catcher
Check domains with special characters
Hello, your repo can help a lot of people, but you should also check for domains that contain special characters such as ỵ and ṙ. Let me know about updates!
It definitely could
The function score_domain can already handle such domains, since every string in Python 3 is Unicode.
Lookalike characters should definitely be scored as their normal counterparts when looking for suspicious wording. That just requires translating punycode domains back to Unicode and determining which characters look like other ones.
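As a rough sketch of the punycode step, something like the following could work. The helper name `punycode_to_unicode` is hypothetical and not part of the repo; it only relies on Python's built-in `punycode` codec and decodes each `xn--` label separately.

```python
def punycode_to_unicode(domain: str) -> str:
    """Decode every punycode ("xn--") label of a domain back to Unicode.

    Hypothetical helper: labels that are not punycode, or that fail to
    decode, are returned unchanged.
    """
    labels = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            try:
                # Strip the "xn--" prefix and decode the rest with the
                # standard library's punycode codec.
                label = label[4:].encode("ascii").decode("punycode")
            except UnicodeError:
                # Malformed label: keep the original text.
                pass
        labels.append(label)
    return ".".join(labels)


print(punycode_to_unicode("xn--mnchen-3ya.de"))  # münchen.de
```

The decoded string could then be passed on for scoring in place of the raw `xn--` form.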
The Unicode Consortium provides a list of confusables, and all that's needed from it are the characters that are confusable with [a-zA-Z0-9\-].
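A minimal sketch of that filtering, assuming the `source ; target ; type # comment` layout of the Unicode confusables data file. The sample lines are illustrative only and not copied from the real file, and the names `load_confusables` and `normalize_domain` are hypothetical. Dropping combining marks after decomposition is an extra assumption made here so that characters like ỵ and ṙ fold to y and r.

```python
import re
import unicodedata

# Only keep mappings whose target is made of these characters.
ASCII_TARGET = re.compile(r"[a-zA-Z0-9\-]+")

# Illustrative lines in the confusables layout (NOT copied from the real file).
SAMPLE = """\
0430 ; 0061 ; MA # CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
0435 ; 0065 ; MA # CYRILLIC SMALL LETTER IE -> LATIN SMALL LETTER E
03BF ; 006F ; MA # GREEK SMALL LETTER OMICRON -> LATIN SMALL LETTER O
2044 ; 002F ; MA # FRACTION SLASH -> SOLIDUS (target outside the set)
"""


def load_confusables(text: str) -> dict:
    """Parse confusables-style lines into {lookalike: ascii_target}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        # Keep single-character sources that map into [a-zA-Z0-9\-].
        if len(source) == 1 and ASCII_TARGET.fullmatch(target):
            table[source] = target
    return table


def normalize_domain(domain: str, table: dict) -> str:
    """Replace lookalike characters with their plain counterparts."""
    out = []
    for ch in unicodedata.normalize("NFD", domain):
        if unicodedata.combining(ch):
            continue  # assumption: ignore accents and dots for scoring
        out.append(table.get(ch, ch))
    return "".join(out)


table = load_confusables(SAMPLE)
print(normalize_domain("p\u0430ypal.com", table))  # paypal.com
```

With the real data file, the same loader could be fed the file's contents, and the normalized string scored like any other domain.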
Could we close this, @x0rz?
There are still some issues with the current PR. I will close this as soon as it detects confusables.