Invalid char in username for some sites

Open enodr opened this issue 3 years ago • 9 comments

I have checked issue #55 and issue #430, but it looks like invalid characters such as "." in usernames are handled only when the username is in a subdomain, not when it is in the URL path.

One example: https://wanelo.co/john is valid, but https://wanelo.co/john.doe actually serves the john account, so john.doe is reported as a false positive. The JSON data file should have an allowed and/or forbidden character list (both could be useful). I'd be glad to contribute and add the changes to wmn-data.json if this trivial change is approved.

enodr avatar Feb 06 '23 14:02 enodr

Hmm. Thanks for pointing this out. Since it is only an issue on some entries and not others, it would probably need to be a per-entry parameter, because globally escaping/replacing . in all usernames across all sites would most likely cause false negatives.

Thoughts?

WebBreacher avatar Feb 06 '23 14:02 WebBreacher

Yes, I was thinking about a per-site (per-entry) option. Options to discuss:

  • a good variable (key) name
  • regex or just plain chars

For the false positives I have seen so far, a simple per-site option like badchars: '.' would be enough.

Example:

       {
        "name" : "Wanelo",
        "uri_check" : "https://wanelo.co/{account}",
        "badchars" : ".",
        "post_body" : "",
        "e_code" : 200,
        "e_string" : "on Wanelo</title>",
        "m_string" : "Hmm, that's embarrassing",
        "m_code" : 404,
        "known" : ["lisandrareyes"],
        "cat" : "shopping",
        "valid" : true
       },

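and the code logic would be as simple as (Python, inside the per-site loop):

badchars = set(site.get("badchars", ""))  # optional key; empty set if absent
if any(c in account for c in badchars):
    continue  # skip sites whose URL cannot carry this username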
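
For illustration, the same check can be written as a small self-contained sketch. The helper names has_bad_chars and sites_to_check, and the second sample entry, are hypothetical; badchars is the key proposed above, not a field that is known to exist in wmn-data.json.

```python
def has_bad_chars(site: dict, account: str) -> bool:
    """Return True if the account contains a character the site forbids."""
    # "badchars" is the proposed optional key; a missing key forbids nothing.
    badchars = set(site.get("badchars", ""))
    return any(c in account for c in badchars)


def sites_to_check(sites: list, account: str) -> list:
    """Keep only the entries whose URL can carry this account name."""
    return [site for site in sites if not has_bad_chars(site, account)]


# Sample data: the Wanelo entry from this thread plus a made-up entry.
sites = [
    {"name": "Wanelo", "uri_check": "https://wanelo.co/{account}", "badchars": "."},
    {"name": "ExampleSite", "uri_check": "https://example.com/{account}"},
]

for site in sites_to_check(sites, "john.doe"):
    print(site["name"])
```

With this data, Wanelo is skipped for john.doe and still checked for john.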

enodr avatar Feb 06 '23 22:02 enodr

So, the . is really the only character that I've noticed causes us problems since usernames can be in the subdomains or as a parameter. I'm wondering if the parameter could just be a Boolean strip_bad_char with values of True or False. If true, then remove anything non-[a-zA-Z0-9].
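
A minimal sketch of what that boolean could mean in code, assuming the flag is read from the site entry and passed in by the caller. The function name normalize_account is hypothetical.

```python
import re

# Matches every character outside [a-zA-Z0-9].
NON_ALNUM = re.compile(r"[^a-zA-Z0-9]")


def normalize_account(account: str, strip_bad_char: bool) -> str:
    """Strip non-alphanumeric characters when the entry asks for it."""
    if strip_bad_char:
        return NON_ALNUM.sub("", account)
    return account
```

Note that under this sketch john.doe becomes johndoe, so the site is queried for a different name than the one typed; whether that or skipping the site is preferable is a design choice.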

Thoughts?

WebBreacher avatar Feb 12 '23 16:02 WebBreacher

Seems totally good!

enodr avatar Feb 16 '23 10:02 enodr

OK @enodr. I'll take this on to insert the strip_bad_char boolean

WebBreacher avatar Mar 05 '23 22:03 WebBreacher

@WebBreacher , we can make the field optional, such that if it is specified/provided the strip_bad_char method is called. That will save you the time of editing each entry in the wmn-data.json file.

yooper avatar Mar 05 '23 23:03 yooper

Agreed @yooper and, at the same time, we can do the same for the "post_body" : "", which is rarely used.
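
As a rough sketch of treating both keys as optional, a reader of the data file could fall back to defaults when an entry omits them. The function name entry_options and the default values are assumptions for illustration, not the project's actual loader.

```python
def entry_options(entry: dict) -> dict:
    """Read optional keys from a site entry, using defaults when absent."""
    return {
        # Assumed default: do not strip anything unless the entry opts in.
        "strip_bad_char": entry.get("strip_bad_char", False),
        # Assumed default: an empty body, matching the "" used in entries today.
        "post_body": entry.get("post_body", ""),
    }


# An entry that omits both keys still yields usable settings.
print(entry_options({"name": "Wanelo", "uri_check": "https://wanelo.co/{account}"}))
```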

WebBreacher avatar Mar 05 '23 23:03 WebBreacher

I will put in some time this week and make it happen.

yooper avatar Mar 05 '23 23:03 yooper

Hey thanks! I can mod the JSON if you wanna add the feature to the script.

WebBreacher avatar Mar 05 '23 23:03 WebBreacher

Going to close this as:

  1. We have the strip_bad_char schema addition
  2. The Python code @yooper was going to mod has been moved out of this project into its own project.

WebBreacher avatar Feb 09 '24 21:02 WebBreacher