pyWhat
pyWhat copied to clipboard
Add support for list of regex in regex.json
Is your feature request related to a problem? Please describe.
Many regexes (like phone numbers or coordinates) have several formats. Currently, |
can be used to separate formats, but it makes regex longer and harder to understand.
Describe the solution you'd like "Regex" should be a list of regexes.
Hi, I would like to know about this issue in detail. Do you want multiple regex patterns separated by vertical pipe ( | ), or do you want any change in the code to support a list of regex patterns?
I can solve this issue by creating following blocks with different regex pattern for different type of phone Number Patterns.
"Name": "Phone Number",
"Regex": "^(\\s*(?:\\+?(\\d{1,3}))?[-. (]*(\\d{3})[-. )]*(\\d{3})[-. ]*(\\d{4})(?: *x(\\d+))?\\s*)$",
"plural_name": false,
"Description": null,
"Rarity": 0.5,
"URL": null,
"Tags": [
"Identifiers",
"Credentials",
"Phone Number",
"Phone"
],
"Children": {
"path": "phone_codes.json",
"entry": "Location(s): ",
"method": "hashmap"
},
"Examples": {
"Valid": [
"202-555-0178",
"+1-202-555-0156",
"+662025550156",
"+356 202 555 0156"
],
"Invalid": []
}
}```
I would like to have a list of regexp, so yes, changes in code are needed. Pywhat should concatenate all these regexes into one using pipes(|). Therefore, only regex loader function should be updated.
Hey, I am able to combine regex patterns dynamically using |
. What is your suggestion for combining other parameters of the block, such as Description, Rarity, URL, Tags, Children, and Examples.
Hi! I will take our THM flag regex to better show what we want to achieve. (The only part changed is Regex
)
We want to separate multiple regex formats of one identifier into a list of regexes, instead of having all of them together and separated by |
. Some of regexes are quite long and seeing where is what is getting harder, so our goal is to improve their readability.
I am not sure how PyWhat should use them after tho. Merged them back into one with |
or search with each regex separately. Probably the latter.
I hope this clears some things. Maybe @bee-san has some more suggestions. :)
{
"Name": "TryHackMe Flag Format",
"Regex": ["(?i)^thm{.*}$", "(?i)^tryhackme{.*}$"],
"plural_name": false,
"Description": "Used for Capture The Flags at https://tryhackme.com",
"Rarity": 1,
"URL": null,
"Tags": [
"CTF Flag"
],
"Examples": {
"Valid": [
"thm{hello}"
],
"Invalid": []
}
}