regexp-tree

certain meta characters shouldn't be allowed in char ranges

Open andruo11 opened this issue 4 years ago • 4 comments

When I parse the regex /[\w-z]/, it should throw an error, but it is instead parsed as a regular character range with \w as the start of the range. https://astexplorer.net/#/gist/124dd2c7d464e3cf68b532bf8dacae7f/01a86f60e792112367c9395cf1094b5806dcaab1 If I figure out how to fix it, I'll let you know!

andruo11, Jan 22 '21 21:01

...and ditto for Unicode properties like /[\p{P}-z]/u

andruo11, Jan 22 '21 21:01

Thanks for the report. Yeah, /[\w-z]/ should actually be parsed as a char class containing \w, - and z, which is why /[\w-z]/.test('-') matches.
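A quick way to check the expected behavior against the native engine is sketched below. It assumes a spec-compliant JavaScript engine (for example a recent Node.js) and only probes matching behavior; it does not use regexp-tree itself, and the `probe` name is made up for illustration.

```javascript
// Without the `u` flag, the engine treats [\w-z] as three members:
// the class escape \w, a literal '-', and a literal 'z'.
const re = /[\w-z]/;

// `probe` is a hypothetical name used only for this illustration.
const probe = {
  hyphen: re.test('-'), // '-' is matched as a literal class member
  letterZ: re.test('z'), // 'z' is matched (it is also covered by \w)
  letterA: re.test('a'), // matched via \w
  bang: re.test('!'),    // not in \w, not '-', not 'z'
};

console.log(probe);
```

If the hyphen were the middle of a real range, the literal '-' would not be a member of the class, so the `hyphen` result is the interesting one here.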

DmitrySoshnikov, Jan 25 '21 22:01

Is your reference another parser in the AST explorer, or JS itself? And if it's JS, how do you know the internal structure? My parents think I should learn more about this topic, and if I'm going to help out, I should probably figure out how to make a good pull request.

andruo11, Jan 25 '21 23:01

It seems /[\p{P}-z]/u at least throws now, while /[\w-z]/ doesn't.
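For comparison, here is a minimal sketch of what the native engine accepts, assuming a spec-compliant engine with Unicode property escape support (for example a recent Node.js). The `compiles` helper is hypothetical and exists only for this illustration. It reflects native `RegExp` behavior, not what regexp-tree does.

```javascript
// Hypothetical helper: reports whether the engine accepts a pattern
// with the given flags instead of throwing a SyntaxError.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected, so rethrow it
  }
}

// Legacy (non-unicode) mode tolerates a class escape next to '-'.
const legacyW = compiles('[\\w-z]', '');
// With the `u` flag, a class escape as a range endpoint is an early error.
const unicodeW = compiles('[\\w-z]', 'u');
const unicodeP = compiles('[\\p{P}-z]', 'u');

console.log({ legacyW, unicodeW, unicodeP });
```

So a parser that follows the native engine would presumably throw for both patterns only when the `u` flag is present, and accept /[\w-z]/ without it.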

brettz9, Feb 25 '21 05:02