CommonRegexRuby

Ophilon issue4

Open ophilon opened this issue 4 years ago • 2 comments

Hi Talysson and team. Sorry to bother you with this issue; hopefully the PR named in the subject is able to fix it.

ophilon avatar Nov 29 '21 10:11 ophilon

Hello, @ophilon. What are those listed options? Are those all the TLDs that exist? If so, will we need a new PR every time that list is updated? This library is not intended to be a validator but a way to extract data from strings, so I don't think it's safe to implement the domain regex so strictly.
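To make the trade-off concrete, here is a minimal Ruby sketch contrasting a permissive domain pattern with one tied to a fixed TLD list. Both patterns, the `KNOWN_TLDS` list, and the sample text are hypothetical and are not taken from the library or the PR; the four-entry list only stands in for a full TLD list.

```ruby
# Hypothetical patterns for illustration only; neither comes from the library.

# Permissive: any dot-separated labels ending in two or more letters.
PERMISSIVE_DOMAIN = /\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b/i

# Strict: the last label must appear in a fixed allow-list.
# This tiny list is a stand-in for a complete TLD list.
KNOWN_TLDS = %w[com org net by].freeze
STRICT_DOMAIN = /\b(?:[a-z0-9-]+\.)+(?:#{Regexp.union(KNOWN_TLDS).source})\b/i

text = "See example.com, docs.example.org and shop.example.newtld for details."

permissive_hits = text.scan(PERMISSIVE_DOMAIN)
strict_hits     = text.scan(STRICT_DOMAIN)

puts "permissive: #{permissive_hits.inspect}"
puts "strict:     #{strict_hits.inspect}"
```

In this sketch the strict pattern skips `shop.example.newtld` until the list is updated, while the permissive one extracts it but would also accept strings that are not real domains.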

talyssonoc avatar Nov 29 '21 19:11 talyssonoc

Thanks for the comment, Talysson. My name is Oleg, I'm from Belarus.

  1. I think IANA.ORG made this big update to the root zones for the first time since the initial ~120 domains, and it has been the standard for a long time. It seems you created this gem after the Python one, the commonregex Python lib. You can check: the Python lib already uses more root zone names than the original TLDs and two-letter country codes. I hope that this list of ~1400 will serve for many years. It's limited to Latin-only names; see my comment in links_regex.
  2. Although rather long and complete, the TLD list in the PR works efficiently; all my checks run in milliseconds:
[ophil@philon-op test]$ ruby test_commonregex.rb 
Run options: --seed 48136
# Running:
.............
Finished in 0.002584s, 5031.2499 runs/s, 5031.2499 assertions/s.
13 runs, 13 assertions, 0 failures, 0 errors, 0 skips
  3. I also changed the expression to multi-line free-spacing mode; to me it looks and reads much better than the old one. Please note my comments: I explained every part of the regex for clarity and easier support in the future.
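For readers unfamiliar with free-spacing mode, the following is a small sketch of the idea using Ruby's `x` regex flag, which ignores whitespace in the pattern and allows `#` comments. The pattern is a simplified, hypothetical link regex, not the one from the PR, and its three TLDs are only a sample.

```ruby
# Hypothetical, simplified link pattern; not the regex from the PR.

# Compact form: everything on one line.
COMPACT_LINK = %r{\b(?:https?://)?(?:[a-z0-9-]+\.)+(?:com|org|by)\b}i

# Free-spacing form: the x flag ignores whitespace and allows comments.
FREE_SPACING_LINK = %r{
  \b
  (?:https?://)?       # optional scheme
  (?:[a-z0-9-]+\.)+    # one or more dot-terminated labels
  (?:com|org|by)       # top-level domain from a short sample list
  \b
}xi

text = "Visit https://example.org or mail.example.by today."

compact_hits      = text.scan(COMPACT_LINK)
free_spacing_hits = text.scan(FREE_SPACING_LINK)

puts free_spacing_hits.inspect
```

Both forms are meant to match the same strings; the free-spacing one only changes how the pattern is laid out and documented.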

ophilon avatar Nov 29 '21 19:11 ophilon