Add option to extract domain from URL

Open uhlhosting opened this issue 4 years ago • 5 comments

This would be a very useful feature to have; it would be great if someone could add it to Boop. Amazing piece of software! Thank you.

uhlhosting, Aug 21 '20 15:08

Hey there!

Thanks for the suggestion, could you post an example of input and the expected output?

IvanMathy, Aug 23 '20 19:08

Sure, here it is:

http://drdarrenchua.com/palo-santo-smudging-bursera-graveolens-sticks-uses
http://mddsgj886.com/comment/html/?22258.html
http://tlyl520.com/comment/html/?26253.html
http://vipwww.vip/comment/html/?99879.html
http://web.ubc.co.kr/board/index.php?mid=goodgood_Market&document_srl=492921
http://www.garagenik.co.il/palo-santo-smudging-bursera-graveolens-sticks-uses-0
http://www.jmcyzxy.com/comment/html/?323421.html
https://advokatsovet.ru/search/Palo+Santo+Smudging+Bursera+Graveolens+Sticks+Uses
https://ikman.us/user/profile/31131

would become the following (it keeps only the FQDN without www. or http:// https://):

drdarrenchua.com
mddsgj886.com
tlyl520.com
vipwww.vip
web.ubc.co.kr
www.garagenik.co.in
www.jmcyzxy.com
advokatsovet.ru
ikman.us

As Google states:

The following domains are invalid: https://xxx.ssd.ss. A valid domain requires a host and must not include any path, port, query or fragment.
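For illustration, the transformation in the example above could be sketched in plain JavaScript roughly as follows. This is only a sketch, not an existing Boop script: `extractHostnames` is a hypothetical name, and it assumes a runtime that provides the WHATWG `URL` class (Node.js or a browser; whether Boop's script engine exposes it is not confirmed here). It returns the hostname as-is, so the scheme, path, port, query and fragment are dropped, but a leading `www.` is kept.

```javascript
// Hypothetical helper (not part of Boop): replaces each URL with its hostname.
// Assumes the WHATWG URL class is available in the runtime.
function extractHostnames(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // skip blank lines
    .map((line) => {
      try {
        // .hostname excludes scheme, port, path, query and fragment
        return new URL(line).hostname;
      } catch {
        // Not a parsable absolute URL: leave the line unchanged
        return line;
      }
    })
    .join("\n");
}
```

Calling `extractHostnames` on the list of URLs above would give one hostname per line; stripping `www.` would need an extra step.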

uhlhosting, Aug 23 '20 19:08

Some details here:

https://stackoverflow.com/questions/8498592/extract-hostname-name-from-string

uhlhosting, Aug 28 '20 00:08

> it keeps only the FQDN without www. or http:// https://

@uhlhosting Your examples and your specification are in conflict. You asked for a function that removes www, but your example output includes several domains (e.g. www.garagenik.co.in) that still include www. Are you looking for a function that returns the entire hostname (e.g. web.ubc.co.kr) or just the registered domain portion without any subdomains (i.e. ubc.co.kr)?

I see three potentially useful scripts here:

  1. Extract Hostname from URL (http://a.b.example.com. -> a.b.example.com.)
  2. Remove Subdomain (a.b.example.com. -> example.com.)
  3. Extract Domain Suffix (a.b.example.com. -> com.)
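A rough sketch of how these three operations might look in plain JavaScript is below. All names here (`extractHostname`, `findSuffix`, `removeSubdomain`, `KNOWN_SUFFIXES`) are made up for illustration and are not taken from Boop or from any PR. The suffix list is a tiny hand-picked subset; a real script would presumably need the full Public Suffix List, because taking "the last two labels" would turn web.ubc.co.kr into co.kr. For simplicity the sketch also drops a trailing root dot rather than preserving it, and it assumes the WHATWG `URL` class is available.

```javascript
// Illustrative subset only (assumption); a real script would need the
// full Public Suffix List to handle every multi-label suffix.
const KNOWN_SUFFIXES = ["co.kr", "co.il", "com", "ru", "us", "vip", "kr", "il"];

// 1. Extract Hostname from URL
function extractHostname(url) {
  return new URL(url).hostname;
}

// 3. Extract Domain Suffix: the longest known suffix that ends the hostname
function findSuffix(hostname) {
  const labels = hostname.replace(/\.$/, "").split(".");
  // Try the longest candidate first so "co.kr" wins over "kr"
  for (let i = 0; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (KNOWN_SUFFIXES.includes(candidate)) return candidate;
  }
  // Unknown suffix: fall back to the last label
  return labels[labels.length - 1];
}

// 2. Remove Subdomain: keep the suffix plus one label to its left
function removeSubdomain(hostname) {
  const labels = hostname.replace(/\.$/, "").split(".");
  const suffixLength = findSuffix(hostname).split(".").length;
  return labels.slice(-(suffixLength + 1)).join(".");
}
```

With this subset, `removeSubdomain("web.ubc.co.kr")` yields `ubc.co.kr` and `findSuffix("a.b.example.com.")` yields `com`.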

nhnicwaller, Sep 20 '20 17:09

Hi, I have created a PR for this issue: https://github.com/IvanMathy/Boop/pull/310. Please let me know your comments. Thanks :)

SusilRamarao, Oct 31 '21 10:10