
Add domain_extractor - URL parsing gem

Open jordanhudgens opened this issue 1 month ago • 2 comments


Project

  • GitHub: https://github.com/opensite-ai/domain_extractor
  • RubyGems: https://rubygems.org/gems/domain_extractor
  • Documentation: https://rubydoc.info/gems/domain_extractor
  • Dev.to introduction: https://dev.to/opensite/introducing-domainextractor-a-high-performance-ruby-gem-for-url-parsing-and-domain-extraction-3f4i
  • Project page: https://opensite.ai/developers

What is this Ruby project?

domain_extractor is a lightweight, production-ready Ruby gem for parsing URLs and extracting domain components—including support for multi-part TLDs (like .co.uk, .com.au, .gov.br). It delivers reliable domain, subdomain, root domain, and TLD extraction even from complex or deeply nested URLs. Built on top of Ruby's standard URI library and the public_suffix gem, it also provides features like:

  • Query parameter parsing
  • Smart URL normalization (handles missing schemes, ports, authentication, etc.)
  • Precise edge-case handling (returns nil for invalid URLs or IP addresses)
  • Thread safety and strong performance for analytics, scraping, and SEO workflows
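To make the normalization and query-parsing features above concrete, here is a minimal stdlib-only sketch. It is not domain_extractor's implementation or API: `sketch_parse` and `ip_address?` are hypothetical names, and the assumed behavior (prepend `https://` when the scheme is missing, return nil for IP addresses and unparseable input) is only an approximation of what the description claims.

```ruby
require "uri"
require "ipaddr"

# Hypothetical helper (not part of domain_extractor): true when the host
# is an IPv4 or IPv6 literal rather than a domain name.
def ip_address?(host)
  IPAddr.new(host.delete("[]")) # URI keeps brackets around IPv6 hosts
  true
rescue IPAddr::Error
  false
end

# Hypothetical helper (not part of domain_extractor): normalize a raw
# string, then return its host, path, and query parameters, or nil.
def sketch_parse(raw)
  text = raw.to_s.strip
  return nil if text.empty?

  # Assumption: inputs without a scheme are treated as https URLs.
  text = "https://#{text}" unless text.match?(%r{\A[a-z][a-z0-9+.-]*://}i)

  uri = URI.parse(text)
  host = uri.host
  return nil if host.nil? || host.empty?
  return nil if ip_address?(host)

  params = uri.query ? URI.decode_www_form(uri.query).to_h : {}
  { host: host.downcase, path: uri.path, query_params: params }
rescue URI::InvalidURIError
  nil
end

result = sketch_parse("example.com/path?utm_source=x&page=2")
puts result[:host]                        # prints "example.com"
puts result[:query_params].inspect
puts sketch_parse("192.168.1.1").inspect  # prints "nil"
```

Ports and credentials in the input (for example `user:pass@host:8080`) are absorbed by `URI.parse`, so only the bare host reaches the result.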

What are the main differences between this Ruby project and similar ones?

  • Accurate multi-part TLD support: Uses the Public Suffix List for precise extraction; handles .co.uk, .com.au, .gov.br, and all recognized multi-part suffixes
  • Deep nested subdomain handling: Correctly extracts subdomains of any depth with no configuration required
  • Zero config, instant use: No setup or special methods needed; a single parse(url) handles almost any URL in the wild
  • Query parameter extraction: Built-in parsing to structured hashes for tracking/analytics uses
  • Batch mode: High-throughput batch parsing for handling large scraped datasets or logs
  • Well-documented and maintained: Active open-source project with docs, benchmark data, and real production use at OpenSite AI
  • Compared to:
    • psl-domain-extractor: More user-friendly API, better documentation, broader TLD handling, and more active development
    • root_domain: More features (subdomain extraction, query parsing), better performance, and documentation
    • URI (stdlib) and addressable: These libraries parse a URL into a host but don't interpret multi-part TLDs or separate the host into subdomain, root domain, and TLD
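The multi-part TLD point can be illustrated with a small sketch. `URI` returns the host as one string, so splitting it correctly needs a suffix list. The code below is an illustration only, not how domain_extractor or public_suffix work internally: `SKETCH_SUFFIXES` is a tiny hand-written stand-in for the Public Suffix List, and `sketch_split` is a hypothetical name.

```ruby
require "uri"
require "set"

# Illustrative subset only; the real Public Suffix List has thousands of rules.
SKETCH_SUFFIXES = Set.new(%w[com org uk co.uk au com.au br gov.br]).freeze

# Hypothetical helper: split a host by the longest suffix found in
# SKETCH_SUFFIXES, leaving at least one label for the domain itself.
def sketch_split(host)
  labels = host.downcase.split(".")
  # i is the index where the candidate suffix starts; smaller i = longer suffix.
  (1...labels.size).each do |i|
    suffix = labels[i..-1].join(".")
    next unless SKETCH_SUFFIXES.include?(suffix)

    domain = labels[i - 1]
    return {
      subdomain: i > 1 ? labels[0...(i - 1)].join(".") : nil,
      domain: domain,
      tld: suffix,
      root_domain: "#{domain}.#{suffix}"
    }
  end
  nil # no known suffix, e.g. "localhost"
end

host = URI.parse("https://shop.blog.example.co.uk/cart").host
puts host                      # prints "shop.blog.example.co.uk"
parts = sketch_split(host)
puts parts[:subdomain]         # prints "shop.blog"
puts parts[:root_domain]       # prints "example.co.uk"
puts parts[:tld]               # prints "co.uk"
```

Splitting on the last dot alone would report `uk` as the TLD and `co` as the domain here, which is the failure mode the suffix lookup avoids.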

domain_extractor is ideal for anyone needing robust domain handling in Ruby apps—particularly use cases involving international URLs, analytics pipelines, and modern SEO.



jordanhudgens avatar Nov 09 '25 00:11 jordanhudgens

@jordanhudgens thanks for sharing! Unfortunately, this project doesn't meet our quality standards yet (only 2k downloads, 30k required).

markets avatar Dec 03 '25 09:12 markets

I missed that minimum requirement, sorry about that. I'll make a note to wait until then, thanks!

jordanhudgens avatar Dec 03 '25 18:12 jordanhudgens