subscraper
Output file contains unwanted JS filenames along with subdomains.
When I run python3 subscraper.py -u youtube.com -v
Subdomains Found:
[+] www.youtube.com
[+] youtubei.youtube.com
[+] payments.youtube.com
etc
When running with the -o flag, the output file also contains JS filenames: python3 subscraper.py -u youtube.com -o out.txt
Result:
www.youtube.com web-animations-next-lite.min youtubei.youtube.com payments.youtube.com tv.youtube.com music.youtube.com creatoracademy.youtube.com artists.youtube.com www.google-analytics.com www.gstatic.com detect.min.js ajax.googleapis.com index.min.js polyfills.js webcomponents-lite.js upload.youtube.com s.ytimg.com music_polymer_inlined_html.js gstatic.com m.youtube.com web-release-qa.youtube.com tv-release-qa.youtube.com web-green-qa.youtube.com tv-green-qa.youtube.com custom-elements-es5-adapter.js webcomponents-sd.js scheduler.js www-tampering.js www-prepopulator.js spf.js network.js www-i18n-constants.js consent.youtube.com consent-daily-0.sandbox.youtube.com consent-daily-1.sandbox.youtube.com consent-daily-2.sandbox.youtube.com consent-daily-3.sandbox.youtube.com consent-daily-4.sandbox.youtube.com consent-daily-5.sandbox.youtube.com consent-daily-6.sandbox.youtube.com consent-autopush.sandbox.youtube.com daily-0.consent.corp.youtube.com daily-1.consent.corp.youtube.com daily-2.consent.corp.youtube.com daily-3.consent.corp.youtube.com daily-4.consent.corp.youtube.com daily-5.consent.corp.youtube.com daily-6.consent.corp.youtube.com autopush.consent.corp.youtube.com dev.consent.corp.youtube.com desktop_polymer_inlined_html_polymer_flags.js
And I assume you don't get the same output in verbose mode?
That's interesting, because both come from the same variable, SUBDOMAINS_ENUMERATED,
which is used for both the verbose output and the text file output.
https://github.com/Cillian-Collins/subscraper/blob/f3ccbfafe645feede1422b9e493bd6b4cc041da5/subscraper.py#L162
I'll try to find the time to debug it soon.
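In the meantime, one possible workaround is to filter the list before it is written to the file. The sketch below is not taken from subscraper.py: filter_hostnames, looks_like_hostname and FILE_LIKE_SUFFIXES are hypothetical names, and the suffix set is only a guess based on the entries in the output above (note that web-animations-next-lite.min has no .js ending, so matching on ".js" alone would miss it).

```python
# Hypothetical workaround, not part of subscraper.py.
# Assumption: an entry whose last dot-separated label is one of these
# is a file name that leaked into the results, not a hostname.
FILE_LIKE_SUFFIXES = {"js", "min", "css", "json"}


def looks_like_hostname(entry):
    """Return True if the entry does not end in a file-like suffix."""
    labels = entry.strip().lower().rstrip(".").split(".")
    if len(labels) < 2:
        # A single label (no dot) cannot be a subdomain of the target.
        return False
    return labels[-1] not in FILE_LIKE_SUFFIXES


def filter_hostnames(entries):
    """Drop file-like entries and duplicates, preserving the original order."""
    seen = set()
    result = []
    for entry in entries:
        candidate = entry.strip().lower()
        if candidate and candidate not in seen and looks_like_hostname(candidate):
            seen.add(candidate)
            result.append(candidate)
    return result
```

If this turns out to be useful, it could be applied to SUBDOMAINS_ENUMERATED just before the results are written out, although that would only hide the symptom and would not explain why the two modes differ.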