Cloudflare DDOS protection

Open Tragen opened this issue 5 years ago • 7 comments

With Cloudflare DDOS protected sites, I only get this screenshot.

image

What can I do about that? Could you wait for one redirect or 10 seconds and reload the page?

Tragen avatar Mar 26 '20 10:03 Tragen

Hi there @Tragen, thank you for reporting this!

I wasn't aware of Cloudflare getting in the way of monolith, but it's only logical, since I know scrapers often bump into that page. Monolith doesn't act as a browser on its own; rather, it operates by pulling assets directly (the same way curl or wget does). I'm fairly certain Cloudflare uses JS to redirect from that page, so monolith is unlikely to be able to catch the redirect even if I make it wait 10 seconds...

Cloudflare could potentially be setting cookies after "Checking your browser". Those cookies could be extracted and supplied to monolith (once I implement support for cookies, which is one of my next goals after the upcoming 2.2.0 release is out). I don't think setting the user-agent would help. You could try the browser plugin SinglePage: it should save whatever you have in the DOM after the redirect, and do everything that monolith does (embed assets, resolve hrefs, etc).

Related: https://github.com/Anorov/cloudflare-scrape

snshn avatar Mar 27 '20 06:03 snshn

Thanks for your response. I've been using SinglePage for months. ;) I don't have a solution for how it could work either. Extracting cookies sounds like more work than it's worth, but I would give it a try.

Tragen avatar Mar 27 '20 10:03 Tragen

I'd try saving the page locally using your browser as .html + assets, and then using monolith on that -- the code in master works with file://// (e.g. monolith local.html -o local-monolithic.html). I'm currently preparing the aforementioned 2.2.0 release which includes that functionality.
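To run that local-file workaround over several saved pages at once, something like the following might do. It is only a sketch: the monolith invocation is the one quoted above, while `bundle_saved_pages`, the `DRY_RUN` switch, and the `-monolithic.html` naming scheme are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: bundle every browser-saved .html file in a directory with monolith.
# bundle_saved_pages and DRY_RUN are hypothetical names, not part of monolith.

bundle_saved_pages() {
    dir=$1
    for page in "$dir"/*.html; do
        # The glob stays unexpanded when the directory has no .html files.
        [ -e "$page" ] || continue
        # Skip files produced by an earlier run of this function.
        case $page in *-monolithic.html) continue ;; esac
        out="${page%.html}-monolithic.html"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            printf 'monolith %s -o %s\n' "$page" "$out"
        else
            monolith "$page" -o "$out"
        fi
    done
}
```

Calling `bundle_saved_pages ./saved` would then write a `-monolithic.html` file next to each saved page; setting `DRY_RUN=1` first only prints the commands.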

snshn avatar Mar 27 '20 12:03 snshn

Great to hear. That sounds like a combined solution with wget or curl to spider a page. SinglePage does that already. I will see if I can get it running automatically.

Can you also provide a Windows build?

Tragen avatar Mar 27 '20 14:03 Tragen

As my next task for the project, I will set up the CI/CD to add builds automatically for all 3 OSes upon creating a release, after I'm finished with the current CSS work for 2.2.1 in #140.

snshn avatar Mar 30 '20 07:03 snshn

@Tragen you should be able to grab the Windows binary from every release page. The latest as of now is this one: https://github.com/Y2Z/monolith/releases/tag/v2.2.3

snshn avatar Apr 13 '20 06:04 snshn

Hi @Tragen,

please give this a try:

chromium --headless --disable-gpu --dump-dom https://your-cloudflare-protected-url | monolith - -b https://your-cloudflare-protected-url -o test-cloudflare.html
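For repeated use, the pipeline above could be wrapped in a small shell function. This is a minimal sketch, assuming both tools are on the PATH: the chromium and monolith flags are exactly the ones in the command above, while `save_rendered_page`, `BROWSER`, and `DRY_RUN` are hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch: render a page in headless Chromium, then pipe the resulting DOM to monolith.
# save_rendered_page, BROWSER and DRY_RUN are hypothetical names, not part of either tool.

save_rendered_page() {
    url=$1
    out=$2
    browser=${BROWSER:-chromium}

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the pipeline instead of running it.
        printf '%s --headless --disable-gpu --dump-dom %s | monolith - -b %s -o %s\n' \
            "$browser" "$url" "$url" "$out"
        return 0
    fi

    # Fail early if either tool is missing.
    command -v "$browser" >/dev/null 2>&1 || { echo "missing: $browser" >&2; return 1; }
    command -v monolith >/dev/null 2>&1 || { echo "missing: monolith" >&2; return 1; }

    # The URL is passed twice: once to fetch the page, once as monolith's base URL (-b).
    "$browser" --headless --disable-gpu --dump-dom "$url" \
        | monolith - -b "$url" -o "$out"
}
```

Usage would look like `save_rendered_page https://your-cloudflare-protected-url test-cloudflare.html`. Whether the rendered DOM gets past Cloudflare's check still depends on the site.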

snshn avatar Feb 18 '22 07:02 snshn

Instructions for dealing with JavaScript (including CloudFlare's check) can be found in the latest README.md.

snshn avatar Nov 10 '22 16:11 snshn