[Feed Issue]: Unable to bypass Cloudflare bot protection
Feed URL
https://forum.audiogames.net/posts_feed/rss/
Website URL
https://forum.audiogames.net
Problem Description
Miniflux gets a 403 response when it tries to fetch this feed. Cloudflare is blocking it, even after changing the configuration settings.
Expected Behavior
After disabling the setting specified in https://github.com/miniflux/v2/issues/3140, the feed should (according to this issue) behave as expected.
Relevant Logs or Error Output
Last Parsing Error
Access to this website is forbidden. Perhaps, this website has a bot protection mechanism?
Additional Context
No response
Troubleshooting Steps
- [x] I have checked if the feed URL is correct and accessible in a web browser.
- [x] I have checked if the feed URL is correct and accessible with curl.
- [x] I have verified that the feed is valid using an RSS/Atom validator.
- [x] I have searched for existing issues to avoid duplicates.
This website seems to block everything that is not a web browser.
curl is also blocked:
curl -I https://forum.audiogames.net/posts_feed/rss/
HTTP/2 403
You can try the solution mentioned in this comment: https://github.com/miniflux/v2/issues/2266#issuecomment-2558436891
The proxy from that link would be faster, but unfortunately it doesn't work anymore: the curl-impersonate repo and commit it uses are outdated (new browser versions have arrived since) and have been abandoned since 2023.
So the "Just a moment..." interstitial still appears and the feed fails to fetch. There's a more recent fork, though.
Another option would be this other repo that forked curl-impersonate and is keeping the software up to date.
Considering all this, I'd like to suggest supporting FlareSolverr. It would be a slower solution, but it would work well with Docker and be able to click through and bypass the Cloudflare wall. Another suggestion would be to incorporate curl-impersonate from the GerHobbelt repo instead of "stock" curl.
I came across this article - How To Setup Miniflux and Flaresolverr.
Flask + FlareSolverr work on my end, i.e. the Cloudflare challenge is solved and I can get the feed content with curl, but I couldn't figure out the rewrite rule part in Miniflux.
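For anyone trying to reproduce the FlareSolverr half of this setup, here is a minimal sketch of the request such a proxy would send. It assumes FlareSolverr is reachable at `http://127.0.0.1:8191/v1` (adjust for your setup) and uses its `request.get` command; the helper names `build_payload`, `extract_body`, and `fetch_via_flaresolverr` are made up for illustration and are not taken from the article.

```python
import json
import urllib.request

# Assumption: FlareSolverr is listening locally on port 8191.
FLARESOLVERR_URL = "http://127.0.0.1:8191/v1"


def build_payload(target_url, max_timeout_ms=60000):
    """Build the JSON body for a FlareSolverr `request.get` command."""
    return {"cmd": "request.get", "url": target_url, "maxTimeout": max_timeout_ms}


def extract_body(reply):
    """Return the fetched page body, or raise if FlareSolverr reported a failure."""
    if reply.get("status") != "ok":
        raise RuntimeError(reply.get("message", "FlareSolverr request failed"))
    return reply["solution"]["response"]


def fetch_via_flaresolverr(target_url):
    """POST the command to FlareSolverr and return the body it obtained."""
    data = json.dumps(build_payload(target_url)).encode("utf-8")
    request = urllib.request.Request(
        FLARESOLVERR_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=90) as response:
        return extract_body(json.load(response))
```

Calling `fetch_via_flaresolverr("https://forum.audiogames.net/posts_feed/rss/")` needs a running FlareSolverr instance; the two smaller helpers can be checked without one.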
Using the rule mentioned in the article
rewrite("^https:\/\/domain\.tld(\/.*)?$"|"http://127.0.0.1:5000?url=https://domain.tld$1")
I'm still getting a 403 error with the "Access to this website is forbidden. Perhaps, this website has a bot protection mechanism?" message.
Maybe someone cleverer than me can get it working this way?
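To make the intent of that rule easier to check, here is a small sketch of the substitution it describes, written in Python rather than in Miniflux's own rule syntax. It only shows which URLs the pattern matches and what they are mapped to; it says nothing about where Miniflux actually applies the rule. `domain.tld` and `127.0.0.1:5000` are the placeholders from the rule above, and `rewrite` here is just a local helper name.

```python
import re

# Same pattern as the rule, without the JSON-style escaping of slashes.
PATTERN = re.compile(r"^https://domain\.tld(/.*)?$")
# Python uses \1 where the rule uses $1 for the captured path.
REPLACEMENT = r"http://127.0.0.1:5000?url=https://domain.tld\1"


def rewrite(url):
    """Route matching URLs through the local proxy; leave others untouched."""
    return PATTERN.sub(REPLACEMENT, url)


print(rewrite("https://domain.tld/posts_feed/rss/"))
# http://127.0.0.1:5000?url=https://domain.tld/posts_feed/rss/
print(rewrite("https://other.example/feed"))
# https://other.example/feed
```

If the proxy answers correctly for the first URL when requested with curl but Miniflux still reports 403, the substitution itself is probably not the problem.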
I also tried this curl-impersonate fork (the one used in Nixpkgs), but it didn't help. FlareSolverr seems to work better.
#3811