videocardz.com - unable to decode content correctly
TL;DR
Videocardz.com requires specific settings to bypass Cloudflare, but either because of those settings or because of the site's own content, articles aren't displayed correctly.
Setup
When trying to add the feed https://videocardz.com/rss-feed, I'm unable to do so without additional options. I believe this is because the feed/website is placed behind Cloudflare. To get this to work I need to both 'Disable HTTP/2 to avoid fingerprinting' and 'Override Default User Agent':
General
- Feed URL:
https://videocardz.com/rss-feed
Network Settings
- Override Default User Agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.5938.132 Safari/537.36
- [x] Fetch original content
- [x] Disable HTTP/2 to avoid fingerprinting
Rules
- Scraper Rules:
#videocardz-article
- Rewrite Rules:
remove("div.socialbar")
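To check what the server sends outside of Miniflux, the two network settings above can be approximated with a short script. This is only a sketch: `build_request` and `fetch_raw` are hypothetical helper names, and it relies on Python's `urllib` speaking HTTP/1.1 only, which stands in for the "Disable HTTP/2" option. Cloudflare may still block the request for other reasons.

```python
import urllib.request
from typing import Optional, Tuple

FEED_URL = "https://videocardz.com/rss-feed"
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/117.0.5938.132 Safari/537.36"
)


def build_request(url: str = FEED_URL, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Build a request that carries the overridden User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_raw(url: str = FEED_URL, timeout: float = 10.0) -> Tuple[bytes, Optional[str]]:
    """Return the undecoded body and the charset declared in Content-Type, if any."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read(), resp.headers.get_content_charset()
```

Comparing the declared charset with the raw bytes returned by `fetch_raw()` can help tell whether the wrong encoding is announced by the site or applied later during processing.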
Problem
However, the content returned is not correctly displayed:
- Miniflux article:
A non-breaking space becomes Â, ’ becomes â€™, and there are many more issues, seemingly related to UTF-8 decoding.
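These symptoms match what happens when UTF-8 bytes are decoded with a single-byte encoding. The sketch below reproduces the effect; the choice of Windows-1252 as the wrong encoding is an assumption, and `mojibake` and `repair` are hypothetical helper names, not part of Miniflux.

```python
def mojibake(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Reverse the damage, assuming every garbled character maps back to a byte."""
    return garbled.encode(wrong_encoding).decode("utf-8")


# Right single quotation mark (U+2019) turns into three characters.
print(repr(mojibake("\u2019")))  # 'â€™'
# Non-breaking space (U+00A0) turns into 'Â' followed by a non-breaking space.
print(repr(mojibake("\u00a0")))  # 'Â\xa0'
```

If `repair()` restores the expected characters on a sample of the stored article text, the content was most likely valid UTF-8 that got decoded with the wrong charset somewhere along the way.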
Note
When searching for options that could help with decoding the content, I applied the rewrite rule base64_decode and it caused this error:
Database error: store: unable to create entry "https://videocardz.com/newz/intel-confirms-5th-gen-npu-for-panther-lake" (feed # 27): pq: invalid byte sequence for encoding "UTF8": 0xf5 0x39 0x3c 0x2f.
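The byte sequence in this error can be checked on its own. The sketch below (with `is_valid_utf8` as a hypothetical helper) shows that 0xf5 can never appear in valid UTF-8, while the three bytes after it are plain ASCII. That pattern suggests `base64_decode` turned non-base64 content into raw binary, which is a separate problem from the display issue above.

```python
# The four bytes quoted in the PostgreSQL error message.
REPORTED = bytes([0xF5, 0x39, 0x3C, 0x2F])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(REPORTED))  # False: 0xF5 is not allowed anywhere in UTF-8
print(REPORTED[1:])             # b'9</': the remaining bytes are ASCII
```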
It looks like an encoding issue on their side to be honest.
I'm not able to bypass the Cloudflare bot protection for this website even with the workarounds mentioned above.
I did another test with Miniflux 2.2.11 and it's working fine for me.
Give it another try with the latest version of Miniflux.