RSS access fails with URL containing query-args

Open ilkka-ollakka opened this issue 9 months ago • 2 comments

RSS feed reading fails for url https://helmet.finna.fi/Search/Results?filter%5B%5D=%7Eformat%3A%220%2FBook%2F%22&filter%5B%5D=%7Elanguage%3A%22eng%22&filter%5B%5D=%7Emajor_genre_str_mv%3A%22fiction%22&filter%5B%5D=first_indexed%3A%22%5BNOW-6MONTHS%2FDAY+TO+%2A%5D%22&lookfor=%28fantasiakirjallisuus+OR+tieteiskirjallisuus%29+AND+NOT+%22star+wars%22+AND+NOT+sarjakuvat+AND+NOT+lastenkirjallisuus&type=AllFields&view=rss

It seems that Cloudflare returns a 403 when rss-parrot accesses the feed. On the Python side, a similar problem was resolved by the HTTPS ALPN fix in aiohttp 3.11.11 (https://github.com/aio-libs/aiohttp/pull/10156). So maybe some dependency or TLS-related component needs to be bumped?

ilkka-ollakka, Apr 07 '25 09:04

I asked Finna colleagues to look at this.

The problem seems to be that RSS Parrot drops all the URL parameters from the request URL. So something like the above URL ends up being requested from Finna like this:

GET /Search/Results HTTP/1.1

This is obviously not going to work, and Finna will probably respond with a 403 in this case.
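To illustrate the difference, here is a minimal Python sketch of how a request target can lose its query string. This is not RSS Parrot's actual code; the helper names `path_only` and `request_target` are hypothetical, and the shortened URL is only for readability. It assumes the client builds the request line from a parsed URL and uses only the path component.

```python
from urllib.parse import urlsplit


def path_only(url: str) -> str:
    """Hypothetical buggy variant: keeps the path, silently drops the query."""
    return urlsplit(url).path or "/"


def request_target(url: str) -> str:
    """Hypothetical fixed variant: path plus the original (still encoded) query."""
    parts = urlsplit(url)
    path = parts.path or "/"
    # urlsplit does not decode the query, so percent-encoding is preserved.
    return f"{path}?{parts.query}" if parts.query else path


# Shortened stand-in for the Finna search URL from this issue.
url = "https://helmet.finna.fi/Search/Results?type=AllFields&view=rss"

print("GET", path_only(url), "HTTP/1.1")       # what the server appears to receive
print("GET", request_target(url), "HTTP/1.1")  # what it should receive
```

With the query dropped, the server has no `view=rss` parameter and therefore no way to know that a feed was requested.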

osma, Apr 09 '25 10:04

Updated the issue title to reflect the more likely root cause: the query arguments being dropped.

ilkka-ollakka, Apr 15 '25 07:04