rss-bridge

JustWatch Bridge misses some titles

Open passthesh3ll opened this issue 1 year ago • 9 comments

Describe the bug Even some hours after a title is posted on the JustWatch website, it is still missing from the JustWatch bridge RSS feed.

To Reproduce Steps to reproduce the behavior:

  1. Go to JustWatch Bridge
  2. Click on Country:"Italy" and Type:"All"
  3. Go to JustWatch Website
  4. Find a missing title between the latest titles

Expected behavior All titles on the JustWatch website should appear in the JustWatch RSS feed.

Screenshots image

Desktop (please complete the following information):

  • OS: Archlinux
  • Browser: Brave
  • Version: 1.62.156 Chromium: 121.0.6167.139 (Official Build) (64-bit)

Additional context n/a

passthesh3ll avatar Feb 04 '24 09:02 passthesh3ll

@Bockiii

dvikan avatar Feb 04 '24 14:02 dvikan

I just checked and there seem to be some size or amount limitations. The problem is that I didn't add a filter for all the providers, because I didn't bother to add another 70 selection options to the filter. I also think the list fluctuates a lot and could produce unwanted behavior if you select a country/provider combination that doesn't exist (because BBC isn't available in guana or something like that).

This leads to the problem that even the "Today" field contains hundreds and hundreds of entries.

The second problem seems to be the "scroll to right" function on providers. If Amazon releases 140 entries in a day, I only get 10, because we would have to scroll for the others to load. @dvikan do you know of a way to deal with this in PHP?

At this point, I don't really know how to fix this because of the limitations above.

Bockiii avatar Feb 04 '24 15:02 Bockiii

image

Example for the side scrolling problem. Page says 18 new titles, only displays 10 and thus, only 10 are in source.

Bockiii avatar Feb 04 '24 15:02 Bockiii

Maybe a filter with just the handful of major providers would be useful: Netflix, Prime, Disney, Apple, Paramount, Crunchyroll. Those are what ~99% of users need.

passthesh3ll avatar Feb 04 '24 16:02 passthesh3ll

An alternative would be a free text field with an explanation of what to put in there. You can define the providers and then copy-paste them from the URL.

The full alternative would be to change the whole bridge to pick up a provided link and scrape from there. That would still not help people who just paste the link to the full "new" page, though.
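A minimal sketch of the "paste the link" idea: split a pasted link into a country code and a list of provider codes. The URL shape (`/<country>/new?providers=a,b`), the function name `parse_justwatch_link`, and the provider codes `nfx` and `dnp` are assumptions for illustration, not something confirmed in this thread. It is written in Python only to keep it short; the same logic would carry over to the bridge's PHP.

```python
from urllib.parse import urlparse, parse_qs


def parse_justwatch_link(url):
    """Split a pasted link into (country, providers).

    Assumes the hypothetical shape /<country>/new?providers=a,b,c.
    """
    parts = urlparse(url)
    # First non-empty path segment is treated as the country code.
    segments = [s for s in parts.path.split("/") if s]
    country = segments[0] if segments else None
    # parse_qs returns a list per key; a missing key means no provider filter.
    raw = parse_qs(parts.query).get("providers", [""])[0]
    providers = [p for p in raw.split(",") if p]
    return country, providers


country, providers = parse_justwatch_link(
    "https://www.justwatch.com/it/new?providers=nfx,dnp"
)
print(country, providers)
```

An empty provider list would correspond to the unfiltered "new" page, which is the case that still returns hundreds of entries.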

Not sure how to proceed.

Bockiii avatar Feb 04 '24 17:02 Bockiii

I found that JustWatch has a GraphQL API, but it's not really documented. I found these sources for some information, if someone wants to take a swing:

https://www.reddit.com/r/webscraping/comments/wacb8a/how_to_scrape_graphql_endpoint_with_requests/
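If an API route did expose the titles in pages (an assumption, since the schema is undocumented), the "only 10 of 18" problem becomes a plain pagination loop. The sketch below is generic: `fetch_page` is a hypothetical callable standing in for whatever request the real API needs, and `fake_fetch_page` only simulates 18 titles served 10 at a time.

```python
def fetch_all(fetch_page, max_pages=50):
    """Collect items from a cursor-paged source.

    fetch_page(cursor) is a hypothetical callable returning
    (items, next_cursor); next_cursor is None on the last page.
    max_pages is a safety cap against endless paging.
    """
    items, cursor = [], None
    for _ in range(max_pages):
        page, cursor = fetch_page(cursor)
        items.extend(page)
        if cursor is None:
            break
    return items


# Stand-in for a real request: 18 titles served 10 at a time.
TITLES = [f"title-{i}" for i in range(18)]


def fake_fetch_page(cursor):
    start = cursor or 0
    end = start + 10
    next_cursor = end if end < len(TITLES) else None
    return TITLES[start:end], next_cursor


print(len(fetch_all(fake_fetch_page)))
```

The cap on pages matters here because a single provider can add hundreds of titles in a day.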

Bockiii avatar Feb 05 '24 15:02 Bockiii

I've been looking at this again and getting "all" just doesn't make any sense. Just today there were these additions:

  • "Amazon Vault History Channel": 341
  • "History Vault Apple TV": 84
  • "Acorn TV Apple TV": 95
  • "Microsoft Store": 99
  • "Eventive": 170

Those 5 alone mean 789 rss feed entries, just for today.
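A quick check of that total, using the five counts quoted above (the variable names are just for illustration):

```python
# Per-provider additions for one day, as quoted in this comment.
additions = {
    "Amazon Vault History Channel": 341,
    "History Vault Apple TV": 84,
    "Acorn TV Apple TV": 95,
    "Microsoft Store": 99,
    "Eventive": 170,
}

total = sum(additions.values())
print(total)  # 789
```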

How about this: I'll add the top 10/top 15 of providers and if someone actually requests "Eventive", we can check this again?

Bockiii avatar Feb 07 '24 21:02 Bockiii

Ah forget it, even that is annoying af. Paramount+ is available as "Paramount+", "Amazon Channels Paramount+", "Apple TV Channels Paramount+" and so on. And that's just for the US page, so for Italy there could be either none of those or a fourth "Channel" type etc.
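To illustrate why a fixed list is awkward, here is a rough sketch that groups provider names by a base name using a naive substring match. The helper `group_by_base` is hypothetical, and the matching rule is an assumption, not how JustWatch itself relates these names.

```python
def group_by_base(provider_names, base_names):
    """Group provider names under every base name they contain.

    Naive, case-insensitive substring match; a name can land in
    more than one group.
    """
    groups = {base: [] for base in base_names}
    for name in provider_names:
        for base in base_names:
            if base.lower() in name.lower():
                groups[base].append(name)
    return groups


names = [
    "Paramount+",
    "Amazon Channels Paramount+",
    "Apple TV Channels Paramount+",
]
print(group_by_base(names, ["Paramount+"]))
```

The match is deliberately simple and shows the catch: a base name such as "Apple TV" would also pick up "Apple TV Channels Paramount+", so any real mapping would need per-country data rather than string matching.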

This page just doesn't really fit the "select from a list" type of bridge. Maybe a "paste the link you want" would be better or maybe even removing it completely and replacing it by an xpath bridge how-to or so...

Bockiii avatar Feb 07 '24 21:02 Bockiii

The solution here is unclear to me.

dvikan avatar Mar 31 '24 02:03 dvikan