plugin.video.iplayerwww icon indicating copy to clipboard operation
plugin.video.iplayerwww copied to clipboard

BDCiD - Added

Open dimkroon opened this issue 1 year ago • 8 comments

Re-implementation of ListFavourites() with the new feature of a context menu to remove a programme from 'Added'.

Currently ListFavourites() fails without error, resulting in 'Added' being an empty list. It looks like 'Added' suffered from the same change 'Watching' had recently.

Although it look like you can 'add' an episode on iplayer website, Added' is in fact lists of programmes. The addon presents all items as programme folder, unless the programme has only one episode, like a film. Which is pretty close to, if not the same as it was.

New is the context menu to remove the item from the 'Added' list.

To be done: To get a proper implementation of this 'Added' feature, user's should have the possibility to add items to 'Added'. This would involve adding a context menu entry 'add programme' to all items in all listings that should support such an operation. However, if a programme is already on the 'Added' list, the context menu should an entry 'Remove' instead. All in line with current website behaviour and general user expectations.

For this to work we need:

  1. The list of programme ID's that are currently on the 'Added' list.
  2. Find a way to obtain programme ID's from all programmes in normal listings, and optionally for all series and episodes.
  3. Create a context menu entry to add/remove from 'Added' to all items that meet the requirements.

1) Obtaining and caching added programme ID's. In order to be able to know which context menu entry is to be created, the addon must know which programmes are already added before creating listings. Currently the addon is stateless. It is, of course, not feasible to request the added list each time before a page is requested, so some kind of caching should be implemented and I see several ways to do that.

  • Enable reuselanguageinvoker and simply keep the list if programmeId's in memory. That's is what I did in viwX to solve the same problem and that works fine. But enabling reuselanguageinvoker will effect other parts of the addon and that must be properly checked.
  • Store data as property of a global window. I've seen this suggested on the forum several times, but it just feels very hacky to me. Still, it's easy to implement and data is available for as long Kodi is alive. Maybe it's not so bad an idea after all.
  • Write and read to file, e.g. as JSON string. Involves frequent writes and might be relatively slow, in particular on systems with SD cards.
  • Use an addon like SimpleCache. With the overhead that of that it is, I think, only interesting if there is a lot more to cache. Which is something to consider BTW. All these options are easy to integrate, it's more a matter of indecision on my part as to which route to take.

2) Get programme ID's for all items in normal listings. This is where I definitely could use some help. At least all programme folders in all lists should have a context menu to add/remove them from 'Added'. I don't think users should be able to add individual episodes, as that gives the impression that you add the actual episode while in fact the whole programme is added. Unless it's a programme with a single episode of course. I guess that the programme ID is available in most, if not all, data, so the parser should be able to extract that. The big question is: where and for what kind of list is programme data extracted in ParseJSON(...) and ParseSingleJSON(...)?

3) Create the context menu. The appropriate place, I guess, is to handle it in AddMenuEntry(...) . If and which context menu to add can easily be established by checking the mode and the presence of a programmeID in the arguments.

Well, this is in all a much bigger story than the actual problem, but there you are. I'm very interested in your views on 1) and would very much appreciate any help or guidance on 2). And other remarks or ideas are very welcome too of course.

dimkroon avatar Jan 02 '24 21:01 dimkroon

I am starting to get what you are trying to achieve.

Regarding caching there must be a better way. How does the website/browser “know” which programme has been added and which has not? My bet is, that the state is an extra field in the JSON that is parsed for every page. If I am right, we just need to extend the JSON parser to recognize and store this field. Let’s analyze first how the website and a browser interact for added programmes. I never looked into this because I didn’t know how to implement context menus in add-ons.

As a general recommendation: we should try to do things as close as possible to a regular browser. This will dramatically reduce chances of things breaking.

CaptainTK avatar Jan 03 '24 07:01 CaptainTK

I had checked, and should have explained. Unfortunately here is no data on the page itself. On the iplayer website adding a programme is only possible on an episode page. Right after that page has loaded the script on that page requests https://user.ibl.api.bbc.co.uk/ibl/v1/user/adds/{PROGRAMME ID}, which returns status 200 when the programme is already added and 404 when not.

So, the website evaluates after the user has selected the programme, i.e. episode. This is practically not possible in the add-on, it must know which programmes are on the 'Added' list before list items are created. To prevent an additional web request each time a user selects an item caching is kinda unavoidable. I don't feel there's anything wrong with caching. Quite the opposite actually.

As an alternative we could create a context menu 'Remove' on all items in the 'Added' list, and a context menu 'Add' on items in normal listings. This way there's no need to compare programme ID's, so no need to cache anything, but programs that are already on the 'Added' list get the 'Add' context menu just the same, which is rather odd. The BBC has no problem with that, but I don't think it's worth sacrificing user experience just to prevent caching.

Regarding keeping things close to regular website behaviour, I totally agree. Maybe not so much because of stability, but if something breaks it's much easier to adapt if you can verify how the website has implemented the changes. In that respect I think it's perfectly OK to use the data from the 'Added' page for this purpose.

dimkroon avatar Jan 04 '24 02:01 dimkroon

OK, I see what you are getting at. You are trying to implement something in the add-on, that goes beyond what the website offers.

I don’t believe caching is the right answer to that. How often would the cache be refreshed? I am thinking of scenarios in which users work on multiple devices. For instance, they could add some programmes using the official app on their phone on the way home, then switch on their TV with a Kodi instance that is running 24/7 and continue selecting shows there.

I don’t think you can make this perfect without reloading the list every time. However, that would create a load on the Beeb‘s side that they would mind for sure.

So user experience will not be perfect anyways. The question is: What is better, having a cache that is sometimes wrong or having no cache and a message if programmes are added again? I would argue the latter has a smaller error potential and should be prefered.

CaptainTK avatar Jan 05 '24 07:01 CaptainTK

2) Get programme ID's for all items in normal listings. This is where I definitely could use some help. At least all programme folders in all lists should have a context menu to add/remove them from 'Added'. I don't think users should be able to add individual episodes, as that gives the impression that you add the actual episode while in fact the whole programme is added. Unless it's a programme with a single episode of course. I guess that the programme ID is available in most, if not all, data, so the parser should be able to extract that. The big question is: where and for what kind of list is programme data extracted in ParseJSON(...) and ParseSingleJSON(...)?

This should be quite easy IMHO. If you want to add a whole series, the ID appears to be the ID of the master folder, e.g.

  • b006t1p7 for https://www.bbc.co.uk/iplayer/episodes/b006t1p7/waterloo-road
  • b006q2x0 for https://www.bbc.co.uk/iplayer/episodes/b006q2x0/doctor-who

These should be equivalent to the episodes_url in ParseSingleJSON. The ID is already extracted, but not stored as it was not necessary.

Whereas individual episodes have their own IDs:

  • b01mfwbx for https://www.bbc.co.uk/iplayer/episode/b01mfwbx/waterloo-road-series-8-episode-2
  • b04grqgx for https://www.bbc.co.uk/iplayer/episode/b04grqgx/doctor-who-series-8-2-into-the-dalek

These should be equivalent to main_url in ParseSingleJSON. Again, the ID is already extracted, but not stored as it was not necessary.

In essence, the data is already there and just needs to be returned. Most likely it will be best to change the whole mechanism, and not return a URL any more, but just the ID and a flag what kind of URL should be constructed from it. Then, the ID can be used both for setting up a new directory (or playable), and for using it in the context menu to get added.

By the way, this is (IMHO) the ultimate argument against caching the added state in the add-on: If you add a whole series, you would need to check all episodes contained in the series and also add them to the cached list of episodes already added. It will just never work out right. So my suggestion is this:

  1. We do not implement a cache.
  2. The "Add" context menu is added for all non-live [sic] programmes, no matter if they have already been added or not.
  3. When trying to add, the user should get feedback in the form of a pop-up/error message saying either "Successfully added to your list" or "Item was already added or cannot be added".

At least this behaviour will be fully consistent, as it checks the success for every instance of adding a programme and return an immediate success/fail result.

CaptainTK avatar Jan 07 '24 08:01 CaptainTK

You are trying to implement something in the add-on, that goes beyond what the website offers.

I fear I've not explained very well. On the website user's can either add to or remove from 'Added', depending on whether the programme is already on the 'Added' list. My aim is to implement something similar in Kodi. It's is just because of the way Kodi works that it cannot be implemented in exactly the same way as the website does, but I don't think it's going beyond what the website does.

A cache expiring after, let's say, 5 minutes, it would quite adequately address the majority of issues like the one you mentioned and ensure that the add-on remains as responsive as usual - someone continuously navigating the add-on would only experience a tiny delay in page load once every five minutes. It still leaves a gap of 5 minutes (or whatever period you choose), of course. However, I would argue that a person who has added a programme on another device less than 5 minutes ago, now navigates in Kodi to the very programme he has just added and tries to add it again, is not using it in an normal way, but is rather testing the behaviour of the add-on.

It's important to note that I'm trying to get the best user experience, not to avoid errors interacting with BBC. It's no problem to add the same programme many times. Each time the BBC happily returns exactly the same success response. In my opinion it's poor user experience to first offer users the option to add a programme and when they actually do say; "oh sorry, you can't". The fact that there might be a remote possibility that the user is presented with the wrong option is for me no reason to try to avoid it as much as possible.

The question is: What is better, having a cache that is sometimes wrong or having no cache and a message if programmes are added again?

These are not mutually exclusive. It is no problem to show an error message in the rare event the cache is 'wrong'. However, if both the user and the add-on don't know the programme is already on the list, I would suggest to just complete the operation without error. The user experiences nothing unexpected and everything is up to date from then on.

By the way, this is (IMHO) the ultimate argument against caching the added state in the add-on: If you add a whole series, you would need to check all episodes contained in the series and also add them to the cached list of episodes already added"

That's not entirely a correct assessment. It is only possible to add programmeIDs, episodeID's are not allowed. Athough the 'Add' buttons are on an episode pages, but the browser does send the ProgrammeID when the button is pressed. So for all episodes we must somehow obtain the programmeID in the parser, but that does not affect the cache. I see it more as an argument for caching, actually; once one a programme is added the user is presented with a 'Remove' option in all related episodes and as such is informed that adding another episode is not possible. All very much like the web site.

Since it's pretty easy to implement I would like to suggest to just write the implementation, so you see and check how that works in practice. And if you still object it's only a small change to the alternative.

Back to business:

In essence, the data is already there and just needs to be returned. Most likely it will be best to change the whole mechanism, and not return a URL any more, but just the ID and a flag what kind of URL should be constructed from it. Then, the ID can be used both for setting up a new directory (or playable), and for using it in the context menu to get added."

Yes, I see. However, there are more items like groups and series folders that use the same callback function with different URLs, I believe. That's all together a bit of rework that has to be done very carefully, but is rather out of scope of this PR, I feel.

My main problem is that the parsers are very complex and I don't have a clue which piece of code is responsible for which piece of json data, in what situation. It would help enormously if you could give me some hints as to what type of page and what type of data is parsed where, so I can create tests and figure out where in that data a programmaID can be found.

From you post I gather:

  • If I check the ParseJson() for code that generates urls like https://www.bbc.co.uk/iplayer/episodes/xxx, will I find all pieces of code that produce programme folders?
  • ParseSingleJson() exlusively produces episodes, i.e playables. Or can ParseJson() produce playables too?
  • The associated programmeID should be available from the part of ParseJson() that called ParseSingleJson().

Would that be right, or is there more to it?

dimkroon avatar Jan 15 '24 06:01 dimkroon

I fear I've not explained very well. On the website user's can either add to or remove from 'Added', depending on whether the programme is already on the 'Added' list. My aim is to implement something similar in Kodi. It's is just because of the way Kodi works that it cannot be implemented in exactly the same way as the website does, but I don't think it's going beyond what the website does.

But it does. The add-on scrapes overview pages on which users cannot add or remove the programme. Take "Most Popular" as an example: https://www.bbc.co.uk/iplayer/most-popular. There is not a single button to add or remove. The same applies to A-Z or Categories. Only once you go for a specific episode, e.g. Eastenders (https://www.bbc.co.uk/iplayer/episode/m001vnlc/eastenders-25012024) you get the button to add/remove this episode, but not any others that are shown below. Again, as a website user, you would need to select each episode, then click add/remove for this episode. The add-on will create playables for all of these episodes. So while the website just checks the added status for the current episode, the add-on lists all of them. If you want to enable add/remove for all lists in the add-on, it is a functionality that is well beyond what is available the website.

These are not mutually exclusive. It is no problem to show an error message in the rare event the cache is 'wrong'. However, if both the user and the add-on don't know the programme is already on the list, I would suggest to just complete the operation without error. The user experiences nothing unexpected and everything is up to date from then on.

Well then why have a cache at all? There seems to be no harm in allowing users to add the same episode multiple times. Downside: Removal would only be possible from the "Added" list, not from anywhere. But again, it is not possible on the website either to remove episodes from a list. You would need to select each episode separately to remove it. Except for "Added", there is no list on the website which allows you to delete multiple episodes from the list, but you try to implement that in the add-on.

That's not entirely a correct assessment. It is only possible to add programmeIDs, episodeID's are not allowed.

I think we either see different websites or we need to clarify what a programmID vs. an episodeID is.

Take https://www.bbc.co.uk/iplayer/episode/m001ttzy/waterloo-road-series-13-episode-8 as an example: the code for the "Add+" button adds the (single) episodeID m001ttzy.

Take https://www.bbc.co.uk/iplayer/episodes/b006t1p7/waterloo-road as another example: the coder for the "Add+" button adds the (multiple episodes) programmeID b006t1p7.

So as far as I can see both episodeIDs and programmeIDs can be added and they have a very different meaning.

My main problem is that the parsers are very complex and I don't have a clue which piece of code is responsible for which piece of json data, in what situation. It would help enormously if you could give me some hints as to what type of page and what type of data is parsed where, so I can create tests and figure out where in that data a programmaID can be found.

As I explained before, the IDs used in ParseSingleJSON are what you are looking for. It can be episodeIDs or programmeIDs. Whenever you see main_url = 'https://www.bbc.co.uk/iplayer/episode/' + item.get('id') the ID added (item.get('id') is an episodeID. Whenever you see main_url = 'https://www.bbc.co.uk/iplayer/episodes/' + item.get('id') the ID added (item.get('id') is a programmeID.

If I check the ParseJson() for code that generates urls like https://www.bbc.co.uk/iplayer/episodes/xxx, will I find all pieces of code that produce programme folders?

No, you will get folders and playables. A playable in Kodi is somewhat (but not completely) equivalent to an episode page on the website. The concepts are just very different.

ParseSingleJson() exlusively produces episodes, i.e playables.

No, it can also produce folders, see above.

Or can ParseJson() produce playables too?

No, it cannot, and as far as I can see none of the folders it adds qualify for being added or removed. They don't appear to be "episodes", but rather special things like groups/collections.

Actually playables vs. folders is quite easy: All playables are created using CheckAutoplay. All folders are created using AddMenuEntry.

The associated programmeID should be available from the part of ParseJson() that called ParseSingleJson().

Again, it can be programmeIDs or episodeIDs, and both can be added (or removed).

CaptainTK avatar Jan 27 '24 14:01 CaptainTK

But it does. ... So while the website just checks the added status for the current episode, the add-on lists all of them. If you want to enable add/remove for all lists in the add-on, it is a functionality that is well beyond what is available the website.

The intended functionality is to provide users the option to add or remove items. In Kodi we do not have a dedicated episode page, so we simply cannot do it in the same way the website does. The only thing we got in Kodi are listings and that the only place possible to add such functionality. Unless you want to create an extra page before you play an episode, just to present the option to add/remove, but that will be very poor user experience. And we don't have buttons either, so in Kodi it's done by context menu. If you insist that implementing a functionality, similar to the website, but in a Kodi specific way is going beyond what the website does, then we just have different views on the concept of 'going beyond'.

Well then why have a cache at all?

Again, because of a decent user experience in general - not to present user with the option to add a programme, when it is already added. And to align user experience with that on the website - user get the option to remove when a programme is already added.

Except for "Added", there is no list on the website which allows you to delete multiple episodes from the list, but you try to implement that in the add-on.

Again, because in Kodi is not a website with a dedicated episode page; there is no other place to do it. See above.

Take https://www.bbc.co.uk/iplayer/episode/m001ttzy/waterloo-road-series-13-episode-8 as an example: the code for the "Add+" button adds the (single) episodeID m001ttzy.

Trusting that this is not just a bold assumption, but has actually been tested, I can only conclude that we are indeed seeing different websites. If I press the "+ADD'' button on that page, I find it results in a POST request to https://user.ibl.api.bbc.co.uk/ibl/v1/user/adds with {"id": "b006t1p7"} in the body. This ID is the programme ID of Waterloo Road, not the episode ID if Waterloo Road series 13 episode 8. Which only confirms what I've already mentioned several times; you can only add programmeID's, not episodeID's.

Regarding the parser: If I am to make changes to the parser I must be able to establish if these changes do not inadvertently affect other data. There is apparently no way to establish which part of the parser is responsible for parsing which piece of data from which piece of the website, other than just manually test every possible list the add-on produces and hoping that you've not missed one.

I believe the current parsing process is very much in need of a revision. I appreciate that to have one single parser may have seemed a good idea at the time, and many years of maintenance may have lead to the state that it is now, but if the add-on is to remain maintainable I feel a thorough rework of the parsing process is necessary. If I am to spent time on the parser I'd rather do that than trying to figure out how a function works that in my opinion has no future. However, I don't have much time lately, so it will take some time.

I will probably figure out some way of implementing add and remove some day, but for now I leave the PR as is.

dimkroon avatar Jan 28 '24 19:01 dimkroon

Trusting that this is not just a bold assumption, but has actually been tested, I can only conclude that we are indeed seeing different websites. If I press the "+ADD'' button on that page, I find it results in a POST request to https://user.ibl.api.bbc.co.uk/ibl/v1/user/adds with {"id": "b006t1p7"} in the body. This ID is the programme ID of Waterloo Road, not the episode ID if Waterloo Road series 13 episode 8. Which only confirms what I've already mentioned several times; you can only add programmeID's, not episodeID's.

I got to apologize. I just checked the code of the website, but never actually added or removed anything until now. You are correct. Even though the controls on the website refer to the episodeID they add only the programmeID.

On a side note: I find that a rather strange concept of the website. I would not want to add a whole series if I already watched half of the seasons. Not for us to judge, though, this is how the website works.

Regarding the parser: If I am to make changes to the parser I must be able to establish if these changes do not inadvertently affect other data. There is apparently no way to establish which part of the parser is responsible for parsing which piece of data from which piece of the website, other than just manually test every possible list the add-on produces and hoping that you've not missed one.

I believe the current parsing process is very much in need of a revision. I appreciate that to have one single parser may have seemed a good idea at the time, and many years of maintenance may have lead to the state that it is now, but if the add-on is to remain maintainable I feel a thorough rework of the parsing process is necessary. If I am to spent time on the parser I'd rather do that than trying to figure out how a function works that in my opinion has no future. However, I don't have much time lately, so it will take some time.

The parser is in a pretty good shape right now. It used to be scattered across multiple functions to extract everything from HTML. Since they added JSON, life has become a lot smoother. The different cases are simply required to take care of some special cases like Most Popular, Categories, or A-Z. I would not recommend starting from scratch as you would need to take care of each of these things again.

So the question boils down to this: How do we know which episodeID belongs to which programmeID to know if it has been added or not?

  • For standard pages this is quite simple. In the "episode" section of the JSON, there is the "tleoID" field, which is what we are looking for.
  • However, standard pages may also have "related_episodes" that get parsed and added to the list. I assume they belong to the same tleoID, but there may be exceptions.
  • Most Popular is different. It contains single episodes without a programmeID in the JSON. Also, you cannot add/remove these on the website.
  • Categories does not offer add/remove on the website, but it does contain the programmeID in the JSON if available. Unfortunately, it is not "tleoID", but "tleo"->"id".
  • A-Z does not offer add/remove on the website, and it does not contain the programmeID in the JSON. Much like Most Popular.
  • Channel Highlights does not offer add/remove on the website, and it does not contain the programmeID in the JSON. Much like Most Popular.
  • Main Highlights does not offer add/remove on the website, but it does contain the programmeID for some programmes in the JSON. "tleo"->"id" like in Categories.

If we are unable to get the programmeID when scraping the website, the only alternative would be to do a recursive seach on the Added page to test for each episode that has been added. I am sure you agree this is not manageable.

For some cases, like A-Z, it would also work to drill down into each episode link because these standard pages do have the programmeID. That would mean calling hundreds of extra pages recursively for each A-Z page. Again I am sure you agree this is not manageable.

This leads me to ask: When should we try to offer add/remove, when not, if even the website is inconsistent about it? I have no answer to that at the moment.

This question is totally independent of the discussion of having a cache or not.

CaptainTK avatar Jan 29 '24 19:01 CaptainTK