
Episode extraction not working for tv series that don't have seasons

Open kkel568 opened this issue 1 year ago • 2 comments

Issue description

I am trying to extract the episode list from TV series, and I've come across an issue when a series doesn't have a season structure, for example the majority of soap operas, like Emmerdale Farm. On the episode list of its IMDb page (https://www.imdb.com/title/tt0068069/episodes/?year=2023&ref_=tt_ov_epl) the episodes are organised by year rather than by season. All of the episodes fall under a blank season, but there are also two seasons in the list (called 1 and unknown) that have no episodes under them. When I try to use ia.update(m, 'episodes') it seems to fall over at season 1, I'm assuming because there are no episodes there.

Is there any way to update this so it could extract all the episodes without requiring seasons, or so it could extract by year for these types of TV shows?

Version of Cinemagoer, Python and OS

Python: 3.10
Cinemagoer version: 2022.12.27
OS: 13.2.1

Steps to reproduce the issue

from imdb import Cinemagoer

ia = Cinemagoer()
m = ia.get_movie('0068069')
ia.update(m, 'episodes')

What's the expected result?

I expect the Cinemagoer instance to update the series, and then to be able to extract the episodes from it.

What's the actual result?

Instead I get this exception:

2023-04-01 15:01:36,546 CRITICAL [imdbpy] /Users/--------/anaconda3/lib/python3.10/site-packages/imdb/_exceptions.py:32: IMDbDataAccessError exception raised; args: ({'errcode': None, 'errmsg': 'None', 'url': 'https://www.imdb.com/title/tt0068069/episodes?season=1', 'proxy': '', 'exception type': 'IOError', 'original exception': <HTTPError 400: ''>},); kwds: {}

along with this error:


KeyError                                  Traceback (most recent call last)
Cell In[214], line 3
      1 ia = Cinemagoer()
      2 m = ia.get_movie('0068069')
----> 3 ia.update(m, 'episodes')

File ~/anaconda3/lib/python3.10/site-packages/imdb/__init__.py:848, in IMDbBase.update(self, mop, info, override)
    846     method = lambda *x: {}
    847 try:
--> 848     ret = method(mopID)
    849 except Exception:
    850     _imdb_logger.critical(
    851         'caught an exception retrieving or parsing "%s" info set'
    852         ' for mopID "%s" (accessSystem: %s)',
    853         i, mopID, mop.accessSystem, exc_info=True
    854     )

File ~/anaconda3/lib/python3.10/site-packages/imdb/parser/http/__init__.py:667, in IMDbHTTPAccessSystem.get_movie_episodes(self, movieID, season_nums)
    665 nr_eps += len(other_d['data']['episodes'].get(season) or [])
    666 if data_d:
--> 667     data_d['data']['episodes'][season] = other_d['data']['episodes'][season]
    668 else:
    669     data_d = other_d

KeyError: -1
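
The KeyError is raised when the merge step indexes a season that the parsed page did not return. The following is a minimal sketch, not Cinemagoer's actual code, of a merge that skips such seasons instead of raising. The helper name `merge_season` and the sample data are hypothetical; only the idea of a season-to-episodes mapping is taken from the traceback.

```python
def merge_season(merged, parsed, season):
    """Copy one season's episodes from `parsed` into `merged`, if present.

    Hypothetical helper: both arguments are plain dicts that map a season
    number to that season's episodes. Returns the number of episodes copied.
    """
    season_eps = parsed.get(season)  # .get() avoids KeyError for absent seasons
    if not season_eps:
        return 0  # nothing was parsed for this season, so skip it
    merged[season] = season_eps
    return len(season_eps)


# Made-up sample data: season 1 was parsed, season -1 ("unknown") was not.
merged = {}
parsed = {1: {1: 'Episode 1', 2: 'Episode 2'}}
total = sum(merge_season(merged, parsed, season) for season in (1, -1))
```

With this shape, a season that has no parsed episodes contributes nothing to the result and does not abort the whole update.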

kkel568, Apr 01 '23 14:04

I think there are also some server-side issues.

On this page, selecting the "unknown" season returns no results.

You can get a single season with something like this, but in this case it doesn't seem to help (due to a timeout):

from imdb import Cinemagoer

ia = Cinemagoer()
m = ia.get_movie('0068069')
ia.update_series_seasons(m, season_nums=['-1'])
print(m['episodes'])

I'm not sure how this can be solved, at the moment.

alberanid, Apr 30 '23 13:04

This bug has been discussed and marked as solved before, but it's still there. I think IMDb lists all episodes of this kind of show under season 1.

However, I think what is causing the bug right now is not that the episodes are organized by year instead of by season, but that there are more than 50 episodes in a season. IMDb only lists the first 50 episodes of each season, and I don't think it can be made to load more.

For example, Adventure Time season 5 has 52 episodes and One Piece has one season and 1k+ episodes.

onePiece = ia.get_movie("0388629")
adventureTime = ia.get_movie("1305826")

The two calls below raise the same error:

ia.update(onePiece, "episodes")
ia.update(adventureTime, "episodes")


AttributeError                            Traceback (most recent call last)
Cell In[117], line 1
----> 1 ia.update(adventureTime,"episodes")

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\imdb\__init__.py:848, in IMDbBase.update(self, mop, info, override)
    846     method = lambda *x: {}
    847 try:
--> 848     ret = method(mopID)
    849 except Exception:
    850     _imdb_logger.critical(
    851         'caught an exception retrieving or parsing "%s" info set'
    852         ' for mopID "%s" (accessSystem: %s)',
    853         i, mopID, mop.accessSystem, exc_info=True
    854     )

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\imdb\parser\http\__init__.py:665, in IMDbHTTPAccessSystem.get_movie_episodes(self, movieID, season_nums)
    663 except:
    664     pass
--> 665 nr_eps += len(other_d['data']['episodes'].get(season) or [])
    666 if data_d:
    667     data_d['data']['episodes'][season] = other_d['data']['episodes'][season]

AttributeError: 'list' object has no attribute 'get'
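
The AttributeError indicates that `other_d['data']['episodes']` was a list rather than a dict for this page. As a sketch only, assuming the `{'data': {'episodes': ...}}` shape visible in the traceback, a small guard could normalize that value before `.get()` is called. The function name `episodes_by_season` is hypothetical and is not part of Cinemagoer.

```python
def episodes_by_season(parsed):
    """Return the season -> episodes mapping from a parsed page.

    Hypothetical guard: falls back to an empty dict whenever 'data' or
    'episodes' is missing or is not a dict (for example, an empty list).
    """
    data = parsed.get('data') if isinstance(parsed, dict) else None
    if not isinstance(data, dict):
        return {}
    episodes = data.get('episodes')
    return episodes if isinstance(episodes, dict) else {}


# Made-up inputs: one well-formed page and one where 'episodes' is a list.
good_page = {'data': {'episodes': {5: {1: 'Episode 1'}}}}
bad_page = {'data': {'episodes': []}}

count_good = len(episodes_by_season(good_page).get(5) or [])
count_bad = len(episodes_by_season(bad_page).get(5) or [])
```

This only prevents the crash; it does not recover episodes that the page never returned.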

However, ia.update(adventureTime,"episodes") fails much later than ia.update(onePiece,"episodes"). So I believe the update successfully reaches season 5 but can't continue, because season 5 is the first season that has more than 50 episodes. For One Piece, ia.update fails much sooner, which I believe is because its first season already has more than 50 episodes.

(Edit: No, I found that what causes the error is a season named "unknown".)

Also, I can fetch episodes from single seasons with ia.update_series_seasons. However, update_series_seasons(adventureTime,5) and update_series_seasons(onePiece,1) only return the first 50 episodes of the specified season.

So, if we can somehow fetch seasons completely, I think it will return all episodes.
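
Until that is possible, one way to keep a single problematic season (such as "unknown") from aborting everything is to fetch seasons one at a time and record the ones that fail. The sketch below is an assumption-laden illustration: `collect_seasons` and `fetch_season` are hypothetical names, and `fetch_season` stands in for whatever call retrieves one season (for instance, a wrapper around ia.update_series_seasons). It does not lift the 50-episode limit described above.

```python
def collect_seasons(fetch_season, season_nums):
    """Fetch each season separately and keep whatever succeeds.

    `fetch_season` is a caller-supplied callable (hypothetical) that takes a
    season number and returns a dict of episodes, or raises on failure.
    Returns (episodes, failed): a season -> episodes dict and a list of
    season numbers whose fetch raised an exception.
    """
    episodes, failed = {}, []
    for season in season_nums:
        try:
            season_eps = fetch_season(season)
        except Exception:  # broad on purpose: any failure skips that season
            failed.append(season)
            continue
        if isinstance(season_eps, dict) and season_eps:
            episodes[season] = season_eps
    return episodes, failed


# Stand-in fetcher with made-up data, so the sketch runs without network access.
def fake_fetch(season):
    if season == -1:
        raise KeyError(season)  # simulates the failing "unknown" season
    return {1: 'Episode 1'} if season == 1 else {}


episodes, failed = collect_seasons(fake_fetch, [1, 2, -1])
```

Here `episodes` holds only the seasons that returned data, and `failed` lists the seasons that could be retried or reported separately.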

HalBenHB, Sep 12 '23 20:09