Use a more robust parser for session data messages

Open interfect opened this issue 3 years ago • 1 comments

I was having trouble getting the currently playing playlist: the API was returning string data that could not be parsed as JSON by the existing find("[") code:

Traceback (most recent call last):
  File "/Users/anovak/personal/pychromecast/yt.py", line 66, in <module>
    print(yt._session.get_queue_videos())
  File "/Users/anovak/personal/pychromecast/venv/lib/python3.9/site-packages/casttube/YouTubeSession.py", line 150, in get_queue_videos
    queue_playlist_id = self.get_queue_playlist_id()
  File "/Users/anovak/personal/pychromecast/venv/lib/python3.9/site-packages/casttube/YouTubeSession.py", line 138, in get_queue_playlist_id
    session_data = self.get_session_data()
  File "/Users/anovak/personal/pychromecast/venv/lib/python3.9/site-packages/casttube/YouTubeSession.py", line 129, in get_session_data
    response_list = json.loads(response_text[response_text.find("["):])
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/json/decoder.py", line 340, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 15 (char 14)

This PR adds code that decodes the length-prefixed JSON message list I am getting in a more principled way.
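
For illustration, here is a minimal sketch of what decoding a length-prefixed message list could look like. It is not the code from this PR. It assumes each chunk is a decimal length on its own line followed by that many characters of JSON, and that the length counts characters rather than bytes; the function name `parse_length_prefixed_json` is hypothetical.

```python
import json


def parse_length_prefixed_json(text):
    """Decode chunks shaped like '<length>\\n<json>' into a list of JSON values.

    Assumption: <length> is a decimal count of the characters that follow
    the newline. The real wire format may differ (e.g. count bytes).
    """
    messages = []
    pos = 0
    while pos < len(text):
        newline = text.find("\n", pos)
        if newline == -1:
            break  # no complete length line remains
        length_field = text[pos:newline].strip()
        if not length_field:
            pos = newline + 1  # skip blank separator lines
            continue
        length = int(length_field)  # raises ValueError on a malformed prefix
        start = newline + 1
        messages.append(json.loads(text[start:start + length]))
        pos = start + length
    return messages
```

Unlike `json.loads(response_text[response_text.find("["):])`, which fails with "Extra data" as soon as a second chunk follows the first, this walks the response one chunk at a time and returns one decoded value per chunk.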

Unfortunately, this isn't sufficient to actually retrieve the playlist data, because:

  1. Sometimes the nowPlaying message is absent from the returned list (does the bind endpoint need to be called again to stream more events?)
  2. When the playlist ID can be obtained, the watch_queue_ajax request can be made, but YouTube has started returning a 400 Bad Request error for it instead of playlist contents. Another API endpoint that returns the playlist contents may need to be determined.

But this at least prevents a JSON parsing error from being thrown.

interfect, Dec 27 '21 20:12

The underlying reason we sometimes don't get the playlist ID in get_session_data() is that all calls to BIND_URL return events from a single shared event stream, and the YouTube side keeps track of the events we've already been sent. So sometimes an earlier call to the URL (in e.g. _bind() or _queue_action()) will return the event we care about, and it won't be available when get_session_data() goes looking for it.

I think the YouTubeSession needs to keep its own local view of the event stream, parse the result and update it every time BIND_URL is hit, and consult that internal model to answer questions like what the current playlist ID is.
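
A rough sketch of that idea follows. It is not existing casttube code. The `EventLog` class and its method names are hypothetical, and it assumes each decoded chunk is a list of `[index, [event_name, payload]]` entries and that the nowPlaying payload carries the playlist ID under a `listId` key.

```python
class EventLog:
    """Hypothetical local view of the event stream returned by BIND_URL."""

    def __init__(self):
        self._events = []   # every (name, payload) pair seen so far, in order
        self._latest = {}   # most recent payload for each event name

    def update(self, chunks):
        """Record the events from one parsed BIND_URL response.

        Assumption: `chunks` is a list of chunks, each chunk a list of
        [index, [event_name, payload]] entries; the payload may be absent.
        """
        for chunk in chunks:
            for entry in chunk:
                if not (isinstance(entry, list) and len(entry) == 2):
                    continue  # skip entries that don't match the assumed shape
                body = entry[1]
                if not isinstance(body, list) or not body:
                    continue
                name = body[0]
                payload = body[1] if len(body) > 1 else None
                self._events.append((name, payload))
                self._latest[name] = payload

    def latest(self, name):
        """Return the most recent payload seen for `name`, or None."""
        return self._latest.get(name)

    def playlist_id(self):
        """Return the playlist ID from the last nowPlaying event, if any."""
        now_playing = self.latest("nowPlaying")
        if isinstance(now_playing, dict):
            return now_playing.get("listId")  # "listId" is an assumed key
        return None
```

With something like this, every method that hits BIND_URL would pass its parsed response to `update()`, and get_queue_playlist_id() could call `playlist_id()` instead of depending on the nowPlaying event arriving in its own request.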

Some more information on the protocol might be gleanable from this French(?) server-side implementation: https://github.com/Lwiip/PyTube/blob/22036ffb1862d3f486aa7df40e88595f76fbfc5b/main.py

It looks like GoTubeCast ran into the problem with playlists not being fetchable anymore and just stopped fetching them in https://github.com/cbix/gotubecast/issues/21; I'm not sure if they ever figured out the new API.

interfect, Dec 27 '21 20:12