youtube-dl
Skillshare
Checklist
- [x] I'm reporting a new site support request
- [x] I've verified that I'm running youtube-dl version 2021.12.17
- [x] I've checked that all provided URLs are alive and playable in a browser
- [x] I've checked that none of provided URLs violate any copyrights
- [x] I've searched the bugtracker for similar site support requests including closed ones
Example URLs
- Playlist: https://www.skillshare.com/en/classes/Understanding-and-Painting-the-Head/62353942/projects
Description
I would like to see a new extractor for Skillshare so I can make backups of my classes (e.g. videos).
Same request: https://github.com/yt-dlp/yt-dlp/issues/5813
The free Intro class page has useful data in its `<meta>` tags, which strangely are placed in the `<body>` rather than the `<head>`.
Metadata from the `og:` properties:
- title
- description
- image --> thumbnail
Video data from the `twitter:player` properties:
- width
- height
- stream: "https://www.skillshare.com/en/sessions/download?id=3113009"
- stream:content_type: "video/mp4"
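As a rough illustration of reading those properties, here is a minimal sketch using only the Python standard library. It is not youtube-dl code: the `MetaCollector` class, the `collect_meta` helper, and the sample page fragment are made up for this example, and the width, height, and title values are placeholders. Whether the real page uses `property` or `name` attributes is also an assumption, so the sketch accepts both.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> key/value pairs, regardless of where the tags sit."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("property") or attrs.get("name")
        value = attrs.get("content") or attrs.get("value")
        if key and value:
            # Keep the first occurrence of each key
            self.meta.setdefault(key, value)


def collect_meta(html):
    """Hypothetical helper: parse a page and return its meta properties."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.meta


# Hypothetical fragment modelled on the properties listed above;
# note the <meta> tags are inside <body>, as on the Intro class page.
SAMPLE_HTML = """
<html><head><title>Intro</title></head><body>
<meta property="og:title" content="Example class">
<meta name="twitter:player:width" content="1280">
<meta name="twitter:player:height" content="720">
<meta name="twitter:player:stream" content="https://www.skillshare.com/en/sessions/download?id=3113009">
<meta name="twitter:player:stream:content_type" content="video/mp4">
</body></html>
"""

meta = collect_meta(SAMPLE_HTML)
print(meta["twitter:player:stream"])
print(meta["twitter:player:stream:content_type"])
```

A parser-based approach like this does not care that the tags are in the `<body>`, which is one reason the unusual placement should not matter for extraction.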
If the subscriber videos have the same structure, passing cookies from a logged-in browser session with `--cookies ...` would give access through the same extraction. However, the subscriber video pages may be more complex.
In fact, the generic extractor should handle the free videos, since the `twitter:player:stream` property is found. The extractor rejects it because the stream URL has no extension, but as the `content_type` is provided, that should be enough. Something like this:
# twitter:player:stream should be checked before twitter:player since
# it is expected to contain a raw stream (see
# https://dev.twitter.com/cards/types/player#On_twitter.com_via_desktop_browser)
- found = filter_video(re.findall(
- r'<meta (?:property|name)="twitter:player:stream" (?:content|value)="(.+?)"', webpage))
+ found = re.findall(
+ r'<meta (?:property|name)="twitter:player:stream" (?:content|value)="(.+?)"', webpage)
+ if found:
+ ext = mimetype2ext(get_first(re.findall(
+ r'<meta (?:property|name)="twitter:player:stream:content_type" (?:content|value)="(.+?)"', webpage), []))
+ found = found[:1] if ext else filter_video(found)
if not found:
# We look for Open Graph info:
# We have to match any number spaces between elements, some sites try to align them (eg.: statigr.am)
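The idea behind the patch can be tried outside youtube-dl with a standalone sketch like the one below. The regexes are the ones from the diff. Everything else is an assumption for illustration: `MIME_TO_EXT` is a tiny stand-in for `mimetype2ext`, `has_video_extension` is a simplified stand-in for `filter_video`, and `pick_streams` is a hypothetical wrapper name.

```python
import re
from urllib.parse import urlparse

# Hypothetical stand-in for mimetype2ext(); only a few types are mapped
MIME_TO_EXT = {"video/mp4": "mp4", "video/webm": "webm", "video/x-flv": "flv"}

STREAM_RE = r'<meta (?:property|name)="twitter:player:stream" (?:content|value)="(.+?)"'
TYPE_RE = r'<meta (?:property|name)="twitter:player:stream:content_type" (?:content|value)="(.+?)"'


def has_video_extension(url):
    """Simplified stand-in for filter_video(): check the URL path extension."""
    path = urlparse(url).path
    ext = path.rpartition(".")[2].lower() if "." in path else ""
    return ext in MIME_TO_EXT.values()


def pick_streams(webpage):
    """Return candidate stream URLs, trusting a declared content type."""
    found = re.findall(STREAM_RE, webpage)
    if not found:
        return []
    types = re.findall(TYPE_RE, webpage)
    ext = MIME_TO_EXT.get(types[0]) if types else None
    if ext:
        # A recognised content type is declared: keep only the first stream
        return found[:1]
    # Otherwise fall back to filtering by URL extension
    return [url for url in found if has_video_extension(url)]


stream_tag = ('<meta name="twitter:player:stream" '
              'content="https://www.skillshare.com/en/sessions/download?id=3113009">')
type_tag = '<meta name="twitter:player:stream:content_type" content="video/mp4">'

with_type = pick_streams(stream_tag + "\n" + type_tag)
without_type = pick_streams(stream_tag)
print(with_type)
print(without_type)
```

With the content type present, the extensionless URL is kept. Without it, the extension check rejects the URL, which matches the current behaviour described above.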
Any updates on this extractor?
Related to https://github.com/ytdl-org/youtube-dl/issues/9769