yt-dlp
yt-dlp copied to clipboard
Fix deezer extractor
Description of the pull request and other information
This pull request fix the Deezer extractor with full song support and closes #1591.
- Add blowfish encipher
- Add deezer downloader
- Add full song support
- Add artist extractor
- Add track extractor
- Add episode extractor
- Add show extractor
Before submitting this pull request I have:
- [x] At least skimmed through contributing guidelines including yt-dlp coding conventions
- [x] Searched the bugtracker for similar pull requests
- [x] Checked the code with flake8 and ran relevant tests
In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:
- [x] I am the original author of the DeezerExtractor code and I am willing to release it under Unlicense. See youtube-dl.
- [x] I am not the original author of the Blowfish key computation and cipher code but it is in public domain or released under Unlicense (provide reliable evidence). See youtube-dl.
What is the purpose of my pull request?
- [x] Fix or improvement to an extractor (Make sure to add/update tests)
- [x] New extractor (Piracy websites will not be accepted)
- [x] Core bug fix/improvement
- [x] New feature (It is strongly recommended to open an issue first)
What about the failed core tests ?
Dont import legacy compat variables
Dont import legacy compat variables
So I copy the function code in the file ?
So I copy the function code in the file ?
No, you use the functions provided by Python, for example the chr
builtin instead of compat_chr
. Compat functions were workarounds which are no longer needed since support for old Python versions is not required.
The decryption process could potentially be done more cleanly if the calculation of the key happens inside the downloader and is not dependant on the key
property being present.
Also Cryptodome provides an internal Blowfish implementation (Cryptodome.Cipher.Blowfish
). It might be worth using that one and fallback only if Cryptodome not available, like doing already for AES inside dependencies.py
No, you use the functions provided by Python, for example the
chr
builtin instead ofcompat_chr
. Compat functions were workarounds which are no longer needed since support for old Python versions is not required.
Done.
The decryption process could potentially be done more cleanly if the calculation of the key happens inside the downloader and is not dependant on the
key
property being present.
If by "inside the downloader" you mean the DeezerBaseInfoExtractor
, then it's done.
Also Cryptodome provides an internal Blowfish implementation (
Cryptodome.Cipher.Blowfish
). It might be worth using that one and fallback only if Cryptodome not available, like doing already for AES insidedependencies.py
I'll do it later.
Related PRs: https://github.com/ytdl-org/youtube-dl/pull/13194 https://github.com/ytdl-org/youtube-dl/pull/22074
EDIT: the 2nd PR I linked doesn't support full length songs, according to https://github.com/ytdl-org/youtube-dl/pull/22074#issuecomment-602095199:
@Jan-NiklasB , note that the full length song support is available on the branch [deezer uncipher](https://github.com/LucBerge/youtube-dl/tree/deezer_uncipher).
Unfortunately the linked repo was DMCAed when youtube-dl was DMCAed, but it might be available on the wayback machine (which seems to be down right now)
Unfortunately the linked repo was DMCAed when youtube-dl was DMCAed, but it might be available on the wayback machine (which seems to be down right now)
Yes it is : https://web.archive.org/web/20200917071235/https://github.com/LucBerge/youtube-dl but the deezer_uncipher
branch is not.
Regarding this comment (and a discussion I had with someone 😅) authentication is not used in my code. It explains why the MD5_ORIGIN
key is not present and why MP3_64
, MP3_128
and MP3_MISC
are the only format I have access to even though all of them exists. Authentication code must be added in the futur for better song quality.
@gamer191 GitHub don't want to unlock my repo, they proposed to remove it so I can fork it again. My commits are still on my local computer, I can push it to a new github repo but it can be DMCA again for the same reason youtube-dl was. So maybe it's not the best idea.
I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.
You mean generating the key dynamically based on the code you provided?
I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.
You mean generating the key dynamically based on the code you provided?
Yes. I'm saying you did a good job updating that.
job updating tha
Thanks. What can I do for core tests failing? I have not even touch the files it fails on.
Due to reasons we have discussed above (especially https://github.com/yt-dlp/yt-dlp/pull/4935#discussion_r978580754), I don't think it is a good idea to merge this code. In theory, we would have some protection by not hard-coding the key, but even so, it is a very risky move for us[^1].
Since our plugin system has now matured, I recommend distributing this yourself as a plugin instead. This way, any DMCA will only affect your plugin, and not the entire project. While there are no downloader plugins yet, you can hack it in due to the flexibility of Python
from yt_dlp.downloader import PROTOCOL_MAP
PROTOCOL_MAP['deezer'] = DeezerFD
PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs
[^1]: I have neither the time, the will, or the economic ability to fight it in court
Due to reasons we have discussed above (especially #4935 (comment)), I don't think it is a good idea to merge this code. In theory, we would have some protection by not hard-coding the key, but even so, it is a very risky move for us.
It might be a better idea to dynamically generate the key or just make users provide it themselves.
The key is now dynamically generated as suggested
Since our plugin system has now matured, I recommend distributing this yourself as a plugin instead. This way, any DMCA will only affect your plugin, and not the entire project. While there are no downloader plugins yet, you can hack it in due to the flexibility of Python
from yt_dlp.downloader import PROTOCOL_MAP PROTOCOL_MAP['deezer'] = DeezerFD
My goal in this PR is to share the extractor with others so people will be able to download too. Nobody will use my extractor as a plugin...
PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs
Any link to the issue/discussion ?
Nobody will use my extractor as a plugin...
People who need it will find it. It can be added to the wiki and I could change even the current extractor warning to point to plugins if desired.
PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs
Any link to the issue/discussion ?
A lot of the discussion happens on discord b/w maintainers. https://github.com/yt-dlp/yt-dlp/pull/3048 is one possible implementation that came out of it.
Plugin repos can be tagged with yt-dlp-plugins to be discoverable. With the new plugin system we are aiming to make it a viable alternative to having extractor support in core. It's pretty new so not too many have heard of it yet nor is there many public plugins listed yet, so will take some time.
Also, after some investigation a while back, the origins of some of this code are potentially from DMCA'd sources and likely not under unlicense (it goes further back than the PR on youtube-dl), so we likely cannot include it anyway in it's current form.
So we likely cannot include it anyway in it's current form.
Ok. So in which form can you accept it ?