yt-dlp icon indicating copy to clipboard operation
yt-dlp copied to clipboard

Fix deezer extractor

Open LucBerge opened this issue 2 years ago • 16 comments

Description of the pull request and other information

This pull request fix the Deezer extractor with full song support and closes #1591.

  • Add blowfish encipher
  • Add deezer downloader
  • Add full song support
  • Add artist extractor
  • Add track extractor
  • Add episode extractor
  • Add show extractor

Before submitting this pull request I have:

In order to be accepted and merged into yt-dlp each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • [x] I am the original author of the DeezerExtractor code and I am willing to release it under Unlicense. See youtube-dl.
  • [x] I am not the original author of the Blowfish key computation and cipher code but it is in public domain or released under Unlicense (provide reliable evidence). See youtube-dl.

What is the purpose of my pull request?

LucBerge avatar Sep 15 '22 17:09 LucBerge

What about the failed core tests ?

LucBerge avatar Sep 15 '22 17:09 LucBerge

Dont import legacy compat variables

pukkandan avatar Sep 15 '22 17:09 pukkandan

Dont import legacy compat variables

So I copy the function code in the file ?

LucBerge avatar Sep 15 '22 17:09 LucBerge

So I copy the function code in the file ?

No, you use the functions provided by Python, for example the chr builtin instead of compat_chr. Compat functions were workarounds which are no longer needed since support for old Python versions is not required.

The decryption process could potentially be done more cleanly if the calculation of the key happens inside the downloader and is not dependant on the key property being present.

Also Cryptodome provides an internal Blowfish implementation (Cryptodome.Cipher.Blowfish). It might be worth using that one and fallback only if Cryptodome not available, like doing already for AES inside dependencies.py

Grub4K avatar Sep 15 '22 19:09 Grub4K

No, you use the functions provided by Python, for example the chr builtin instead of compat_chr. Compat functions were workarounds which are no longer needed since support for old Python versions is not required.

Done.

The decryption process could potentially be done more cleanly if the calculation of the key happens inside the downloader and is not dependant on the key property being present.

If by "inside the downloader" you mean the DeezerBaseInfoExtractor, then it's done.

Also Cryptodome provides an internal Blowfish implementation (Cryptodome.Cipher.Blowfish). It might be worth using that one and fallback only if Cryptodome not available, like doing already for AES inside dependencies.py

I'll do it later.

LucBerge avatar Sep 15 '22 20:09 LucBerge

Related PRs: https://github.com/ytdl-org/youtube-dl/pull/13194 https://github.com/ytdl-org/youtube-dl/pull/22074

EDIT: the 2nd PR I linked doesn't support full length songs, according to https://github.com/ytdl-org/youtube-dl/pull/22074#issuecomment-602095199: @Jan-NiklasB , note that the full length song support is available on the branch [deezer uncipher](https://github.com/LucBerge/youtube-dl/tree/deezer_uncipher). Unfortunately the linked repo was DMCAed when youtube-dl was DMCAed, but it might be available on the wayback machine (which seems to be down right now)

gamer191 avatar Sep 18 '22 15:09 gamer191

Unfortunately the linked repo was DMCAed when youtube-dl was DMCAed, but it might be available on the wayback machine (which seems to be down right now)

Yes it is : https://web.archive.org/web/20200917071235/https://github.com/LucBerge/youtube-dl but the deezer_uncipher branch is not.

LucBerge avatar Sep 20 '22 11:09 LucBerge

Regarding this comment (and a discussion I had with someone 😅) authentication is not used in my code. It explains why the MD5_ORIGIN key is not present and why MP3_64, MP3_128 and MP3_MISC are the only format I have access to even though all of them exists. Authentication code must be added in the futur for better song quality.

@gamer191 GitHub don't want to unlock my repo, they proposed to remove it so I can fork it again. My commits are still on my local computer, I can push it to a new github repo but it can be DMCA again for the same reason youtube-dl was. So maybe it's not the best idea.

LucBerge avatar Oct 03 '22 07:10 LucBerge

I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.

You mean generating the key dynamically based on the code you provided?

LucBerge avatar Mar 08 '23 12:03 LucBerge

I haven't actually tested this code and therefore can not confirm if it actually works or not, but the new key extractor (in theory) is definitely a much better implementation than hardcoding it for legal reasons.

You mean generating the key dynamically based on the code you provided?

Yes. I'm saying you did a good job updating that.

Ryan5453 avatar Mar 08 '23 12:03 Ryan5453

job updating tha

Thanks. What can I do for core tests failing? I have not even touch the files it fails on.

LucBerge avatar Mar 08 '23 12:03 LucBerge

Due to reasons we have discussed above (especially https://github.com/yt-dlp/yt-dlp/pull/4935#discussion_r978580754), I don't think it is a good idea to merge this code. In theory, we would have some protection by not hard-coding the key, but even so, it is a very risky move for us[^1].

Since our plugin system has now matured, I recommend distributing this yourself as a plugin instead. This way, any DMCA will only affect your plugin, and not the entire project. While there are no downloader plugins yet, you can hack it in due to the flexibility of Python

from yt_dlp.downloader import PROTOCOL_MAP
PROTOCOL_MAP['deezer'] = DeezerFD

PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs

[^1]: I have neither the time, the will, or the economic ability to fight it in court

pukkandan avatar Mar 08 '23 13:03 pukkandan

Due to reasons we have discussed above (especially #4935 (comment)), I don't think it is a good idea to merge this code. In theory, we would have some protection by not hard-coding the key, but even so, it is a very risky move for us.

It might be a better idea to dynamically generate the key or just make users provide it themselves.

The key is now dynamically generated as suggested

Since our plugin system has now matured, I recommend distributing this yourself as a plugin instead. This way, any DMCA will only affect your plugin, and not the entire project. While there are no downloader plugins yet, you can hack it in due to the flexibility of Python

from yt_dlp.downloader import PROTOCOL_MAP
PROTOCOL_MAP['deezer'] = DeezerFD

My goal in this PR is to share the extractor with others so people will be able to download too. Nobody will use my extractor as a plugin...

PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs

Any link to the issue/discussion ?

LucBerge avatar Mar 08 '23 13:03 LucBerge

Nobody will use my extractor as a plugin...

People who need it will find it. It can be added to the wiki and I could change even the current extractor warning to point to plugins if desired.

PS: Also, any PR adding new downloader is currently being held on pause since we are discussing re-working how IEs declare FDs

Any link to the issue/discussion ?

A lot of the discussion happens on discord b/w maintainers. https://github.com/yt-dlp/yt-dlp/pull/3048 is one possible implementation that came out of it.

pukkandan avatar Mar 08 '23 13:03 pukkandan

Plugin repos can be tagged with yt-dlp-plugins to be discoverable. With the new plugin system we are aiming to make it a viable alternative to having extractor support in core. It's pretty new so not too many have heard of it yet nor is there many public plugins listed yet, so will take some time.

Also, after some investigation a while back, the origins of some of this code are potentially from DMCA'd sources and likely not under unlicense (it goes further back than the PR on youtube-dl), so we likely cannot include it anyway in it's current form.

coletdjnz avatar Mar 08 '23 18:03 coletdjnz

So we likely cannot include it anyway in it's current form.

Ok. So in which form can you accept it ?

LucBerge avatar Mar 08 '23 21:03 LucBerge