yt-dlc
[Broken]Facebook private (friends only and private groups) error handling response is broken
Checklist
- [x] I'm reporting a broken site support
- [x] I've verified that I'm running youtube-dlc version 2020.10.26
- [x] I've checked that all provided URLs are alive and playable in a browser
- [x] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [x] I've searched the bugtracker for similar issues including closed ones
Verbose log
./testdlc -v -F https://www.facebook.com/100002659934141/videos/3355692847862680/
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'-F', u'https://www.facebook.com/100002659934141/videos/3355692847862680/']
[debug] Loading archive file None
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dlc version 2020.10.25
[debug] Python version 2.7.16 (CPython) - Darwin-19.6.0-x86_64-i386-64bit
[debug] exe versions: none
[debug] Proxy map: {}
[facebook] 3355692847862680: Downloading webpage
[facebook] 3355692847862680: Downloading webpage
[facebook] 3355692847862680: Downloading webpage
ERROR: Cannot parse data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dlc -U to update. Be sure to call youtube-dlc with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "./testdlc/youtube_dlc/YoutubeDL.py", line 830, in extract_info
ie_result = ie.extract(url)
File "./testdlc/youtube_dlc/extractor/common.py", line 532, in extract
ie_result = self._real_extract(url)
File "./testdlc/youtube_dlc/extractor/facebook.py", line 484, in _real_extract
video_id, fatal_if_no_video=True)
File "./testdlc/youtube_dlc/extractor/facebook.py", line 380, in _extract_from_url
raise ExtractorError('Cannot parse data')
ExtractorError: Cannot parse data; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; type youtube-dlc -U to update. Be sure to call youtube-dlc with the --verbose flag and include its complete output.
Description
Private videos used to work fine until recently. I assume the server response from FB changed. As a result, ytdlc does not know how to handle it and no longer asks for cookies or login, for example.
Does the video have a copy URL or similar? Play the video, click the three dots in the upper right-hand corner, and see if there's a download URL or copy link that you can feed into youtube-dlc.
You are right. If the video is owned by the user, you can now download it via FB's own UI. But what if it's a video shared in a private group? Not sure. Edited: got confirmation that private group videos do not have a "download video" option available, even if you are a member. Here's a test link, in case needed for private group video: https://www.facebook.com/1051184515/videos/10220691460290341/
ytdl used to work just fine with these videos and gave either a "need to log in" or a "cookies needed" type of error. Now it just crashes. I guess it would be a good thing to bring back proper error handling for these cases.
Hey @someziggyman and FYI @blackjack4494
I was able to resolve this issue by changing one regex. I was considering making a pull request but I don't really know how this change affects the rest of the extractor and other types of videos on FB.
Here's a test link, in case needed for private group video: https://www.facebook.com/1051184515/videos/10220691460290341/
Could you test if this change works for your case?
Find this section in extractor/facebook.py#L366 and replace the regex for the fb_dtsg param. Should also supply credentials or use a cookie header of course.
        # Video info not in first request, do a secondary request using
        # tahoe player specific URL
        tahoe_data = self._download_webpage(
            self._VIDEO_PAGE_TAHOE_TEMPLATE % video_id, video_id,
            data=urlencode_postdata({
                '__a': 1,
                '__pc': self._search_regex(
                    r'pkg_cohort["\']\s*:\s*["\'](.+?)["\']', webpage,
                    'pkg cohort', default='PHASED:DEFAULT'),
                '__rev': self._search_regex(
                    r'client_revision["\']\s*:\s*(\d+),', webpage,
                    'client revision', default='3944515'),
                'fb_dtsg': self._search_regex(
-                   r'"DTSGInitialData"\s*,\s*\[\]\s*,\s*{\s*"token"\s*:\s*"([^"]+)"',
+                   r'"MRequestConfig"\s*,\s*\[\]\s*,\s*{\s*"dtsg"\s*:\s*{\s*"token"\s*:\s*"([^"]+)"',
                    webpage, 'dtsg token', default=''),
            }),
            headers={
                'Content-Type': 'application/x-www-form-urlencoded',
            })
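Since it is unclear how the swap affects pages that still use the older layout, one cautious option might be to try the new `MRequestConfig` pattern first and fall back to the old `DTSGInitialData` one. Below is a minimal sketch of that idea in plain Python using only `re`. The helper name `find_dtsg_token` and the sample page fragments are hypothetical; they are not taken from the extractor or from a real Facebook response.

```python
import re

# Both patterns are copied from the diff above; trying the newer one
# first is an assumption, not something the extractor is known to do.
DTSG_PATTERNS = (
    r'"MRequestConfig"\s*,\s*\[\]\s*,\s*{\s*"dtsg"\s*:\s*{\s*"token"\s*:\s*"([^"]+)"',
    r'"DTSGInitialData"\s*,\s*\[\]\s*,\s*{\s*"token"\s*:\s*"([^"]+)"',
)


def find_dtsg_token(webpage, default=''):
    """Return the first dtsg token matched by any known pattern."""
    for pattern in DTSG_PATTERNS:
        match = re.search(pattern, webpage)
        if match:
            return match.group(1)
    return default


# Made-up fragments shaped to fit each pattern, for illustration only.
new_style = '["MRequestConfig",[],{"dtsg":{"token":"abc123"}}]'
old_style = '["DTSGInitialData",[],{"token":"xyz789"}]'

print(find_dtsg_token(new_style))
print(find_dtsg_token(old_style))
print(repr(find_dtsg_token('<html></html>')))
```

Inside the extractor itself, the same effect could probably be had by passing both patterns to `_search_regex` as a tuple, since that helper accepts a sequence of patterns as well as a single one.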
Appreciate your contribution and help with this!
Indeed, this fix works and does not seem to affect regular public videos like this one: https://www.facebook.com/watch/?v=538723623491744 I tested several cases of this type.
However, to make this more usable and friendly, I assume some error handling is needed for this "private videos" case. Even with this fix working, there's no way for the user to know that --cookies or credentials are needed. Without them, the user will get this log:
ERROR: Cannot parse data; please report this issue on https://github.com/blackjack4494/yt-dlc . Make sure you are using the latest version; type youtube-dlc -U to update. Be sure to call youtube-dlc with the --verbose flag and include its complete output.
Traceback (most recent call last):
File "./testdlc/youtube_dlc/YoutubeDL.py", line 830, in extract_info
ie_result = ie.extract(url)
File "./testdlc/youtube_dlc/extractor/common.py", line 532, in extract
ie_result = self._real_extract(url)
File "./testdlc/youtube_dlc/extractor/facebook.py", line 484, in _real_extract
video_id, fatal_if_no_video=True)
File "./testdlc/youtube_dlc/extractor/facebook.py", line 380, in _extract_from_url
raise ExtractorError('Cannot parse data')
ExtractorError: Cannot parse data; please report this issue on https://github.com/blackjack4494/yt-dlc . Make sure you are using the latest version; type youtube-dlc -U to update. Be sure to call youtube-dlc with the --verbose flag and include its complete output.
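For illustration, here is a minimal sketch of the kind of check being asked for: when nothing could be extracted and no cookies or credentials were supplied, raise a message that points at authentication instead of the generic "Cannot parse data". This is standalone Python and does not use youtube-dlc's own classes; `LoginRequiredError`, `check_video_data`, and the `authenticated` flag are hypothetical names, and how the extractor would actually know whether the request was authenticated is left open.

```python
class LoginRequiredError(Exception):
    """Hypothetical stand-in for a user-facing 'login needed' error."""


def check_video_data(video_data, authenticated):
    """Return video_data, or raise an error that tells the user what to do.

    video_data: whatever was parsed from the page (empty or None on failure).
    authenticated: True if cookies or credentials were supplied (assumed to
    be known by the caller).
    """
    if video_data:
        return video_data
    if not authenticated:
        # Most likely cause for private videos: the request was anonymous.
        raise LoginRequiredError(
            'No video data found; the video may be private. '
            'Try again with --cookies FILE or --username/--password.')
    # Authenticated but still nothing usable: keep the generic failure.
    raise ValueError('Cannot parse data')


try:
    check_video_data(None, authenticated=False)
except LoginRequiredError as err:
    print('ERROR:', err)
```

The point of the split is that only the anonymous case gets the login hint; an authenticated request that still fails keeps the generic error so real parsing bugs are still reported.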
I also tried to download a Facebook video and it didn't work 😕 But the fix proposed by @ssaqua works great, thanks so much for looking into it! ❤️
Hi @ssaqua, thanks for providing this. Does this fix still work? I downloaded the master branch, edited the one line, and then ran make as instructed. When trying a download without a cookie header, I also do not get the "please log in" error. Is there a step I missed when editing the file and running make, or is the fix no longer working?
Edit: I am able to download when providing the username and password after the fix. I must be doing something wrong with the cookie i assume.
👍
Yeah, this should still work as long as the request is properly authenticated. I use the --add-header 'Cookie: {copy-cookie-from-browser-request}' option instead of the --cookies FILE option.
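As a rough sketch of that invocation, the snippet below builds the argument list in Python and prints it in shell-quoted form. The helper name `build_command` is hypothetical, the cookie string is a placeholder rather than a real cookie, and the binary name `youtube-dlc` is assumed to be on the PATH. The command is only built and printed, not run.

```python
import shlex


def build_command(url, cookie_header, binary='youtube-dlc'):
    """Build an argument list that sends the browser's Cookie header."""
    # --add-header takes a single FIELD:VALUE argument.
    return [binary, '--add-header', 'Cookie: ' + cookie_header, url]


cmd = build_command(
    'https://www.facebook.com/1051184515/videos/10220691460290341/',
    'NAME=VALUE; OTHER=VALUE',  # placeholder; paste the real header value
)

# Show the command as it could be typed in a shell.
print(' '.join(shlex.quote(part) for part in cmd))
```

The list form can be handed to `subprocess.run(cmd)` directly, which avoids shell quoting problems with the semicolons and spaces in the cookie value.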
I missed this in my initial search, so I closed my issue #240 and referenced this one. It appears this issue with the regex is still present.
The regex fix didn't work for me. I tried with a cookies file and with headers; neither worked. Then I tried this workaround: https://github.com/ytdl-org/youtube-dl/issues/27062#issuecomment-729442898 "youtube-dl --force-generic-extractor [link] and worked just fine."
@rgime on what endpoints does that work? What do your URLs look like?
Is it something like this?
https://www.facebook.com/<user_ID>/videos/<video_ID>
Or like this?
https://www.facebook.com/groups/<group_ID>/permalink/<post_ID>/
Is it really private? Does it ask for authentication at all? What version are you using?