YoutubeExtractor.YoutubeParseException: Could not parse the Youtube page

Open Bazmoc opened this issue 5 years ago • 15 comments

Tested on 3 links; all had the same error.

Bazmoc avatar Nov 09 '19 11:11 Bazmoc

me too ...

mwindischbauer91 avatar Dec 07 '19 21:12 mwindischbauer91

Same issue:

    at System.Text.RegularExpressions.Match.Result(String replacement)
    at YoutubeExtractor.DownloadUrlResolver.GetHtml5PlayerVersion(JObject json)
    at YoutubeExtractor.DownloadUrlResolver.GetDownloadUrls(String videoUrl, Boolean decryptSignature)

tmontney avatar Dec 08 '19 23:12 tmontney

I found that the JSON doesn't contain the args url_encoded_fmt_stream_map and adaptive_fmts, so the functions GetStreamMap and GetAdaptiveStreamMap fail to provide the URLs.

Any idea how to solve this?

noammaoz avatar Jan 01 '20 17:01 noammaoz

I tried changing GetStreamMap from json["args"]["url_encoded_fmt_stream_map"]; to json["args"]["player_response"]; and then navigating to ["streamingData"]["formats"]. I did the same thing in GetAdaptiveStreamMap: json["args"]["player_response"]; -> ["streamingData"]["adaptiveFormats"]; The function ExtractDownloadUrls still fails because of the change in structure. Any idea how to solve this?
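The navigation described above could be sketched roughly as follows. This is only an illustration: it assumes player_response is a string that itself holds JSON (which is how the later replies in this thread treat it), and the class and method names are hypothetical, not part of YoutubeExtractor.

```csharp
using System;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of YoutubeExtractor.
public static class PlayerResponseSketch
{
    // Returns player_response.streamingData.<key> as an array,
    // or null if any level of the structure is missing.
    public static JArray GetFormats(JObject playerConfig, string key)
    {
        // Assumption: args.player_response is a string containing JSON,
        // so it needs a second parse before it can be navigated.
        string raw = playerConfig["args"]?["player_response"]?.ToString();
        if (string.IsNullOrEmpty(raw))
        {
            return null;
        }

        JObject playerResponse = JObject.Parse(raw);
        return playerResponse["streamingData"]?[key] as JArray;
    }
}
```

Calling GetFormats(json, "formats") and GetFormats(json, "adaptiveFormats") would then cover both stream maps, returning null instead of throwing when a key is absent.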

noammaoz avatar Jan 02 '20 14:01 noammaoz

Thanks a lot for the answer; it seems there is a lot of other information in the PlayerRelatedToken. This does not help YoutubeExtractor, though, because it is a different page structure. Do you know how I can fix YoutubeExtractor?

On Sat, Jan 4, 2020 at 11:28 PM thehighboy wrote:

I was thinking last night, and I wasn't sure if you knew, but you can get related and mix videos by extracting RELATED_PLAYER_ARGS like this:

    Dim PlayerRelated = New Regex("RELATED_PLAYER_ARGS':(.*),", RegexOptions.Multiline)
    Dim PlayerRelatedExtracted As String = PlayerRelated.Match(content).Result("$1")
    Dim PlayerRelatedToken = JToken.Parse(PlayerRelatedExtracted)

Have a look at the PlayerRelatedToken; there's a lot of good stuff in there ;)

noammaoz avatar Jan 05 '20 17:01 noammaoz

I was able to get around this by doing the following:

1- Replace ExtractDownloadUrls by the following:

        private static IEnumerable<ExtractionInfo> ExtractDownloadUrls(JObject json)
        {
            var info = new List<ExtractionInfo>();

            var formats = GetStreamMap(json);
            var adaptiveFormats = GetAdaptiveStreamMap(json);

            ExtractInfo(info, formats);
            ExtractInfo(info, adaptiveFormats);

            return info;
        }

2- Add the following helper function:

        private static void ExtractInfo(List<ExtractionInfo> info, JArray items)
        {
            if (items != null)
            {
                foreach (var item in items)
                {
                    bool requiresDecryption = false;
                    var url = item["url"]?.ToString();
                    // Skip entries without a plain "url" field; new Uri(null) would throw.
                    if (url == null) continue;
                    info.Add(new ExtractionInfo { RequiresDecryption = requiresDecryption, Uri = new Uri(url) });
                }
            }
        }

3- Replace GetAdaptiveStreamMap by the following:

        private static JArray GetAdaptiveStreamMap(JObject json)
        {
            JArray adaptiveFormat = null;
            JToken streamMap = json["args"]["player_response"];

            string streamMapString = streamMap == null ? null : streamMap.ToString();

            if (streamMapString != null)
            {
                dynamic playerResponse = JsonConvert.DeserializeObject(streamMapString);
                adaptiveFormat = playerResponse?.streamingData?.adaptiveFormats;
            }

            return adaptiveFormat;
        }

4- Replace GetStreamMap by the following:

        private static JArray GetStreamMap(JObject json)
        {
            JToken streamMap = json["args"]["player_response"];

            string streamMapString = streamMap == null ? null : streamMap.ToString();

            if (streamMapString == null || streamMapString.Contains("been+removed"))
            {
                throw new Exception("Video is removed or has an age restriction.");
            }

            dynamic playerResponse = JsonConvert.DeserializeObject(streamMapString);

            return playerResponse.streamingData?.formats;
        }

5- Replace GetHtml5PlayerVersion by the following:

        private static string GetHtml5PlayerVersion(JObject json)
        {
            // [-_] matches either separator; the dot before "js" is escaped so it only matches a literal dot.
            var regex = new Regex(@"player[-_](.+?)\.js");

            string js = json["assets"]["js"].ToString();

            return regex.Match(js).Result("$1");
        }
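The stack trace earlier in this thread ends in Match.Result, which throws when the match did not succeed. A more defensive variant might check Match.Success first, as in this minimal sketch. The class and method names are hypothetical, and the regex is just the one from step 5 with the dot escaped.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of YoutubeExtractor.
public static class PlayerVersionSketch
{
    private static readonly Regex PlayerRegex = new Regex(@"player[-_](.+?)\.js");

    // Returns the captured player version, or null when the URL does not match,
    // instead of letting Match.Result throw on a failed match.
    public static string TryGetPlayerVersion(string jsUrl)
    {
        if (string.IsNullOrEmpty(jsUrl))
        {
            return null;
        }

        Match match = PlayerRegex.Match(jsUrl);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

The caller can then decide whether a null version means "skip deciphering" or "report a parse error", rather than failing with an exception from inside the regex call.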

My fix will work only for unencrypted content. We still need to fix the isEncrypted attribute. Cheers

rizksobhi avatar Jan 05 '20 22:01 rizksobhi

Thanks a lot for the response and solution. The URL is not always in item["url"]. Instead of

    var url = item["url"]?.ToString();

it should be

    var url = item["url"]?.ToString();
    if (url == null) { url = item["cipher"]?.ToString(); }

In Decipherer.cs, DecipherWithVersion needs to change from

    string jsUrl = string.Format("http://s.ytimg.com/yts/jsbin/player{0}.js", cipherVersion);

to

    string jsUrl = string.Format("http://s.ytimg.com/yts/jsbin/player_{0}.js", cipherVersion);

I am still having problems downloading, and I think it is related to the encryption/signature. I'm getting a 403 error.
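One possible reason for the 403 is that the cipher value is not itself a URL. The sketch below assumes it is a URL-encoded query string carrying the real URL (url), the encrypted signature (s), and the name of the signature parameter (sp). That layout is an assumption about the page format at the time, the "signature" fallback is also an assumption, and all names here are hypothetical. The deciphering step itself is not shown.

```csharp
using System;
using System.Collections.Specialized;
using System.Web;

// Hypothetical helper, not part of YoutubeExtractor.
public static class CipherSketch
{
    // Splits a cipher string (assumed form: "s=...&sp=...&url=...") into its parts.
    // HttpUtility.ParseQueryString URL-decodes each value.
    public static (string Url, string Signature, string SignatureParam) ParseCipher(string cipher)
    {
        NameValueCollection query = HttpUtility.ParseQueryString(cipher);
        string url = query["url"];
        string signature = query["s"];
        // Assumption: fall back to "signature" when no "sp" value is present.
        string param = query["sp"] ?? "signature";
        return (url, signature, param);
    }

    // Appends an already deciphered signature to the stream URL.
    public static string BuildUrl(string url, string param, string decipheredSignature)
    {
        string separator = url.Contains("?") ? "&" : "?";
        return url + separator + param + "=" + Uri.EscapeDataString(decipheredSignature);
    }
}
```

With something like this, ExtractInfo could set RequiresDecryption to true for cipher entries and keep the parsed URL and signature separately, instead of passing the raw cipher string to new Uri.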

noammaoz avatar Jan 07 '20 09:01 noammaoz

What's the link that has the issue, or are you having problems with all links in general?

thehighboy avatar Jan 08 '20 13:01 thehighboy

for example: https://www.youtube.com/watch?v=DkeiKbqa02g

noammaoz avatar Jan 08 '20 13:01 noammaoz

I get it with no problem. [Screenshot: Annotation 2020-01-08 213157]

I will download this library and see if I can fix it for you.

thehighboy avatar Jan 09 '20 04:01 thehighboy

Which library are you using?

noammaoz avatar Jan 09 '20 06:01 noammaoz

I'm using my own code, but when I started my project I used the decipher from this one. I am working on this one as I type and should be able to upload it tomorrow.

thehighboy avatar Jan 09 '20 06:01 thehighboy

LinPolly/YoutubeExtractor

I fixed the structure change of Youtube. You can try it.

LinPolly avatar Jan 09 '20 06:01 LinPolly

I took a quick look at it, and it did download noammaoz's link. I see you never changed the title retrieval; does it ever return a title that way? Also, if you get the cipher once, you can reuse it for the lifetime of the app. Thank you. Now I will continue with my own app, as it needs love too ;)
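The reuse idea could look something like the sketch below. It is only an illustration: the class name and the loadOperations delegate are hypothetical stand-ins for whatever downloads and parses the player script, and keying the cache by player version is an assumption so that a new player script gets a fresh entry.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, not part of YoutubeExtractor.
public static class CipherCacheSketch
{
    // Maps a player version to its decipher operations string.
    private static readonly ConcurrentDictionary<string, string> Cache =
        new ConcurrentDictionary<string, string>();

    // loadOperations is a hypothetical delegate that fetches and parses the
    // player script; it is only invoked when the version is not cached yet.
    // Note: under concurrent first access GetOrAdd may run the delegate more
    // than once, but only one result is stored.
    public static string GetOperations(string playerVersion, Func<string, string> loadOperations)
    {
        return Cache.GetOrAdd(playerVersion, loadOperations);
    }
}
```

After the first call for a given version, later calls return the stored value without another download.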

thehighboy avatar Jan 09 '20 07:01 thehighboy

LinPolly/YoutubeExtractor

This library is working fine. Thanks a lot.

noammaoz avatar Jan 09 '20 10:01 noammaoz