node-ytdl-core

Not working!

Open leote2001 opened this issue 1 year ago • 29 comments

I get an error when I try to download a video.

leote2001 avatar Jul 26 '24 12:07 leote2001

the same here https://github.com/fent/node-ytdl-core/issues/1295

Dmytro-Tihunov avatar Jul 26 '24 17:07 Dmytro-Tihunov

FWIW A fix was found by this developer, although it would need porting: https://github.com/distubejs/ytdl-core

corwin-of-amber avatar Jul 26 '24 17:07 corwin-of-amber

@corwin-of-amber Have you managed to implement it? Do you mean IP rotation? For me it works only once from a single IP.

Dmytro-Tihunov avatar Jul 26 '24 17:07 Dmytro-Tihunov

Interesting, @Dmytro-Tihunov. I have not investigated the fix yet, but it looks like it involves some regex updates; I did not see anything about IPs mentioned in the commits. I was able to download multiple videos with this, although there seems to be some problem with the sound. https://github.com/distubejs/ytdl-core/commit/3df824e57fe4ce3037a91efd124b729dea38c01f

corwin-of-amber avatar Jul 26 '24 18:07 corwin-of-amber

Ok, the problem is definitely the nTransform function: https://github.com/fent/node-ytdl-core/blob/9e15c7381f1eba188aba8b536097264db6ad3f7e/lib/sig.js#L57

which can be extracted with this regexp: https://github.com/distubejs/ytdl-core/blob/7f7db1062069f13063cf0ee5d652ed33b42e28cb/lib/sig.js#L56

N_TRANSFORM_REGEXP = 'function\\(\\s*(\\w+)\\s*\\)\\s*\\{' +
  'var\\s*(\\w+)=(?:\\1\\.split\\(""\\)|String\\.prototype\\.split\\.call\\(\\1,""\\)),' +
  '\\s*(\\w+)=(\\[.*?]);\\s*\\3\\[\\d+]' +
  '(.*?try)(\\{.*?})catch\\(\\s*(\\w+)\\s*\\)\\s*\\' +
  '{\\s*return"enhanced_except_([A-z0-9-]+)"\\s*\\+\\s*\\1\\s*}' +
  '\\s*return\\s*(\\2\\.join\\(""\\)|Array\\.prototype\\.join\\.call\\(\\2,""\\))};';

corwin-of-amber avatar Jul 26 '24 19:07 corwin-of-amber

I was able to get the correct sig by replacing the function extractNCode above with:

  const extractNCode = () => {
    const N_TRANSFORM_REGEXP = 'function\\(\\s*(\\w+)\\s*\\)\\s*\\{' +
      'var\\s*(\\w+)=(?:\\1\\.split\\(""\\)|String\\.prototype\\.split\\.call\\(\\1,""\\)),' +
      '\\s*(\\w+)=(\\[.*?]);\\s*\\3\\[\\d+]' +
      '(.*?try)(\\{.*?})catch\\(\\s*(\\w+)\\s*\\)\\s*\\' +
      '{\\s*return"enhanced_except_([A-z0-9-]+)"\\s*\\+\\s*\\1\\s*}' +
      '\\s*return\\s*(\\2\\.join\\(""\\)|Array\\.prototype\\.join\\.call\\(\\2,""\\))};';

    let mo = body.match(new RegExp(N_TRANSFORM_REGEXP, 's'));
    if (mo) {
      let fnbody = mo[0];
      functions.push('var nxx=' + fnbody + 'nxx(ncode);');
    }
  };

This is a crude patch, though, and it is not idiomatic to this library. We should think of something cleaner.
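A minimal sketch of how the regexp-based patch can be tried outside the library. The `body` string here is a synthetic stand-in written to fit the pattern, not real player code, and `new Function` is used only so the captured source can be run standalone; the names `nfn` and `runN` are made up for illustration.

```javascript
// Regexp copied from the patch above.
const N_TRANSFORM_REGEXP = 'function\\(\\s*(\\w+)\\s*\\)\\s*\\{' +
  'var\\s*(\\w+)=(?:\\1\\.split\\(""\\)|String\\.prototype\\.split\\.call\\(\\1,""\\)),' +
  '\\s*(\\w+)=(\\[.*?]);\\s*\\3\\[\\d+]' +
  '(.*?try)(\\{.*?})catch\\(\\s*(\\w+)\\s*\\)\\s*\\' +
  '{\\s*return"enhanced_except_([A-z0-9-]+)"\\s*\\+\\s*\\1\\s*}' +
  '\\s*return\\s*(\\2\\.join\\(""\\)|Array\\.prototype\\.join\\.call\\(\\2,""\\))};';

// Synthetic stand-in for the player script (assumption: real code is far larger).
const body =
  'var other=1;var nfn=function(a){var b=a.split(""),c=[function(d){d.reverse()}];' +
  'c[0](b);try{c.length}catch(d){return"enhanced_except_test"+a}return b.join("")};var after=2;';

const mo = body.match(new RegExp(N_TRANSFORM_REGEXP, 's'));

// Same wrapping as the patch, plus a "return" so the result can be read back here.
const runN = mo && new Function('ncode', 'var nxx=' + mo[0] + 'return nxx(ncode);');

console.log(runN ? runN('abc') : 'no match');
```

The sample function only reverses its input, so a successful extraction is easy to recognize by eye.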

corwin-of-amber avatar Jul 26 '24 19:07 corwin-of-amber

Better patch (although I am not sure how robust): replace https://github.com/fent/node-ytdl-core/blob/9e15c7381f1eba188aba8b536097264db6ad3f7e/lib/sig.js#L58 with

    let functionName = utils.between(body, 'c=a.get(b))&&(c=', '(c)');
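For readers unfamiliar with `utils.between`, here is a minimal sketch of what that one-line replacement does. The `between` helper below copies only the string-marker path of the helper quoted later in this thread, and `sampleBody` with the name `Xyz` is a made-up fragment, not real player code.

```javascript
// String-only version of utils.between: returns the text between two markers, or ''.
const between = (haystack, left, right) => {
  let pos = haystack.indexOf(left);
  if (pos === -1) return '';
  haystack = haystack.slice(pos + left.length);
  pos = haystack.indexOf(right);
  if (pos === -1) return '';
  return haystack.slice(0, pos);
};

// Synthetic fragment shaped like the call site the patch looks for (assumption).
const sampleBody = 'a.D&&(b="n",c=a.get(b))&&(c=Xyz(c),a.set(b,c))';

const functionName = between(sampleBody, 'c=a.get(b))&&(c=', '(c)');
console.log(functionName);
```

If either marker is missing, the helper returns an empty string, which is why this kind of patch stops working without any error as soon as the surrounding code changes.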

corwin-of-amber avatar Jul 26 '24 19:07 corwin-of-amber

Sadly the above patches seem to fix some low quality formats but not others. On an example video (v=1ec4gu5uJ6U) I was able to load the 360p mp4 and the 144p mp4, but all others returned a 403.

benkaiser avatar Jul 28 '24 23:07 benkaiser

Better patch (although I am not sure how robust): replace

https://github.com/fent/node-ytdl-core/blob/9e15c7381f1eba188aba8b536097264db6ad3f7e/lib/sig.js#L58

with

    let functionName = utils.between(body, 'c=a.get(b))&&(c=', '(c)');

This one works for me, thanks. The last time these 403 errors were thrown, I switched to the distube fork of ytdl (https://github.com/distubejs/ytdl-core), which worked then but now throws 403 errors as well. Switching back to ytdl-core with this functionName fix works.

AnneAlbert-wt avatar Jul 29 '24 10:07 AnneAlbert-wt

It is not working again; YouTube has updated their algorithm again. @corwin-of-amber Downloading YouTube videos and audio suddenly stopped working.

hextor1 avatar Aug 01 '24 04:08 hextor1

The n code extraction is one issue. The 403 on GET method requests which affects videos longer than 1 minute is another issue. GET requests still work on 360p default format streams (with audio included) but not on adaptive formats. The challenge seems to be to convert GET requests to POST requests, otherwise the best quality you will get is 360p.

gatecrasher777 avatar Aug 02 '24 10:08 gatecrasher777

The n code extraction is one issue. The 403 on GET method requests which affects videos longer than 1 minute is another issue. GET requests still work on 360p default format streams (with audio included) but not on adaptive formats. The challenge seems to be to convert GET requests to POST requests, otherwise, the best quality you will get is 360p.

So what should we do now? Is there any solution?

hextor1 avatar Aug 02 '24 10:08 hextor1

So what should we do now? Is there any solution?

There will always be a solution. The solution here will be to emulate what the YouTube site code does in the browser to fetch the streams. But that could take a while to figure out and code. The POST requests also seem to include some encrypted payload, so it is a tricky exercise.

gatecrasher777 avatar Aug 02 '24 11:08 gatecrasher777

Using @distube/ytdl-core (v4.13.7) works for us on some devices but not all. Someone suggested that Google is doing some A/B testing.

AnneAlbert-wt avatar Aug 02 '24 21:08 AnneAlbert-wt

Regarding the nTransform function: Using the Wayback Machine (https://web.archive.org/) I was able to look at previous versions of the player. The assignment syntax varies, but in the 6 versions I observed at least two things remain constant:

  • The function containing the invocation has some recognizable string literals, notably "index.m3u8".
  • The call is followed by a .set(. E.g., c=BDa[0](c),a.set(b,c). The variable names may change, and b is usually just "n", but the .set( remains.

Given this, I propose using a regex to capture this convention, in the hopes that it will give us some breathing space for the near future. WDYT?

corwin-of-amber avatar Aug 03 '24 14:08 corwin-of-amber

So what's the solution? @corwin-of-amber

hextor1 avatar Aug 03 '24 14:08 hextor1

Currently, it looks like this:

    let mo = body.match(/index\.m3u8".*=(.*?)[.]set\(/);
    let functionName = mo && mo[1].split('(')[0];

corwin-of-amber avatar Aug 03 '24 14:08 corwin-of-amber

Hello, can you tell me which line I need to replace, and the name of the file where I add this? Please share the full code here with this change included. @corwin-of-amber

hextor1 avatar Aug 03 '24 15:08 hextor1

Same as here @hextor1 https://github.com/fent/node-ytdl-core/issues/1305#issuecomment-2253373635

I.e. this is the line that needs to be replaced with the two lines above: https://github.com/fent/node-ytdl-core/blob/9e15c7381f1eba188aba8b536097264db6ad3f7e/lib/sig.js#L58 ↓

    let mo = body.match(/index\.m3u8".*=(.*?)[.]set\(/);
    let functionName = mo && mo[1].split('(')[0];

I want to see that it keeps working at least for a few days before suggesting this is a patch.

corwin-of-amber avatar Aug 03 '24 15:08 corwin-of-amber

Hello, do I need to add this line here: let functionName = mo && mo[1].split('(')[0];

Do I need to replace line 58? @corwin-of-amber

hextor1 avatar Aug 03 '24 15:08 hextor1

Here is my previous code. Please tell me where to put the replacement:

    let mo = body.match(/index\.m3u8".*=(.*?)[.]set\(/);
    let functionName = mo && mo[1].split('(')[0];

Old code:

    const extractNCode = () => {
      let functionName = utils.between(body, 'b=a.j.n||null)&&(b=', '(b)');
      if (functionName.includes('[')) functionName = utils.between(body, `var ${functionName.split('[')[0]}=[`, `]`);
      if (functionName && functionName.length) {
        const functionStart = `${functionName}=function(a)`;
        const ndx = body.indexOf(functionStart);
        if (ndx >= 0) {
          const subBody = body.slice(ndx + functionStart.length);
          const functionBody = `var ${functionStart}${utils.cutAfterJS(subBody)};${functionName}(ncode);`;
          functions.push(functionBody);
        }
      }
    };

hextor1 avatar Aug 03 '24 16:08 hextor1

In #1301 I proposed the following, which avoids the problem of the nCode function name being obfuscated in ever-changing layers of difficulty: it identifies the nCode block directly and determines the function name from it.

 const extractNCode = () => {
    const alphanum = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTVUWXYZ.$_0123456789';
    let functionName = '';
    let clue = body.indexOf('enhanced_except');
    if (clue < 0) clue = body.indexOf('String.prototype.split.call(a,"")');
    if (clue < 0) clue = body.indexOf('Array.prototype.join.call(b,"")');
    if (clue > 0) {
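        let nstart = body.lastIndexOf('=function(a){', clue) - 1;
        // Stop at index 0: charAt() returns '' for a negative index, and includes('') is always true.
        while (nstart >= 0 && alphanum.includes(body.charAt(nstart))) {
            functionName = body.charAt(nstart) + functionName;
            nstart--;
        }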
    }
    if (functionName && functionName.length) {
      const functionStart = `${functionName}=function(a)`;
      const ndx = body.indexOf(functionStart);
      if (ndx >= 0) {
        const subBody = body.slice(ndx + functionStart.length);
        const functionBody = `var ${functionStart}${utils.cutAfterJS(subBody)};${functionName}(ncode);`;
        functions.push(functionBody);
      }
    }
  };

To reiterate: the 403 errors have nothing to do with the n transformation.
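The name-finding part of the "clue" approach above can be tried in isolation. This is a minimal sketch, assuming a synthetic `sampleBody` written to contain the clue string; `findNFunctionName` is a hypothetical wrapper name, and the sample is not real player code.

```javascript
// Name-finding part of the "clue" approach, isolated so it runs without ytdl-core.
const findNFunctionName = body => {
  const nameChars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.$_0123456789';
  let clue = body.indexOf('enhanced_except');
  if (clue < 0) clue = body.indexOf('String.prototype.split.call(a,"")');
  if (clue < 0) clue = body.indexOf('Array.prototype.join.call(b,"")');
  if (clue < 0) return '';
  // Find the nearest preceding "=function(a){" and walk backwards to collect the identifier.
  const assign = body.lastIndexOf('=function(a){', clue);
  if (assign < 0) return '';
  let name = '';
  for (let i = assign - 1; i >= 0 && nameChars.includes(body.charAt(i)); i--) {
    name = body.charAt(i) + name;
  }
  return name;
};

// Synthetic stand-in for the player script (assumption).
const sampleBody =
  'var zDa=[Ema];' +
  'Ema=function(a){var b=a.split(""),c=[0];c[0];' +
  'try{}catch(d){return"enhanced_except_x"+a}return b.join("")};';

console.log(findNFunctionName(sampleBody));
```

Because the search starts from the function body rather than from a call site, it does not depend on how the call to the function is written.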

gatecrasher777 avatar Aug 03 '24 17:08 gatecrasher777

This one is a slightly better patch than the previous one @corwin-of-amber @gatecrasher777:

    const extractNCode = () => {
      let functionName = utils.between(body, 'b=a.j.n||null)&&(b=', '(b)');
      if (functionName.includes('[')) functionName = utils.between(body, `var ${functionName.split('[')[0]}=[`, `]`);
      if (functionName && functionName.length) {
        const functionStart = `${functionName}=function(a)`;
        const ndx = body.indexOf(functionStart);
        if (ndx >= 0) {
          const subBody = body.slice(ndx + functionStart.length);
          const functionBody = `var ${functionStart}${utils.cutAfterJS(subBody)};${functionName}(ncode);`;
          functions.push(functionBody);
        }
      }
    };

hextor1 avatar Aug 03 '24 18:08 hextor1

This one is a slightly better patch than the previous one

Sure, it will give the right result for now, but it is still trying to find the function name from a moving target.

gatecrasher777 avatar Aug 03 '24 18:08 gatecrasher777

You are right. Sometimes it finds the function only after I refresh the page, and then it works. I hope someone finds a better and more accurate solution in the future. @gatecrasher777

hextor1 avatar Aug 03 '24 18:08 hextor1

I agree with @gatecrasher777, my last patch is still heuristic, but I have tried it against the last 6 versions of player_ias (dates 05-28, 06-04, 06-05, 07-05, 07-15, 08-03) and it works consistently on all of them. The "clue" approach that is based on knowing a piece of the function code is also something that I considered. It is hard to say which is more robust/less brittle. Perhaps we need to try both and collect statistics?

Also, I would like to state that failing to find the n-transform function does indeed result in a 403 error; although, there may be other 403 errors that occur even with the right n-transform (esp. with high-bitrate formats).
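The "try both and collect statistics" idea could be sketched roughly as follows. All sample bodies are synthetic and built by the hypothetical helper `nDef`, so the resulting counts say nothing about real player versions; the point is only the shape of the harness, which could be fed archived copies of the player script instead.

```javascript
// String-only "between" helper, as in utils.between.
const between = (s, left, right) => {
  let pos = s.indexOf(left);
  if (pos === -1) return '';
  s = s.slice(pos + left.length);
  pos = s.indexOf(right);
  return pos === -1 ? '' : s.slice(0, pos);
};

// Extractor 1: the ".set(" convention, resolving one level of array indirection.
const bySetConvention = body => {
  const mo = body.match(/index\.m3u8".*=(.*?)[.]set\(/);
  let name = mo ? mo[1].split('(')[0] : '';
  if (name.includes('[')) name = between(body, `var ${name.split('[')[0]}=[`, ']');
  return name;
};

// Extractor 2: the "clue" approach (only the first clue string, for brevity).
const byClue = body => {
  const nameChars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.$_0123456789';
  const clue = body.indexOf('enhanced_except');
  const assign = clue < 0 ? -1 : body.lastIndexOf('=function(a){', clue);
  let name = '';
  for (let i = assign - 1; i >= 0 && nameChars.includes(body.charAt(i)); i--) {
    name = body.charAt(i) + name;
  }
  return name;
};

// Hypothetical helper that builds a fake n-transform definition.
const nDef = (name, param) =>
  `${name}=function(${param}){var b=${param}.split("");try{}catch(d){return"enhanced_except_x"+${param}}return b.join("")};`;

// Synthetic samples with the expected answer attached.
const samples = [
  { expected: 'Ema', body: 'var zDa=[Ema];a.j.file==="index.m3u8"&&(c=zDa[0](c),a.set(b,c))};' + nDef('Ema', 'a') },
  { expected: 'Fn1', body: 'a.j.file==="index.m3u8"&&(c=Fn1(c),a.set(b,c))};' + nDef('Fn1', 'a') },
  { expected: 'Gq', body: 'x=1;' + nDef('Gq', 'a') },   // no "index.m3u8" marker
  { expected: 'Hn', body: 'a.j.file==="index.m3u8"&&(c=Hn(c),a.set(b,c))};' + nDef('Hn', 'x') }, // parameter is not "a"
];

const tally = {};
for (const [label, extract] of Object.entries({ bySetConvention, byClue })) {
  tally[label] = samples.filter(s => extract(s.body) === s.expected).length;
}
console.log(tally, 'out of', samples.length);
```

Each extractor misses one sample here by construction, which shows how the two heuristics can fail on different inputs.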

corwin-of-amber avatar Aug 03 '24 20:08 corwin-of-amber

@corwin-of-amber Another breaking change occurred today. Your code in jsfiddle...

let utils = {
  between: (haystack, left, right) => {
    let pos;
    if (left instanceof RegExp) {
      const match = haystack.match(left);
      if (!match) { return ''; }
      pos = match.index + match[0].length;
    } else {
      pos = haystack.indexOf(left);
      if (pos === -1) { return ''; }
      pos += left.length;
    }
    haystack = haystack.slice(pos);
    pos = haystack.indexOf(right);
    if (pos === -1) { return ''; }
    haystack = haystack.slice(0, pos);
    return haystack;
  }
}
let body = `
var zDa=[Ema];
a.j.file==="index.m3u8"&&(delete a.j.file,a.path+="/file/index.m3u8");a.B="";a.url="";a.D&&(b="nn"[+a.D],vL(a),c=a.j[b]||null)&&(c=zDa[0](c),a.set(b,c),zDa.length||Ema(""))}};
`;
let mo = body.match(/index\.m3u8".*=(.*?)[.]set\(/);
let functionName = mo && mo[1].split('(')[0];
if (functionName && functionName.includes('[')) functionName = utils.between(body, `var ${functionName.split('[')[0]}=[`, `]`);
console.log(functionName);

Outputs Ema which is correct. The clue method also worked btw.

gatecrasher777 avatar Aug 07 '24 14:08 gatecrasher777

Where should this be placed? Please share the file location and line number.

hextor1 avatar Aug 07 '24 16:08 hextor1

Where should this be placed? Please share the file location and line number.

There is no file. It is just some shorthand code to test @corwin-of-amber's functionName extraction method. You can paste it into the JavaScript box on https://jsfiddle.net/ and run it.

gatecrasher777 avatar Aug 07 '24 18:08 gatecrasher777