oxc Untracked `arguments` in function scopes.

Related to this: https://github.com/oxc-project/oxc/issues/3374#issuecomment-2223483123

We do not keep track the arguments object in functions, It causes false positives in linters as of now and it can impose more serious limitations or errors in other crates(transformers and minifier come to mind as we might shadow the function arguments by mistake).

Jul 12 '24 15:07 rzvxa

Since you are heavily invested in the scopes right now, @Dunqing What do you think?

Jul 12 '24 15:07 rzvxa

Funny, I was wondering how we handle arguments just this morning. Sounds like we don't.

I can see 3 options:

Add arguments binding to all function scopes eagerly.
Add arguments binding to function scopes lazily, only once a reference to it is found in function body (or params).
Special case arguments so IdentifierReference for it would have no SymbolId.

Personally, I would tend towards (2) - simple, and avoids perf hit of (1).

NB: arguments is almost like an invisible function param, but there is one oddity in sloppy mode:

function f1(x) {
  // SyntaxError: Identifier 'x' has already been declared
  let x;
}

function f2(arguments) {
  // SyntaxError: Identifier 'arguments' has already been declared
  let arguments;
}

function f3() {
  // No problem
  let arguments;
}

Also, you can have 2 bindings for arguments in a function:

function f(x, y = arguments) {
    let arguments = 456;
    console.log(y[0]); // Logs 123
    console.log(arguments); // Logs 456
}
f(123);

So we'd need an extra scope for function body.

Jul 12 '24 16:07 overlookmotel

I think the option 3 isn't a really good approach here. I'm torn between 1 and 2.

The latter seems like a good thing for performance but I'm afraid it would cause issues later on because javascript is whimsical at times:

function f() {
    eval("console.log(arguments)");
}

f("a", "b", "c")

With the second approach, we still have the same issue of shadowing arguments and generating the wrong code!

Jul 12 '24 16:07 rzvxa

He had to go and bring eval() into it... 😄

But even then, I can't actually see how that can go wrong. eval() will (I assume) prevent mangling names of all bindings which it can "see". And we can just ban creating new bindings called arguments in the transformer (that's illegal in strict mode anyway).

Jul 12 '24 16:07 overlookmotel

https://github.com/eslint/eslint-scope/blob/8082caa1a0f9aae0894a0a01fef9efa7a5e509f6/lib/scope.js#L671

Jul 12 '24 16:07 Boshen

He had to go and bring eval() into it... 😄

😆

But even then, I can't actually see how that can go wrong. eval() will (I assume) prevent mangling names of all bindings which it can "see". And we can just ban creating new bindings called arguments in the transformer (that's illegal in strict mode anyway).

You are right that we shouldn't ever mangle arguments if it isn't user-defined. I'm just saying since arguments are a context-sensitive identifier and javascript allows for weird ways to access variables through non-conventional ways it might cause issues further down the road.

Maybe no sane person uses new Function or eval in the production code, But javascript frameworks might generate such code and expect us to bundle it correctly for them.

I might be overthinking this.

Jul 12 '24 17:07 rzvxa

We should add an argument symbol to the function scope. Both TypeScript and typescript-eslint do the same thing

Jul 13 '24 01:07 Dunqing

What do you think of this example?

// Sloppy mode
function f(x, y = arguments) {
    let arguments = 456;
    console.log(y[0]); // Logs 123
    console.log(arguments); // Logs 456
}
f(123);

This demonstrates there can be 2 distinct bindings for arguments in a function. arguments in y = arguments references a different binding from let arguments = 456.

Jul 13 '24 19:07 overlookmotel

I was reading the eslint-scope tests today when saw an oddity I couldn't figure.

It contains arguments in the global scope. Your comment made me go and look it up, Turns out they always fake arguments in all non-function scopes. Here's my educated guess about how it is done there:

Step 1: It visits each node and starts adding their refs and IDs to the scope manager. After doing the same for the inner nodes it would then call close on the referencer.
Step 2: Close the scope(s) for the visited node.
Step 3: Try to resolve the identifier
Step 4: Do not resolve default parameters.
Step 5: If not resolved delegate it to the upper scope for resolution.
Step 6: Continue to hit the global scope.
Step 7: Resolve implicit variable declarations
Step 8: Call the vanilla super.__close at the end of global scope closing.
Step 9: Everything is valid at this stage so something like arguments would be registered as implicit/builtin/global/dangling identifier.

With these steps, your code would work similarly to this:

let arguments = [123];
function f(x, y = arguments) {
    let arguments = 456;
    console.log(y[0]); // Logs 123
    console.log(arguments); // Logs 456
}
f(123);

Jul 14 '24 01:07 rzvxa

Treating arguments as external to the function just seems weird. And it can't handle the case where there's another arguments binding in scope above e.g.:

// Sloppy mode
let arguments;
function f(x, y = arguments) {
    let arguments = 456;
}

Here we have 3 bindings for arguments, and only 2 scopes to put them in. I know this is an edge case, and will be very rare in practice, but Oxc aims for absolute correctness.

Only solutions I can see are:

Always add an extra scope for function body.
Only add an extra scope for function body if there's a reference to arguments in function params.
Treat implicit arguments as a special case and don't create a binding for it.

(1) is simpler and more conceptually correct, but imposes quite a large penalty. Every function generates 2 scopes, and functions are a common construct, so it will slow down symbol resolution significantly. (3) has been rejected. So I would lean towards (2).

Jul 14 '24 10:07 overlookmotel

I personally would prefer to find a way to avoid unnecessary scopes, Can't we declare arguments as a hoisted variable on the function scope? with that, we should be able to redeclare it in the function body.

But that's just my preference and I have nothing against your approach here, It just feels wrong to add scopes left and right as a means to simplify stuff. As you said correctness is more important. I'm not a huge fan of premature optimization.

I just have a quick question. We are wrapping the whole function in the scope right now. Is this going to change? Is the scope going to be entered after the parameters?

Jul 14 '24 12:07 rzvxa

To clarify, what I'm saying about option (2) above is:

By default, only create 1 scope for the function (as now).
Add binding for arguments to that scope.
Only if arguments is referenced in the function's params, create a further scope for function body. This will be very rare.

i.e. Handle the weird uncommon edge case as a de-opt. No overhead from extra scope in common case, but maintain complete correctness.

Jul 14 '24 13:07 overlookmotel

To clarify, what I'm saying about option (2) above is:
* By default, only create 1 scope for the function (as now).

* Add binding for `arguments` to that scope.

* _Only if_ `arguments` is referenced in the function's params, create a further scope for function body. This will be very rare.
i.e. Handle the weird uncommon edge case as a de-opt. No overhead from extra scope in common case, but maintain complete correctness.

Actually, it's not as bad as I initially thought. I'm just afraid of something like this.

When you are accessing the Ancestors of a scope you have to check if that scope is a legit or a fake one to prevent issues in places when we want to get the "actual" parent with something like scope.ancestors(xxx).nth(1) or scope.get_parent_id(xxx).

It also means that there are now some really obscure edge cases with arguments that won't get tested, It is not just about parent scope, With this approach, you'd have some identifiers that are pointing to the parent of a function's body but aren't the actual scope of the enclosing node. I feel it would cause a lot of confusion and heisenbugs for both us(especially contributions that don't get reviews by people who are aware of this) and our crate users.

What are your thoughts on this? Even if we do all of these checks doesn't it mean we won't get much performance out of this in the long run(as we start to use scope more and more)?

Edit:

I've checked the spec and even there, They only create a new lexical environment if strict is false to prevent eval from creating a new binding. Otherwise, they would always use the lexical environment of the function's callee. I know these aren't apple to apple comparisons as environments are a runtime concept and aren't 1:1 with scopes. But it might help.

But maybe I'm missing something here.

https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-functiondeclarationinstantiation https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-arguments-exotic-objects

Jul 14 '24 17:07 rzvxa

I found out how TypeScript handles arguments. TypeScript adds an arguments symbol but does not add it in scopes. https://github.com/microsoft/TypeScript/blob/d8086f14b6b97c0df34a0cc2f56d4b5926a0c299/src/compiler/utilities.ts#L11361-L11380

Jul 19 '24 06:07 Dunqing

I found out how TypeScript handles arguments. TypeScript adds an arguments symbol but does not add it in scopes. https://github.com/microsoft/TypeScript/blob/d8086f14b6b97c0df34a0cc2f56d4b5926a0c299/src/compiler/utilities.ts#L11361-L11380

Nice find. I also see tsc defines

    globals: SymbolTable;
    argumentsSymbol: Symbol;
    requireSymbol: Symbol;

Jul 19 '24 10:07 Boshen

TypeScript adds an arguments symbol but does not add it in scopes.

TS may do it, but it seems like a bad idea to me. Rather confusing, no?

Jul 19 '24 11:07 overlookmotel

For the record, I added this to the mangler

https://github.com/oxc-project/oxc/blob/0a2f68756f38f8ef003425e856061502b0c97a0a/crates/oxc_mangler/src/lib.rs#L229-L231

Sep 23 '24 11:09 Boshen

arguments can be renamed if it's a local const, let, or function declaration declared in function scope.

// Sloppy mode
function outer() {
    let arguments = 123;
    console.log(arguments);
}

function outer2() {
    function arguments() {}
    console.log(arguments);
}

But not if it's a var, or hoisted function declaration:

// Sloppy mode
function outer() {
    console.log(arguments); // Logs [1, 2, 3]
    var arguments = 456;
    console.log(arguments); // Logs 456
}
outer(1, 2, 3);

function outer2() {
    console.log(arguments); // Logs [1, 2, 3]
    {
        function arguments() {}
    }
    console.log(typeof arguments); // Logs 'function'
}
outer2(1, 2, 3);

Sep 23 '24 13:09 overlookmotel

Personally I'm in favour of adding scopes for function bodies anyway for other reasons (#5017), so that'd resolve some of the questions above.

Sep 23 '24 13:09 overlookmotel

Any progress on this? no-undef rule still report an error in my project.

Sep 14 '25 18:09 EurFelux

Any progress on this? no-undef rule still report an error in my project.

Please file an issue. This is just an improvement, not a blocker for any rules. cc @camc314

EDIT: Fixed in https://github.com/oxc-project/oxc/pull/13763

Sep 15 '25 01:09 Dunqing

Just to note that at some point in next few months, we'll be utilizing Semantic on JS side for linter JS plugins (see https://github.com/oxc-project/oxc/issues/9905#issuecomment-3289389212).

In order to do that, we'll need to make some changes to Scoping to make it compatible with ESLint.

ESLint's API includes scope.isArgumentsMaterialized() which returns whether a function's arguments binding is used.

To support that API, we'll need to address this issue somehow.

Sep 16 '25 12:09 overlookmotel