wren icon indicating copy to clipboard operation
wren copied to clipboard

Scope resolution quirk

Open ruby0x1 opened this issue 4 years ago • 28 comments

Working:

class Test {
  Test { "test" }
  construct new() {
    System.print(Test)
  }
}

Test.new()

Not working: [compile line 4] Error at 'TEST': Variable is used but not defined.

class Test {
  TEST { "test" }
  construct new() {
    System.print(TEST)
  }
}

Test.new()

ruby0x1 avatar Feb 15 '21 18:02 ruby0x1

This works well.

In example 1, you refer to the class Test and not the getter Test. In example 2, there is no global variable named TEST, so this is an error.

According to the docs:

If the name starts with a lowercase letter, it is treated as a method on this.

Else (starts with a capital letter), it is treated as a global variable.

ChayimFriedman2 avatar Feb 15 '21 18:02 ChayimFriedman2

Yes I'm aware of the scope resolution rules. This is still a confusing case.

ruby0x1 avatar Feb 15 '21 18:02 ruby0x1

So what do you want this to do? Break the rules?

Edit: The only confusing part IMO is that printing a class prints its name. Maybe we should print something like <class Name>.

ChayimFriedman2 avatar Feb 15 '21 18:02 ChayimFriedman2

Well if we break/relax the rules, we could have top level non capitalized values/classes which would be nice...

mhermier avatar Feb 15 '21 18:02 mhermier

But how can we break the rules without dynamic search, which will hardly hurt Wren's performance?

ChayimFriedman2 avatar Feb 15 '21 18:02 ChayimFriedman2

The goal would be better error messages.

ruby0x1 avatar Feb 15 '21 18:02 ruby0x1

So let's start from the end: what do you think is the best error message here?

Edit: Problem is, compile can't determine if a name was declared as a method. If it could we wouldn't have these rules. It can see that a name was declared in this class (which can improve errors), but because of the single pass nature of our compiler, it can do so only for methods declared before. And I think it would be pretty confusing if you'll get a better error message but only if you declared your members in the right order.

Edit 2: Don't get me wrong: I think that improving error messages is the most important work for any compiler. But unfortunately it's almost impossible for a single pass compiler.

Unrelated edit 3: It is an important goal for Wrenalyzer 😃.

ChayimFriedman2 avatar Feb 15 '21 18:02 ChayimFriedman2

If we need to go that route, it means banning/warn on capitalized method names, not the best route. Solving properly this, will probably break a lot of code requiring usage of "this." in lots of spaces.

mhermier avatar Feb 15 '21 18:02 mhermier

Well you can of course always qualify the method name with this (or with the class name if it had been static) to make the second example work:

class Test {
  TEST { "test" }
  construct new() {
    System.print(this.TEST)
  }
}

Test.new()

As far as error messages are concerned a small improvement you could make would be to tack on the words 'or visible' to the present one. So the error message for the second example would now read:

[compile line 4] Error at 'TEST': Variable is used but not defined or visible.

This might help folks who were confused by the present error message when the method definition was staring them in the face that it wasn't actually visible in the relevant scope because of the scope resolution rules.

PureFox48 avatar Feb 16 '21 09:02 PureFox48

One solution is to make an extra regular name lookup to see if there are candidates, when looking up a capitalized identifier at expression level, and issue an error/warning accordingly?

mhermier avatar Feb 16 '21 09:02 mhermier

One solution is to make an extra regular name lookup to see if there are candidates, when looking up a capitalized identifier at expression level, and issue an error/warning accordingly?

See:

Problem is, compile can't determine if a name was declared as a method. If it could we wouldn't have these rules. It can see that a name was declared in this class (which can improve errors), but because of the single pass nature of our compiler, it can do so only for methods declared before. And I think it would be pretty confusing if you'll get a better error message but only if you declared your members in the right order.

ChayimFriedman2 avatar Feb 16 '21 09:02 ChayimFriedman2

I know that, this would be a best effort solution, not a complete one. To have a complete one, the class compiler would need to remember and resolve this when the class is fully defined. This could be done, but should always be resolved by a compilation error, because of single pass.

mhermier avatar Feb 16 '21 09:02 mhermier

But how expensive is the extra name lookup going to be and can that expense be justified just to get a better error message?

PureFox48 avatar Feb 16 '21 09:02 PureFox48

I would say it would have a complexity of O(n^2) but realistically it should be more near of O(n), since it is traversing every possible method to find a an unresolved symbol which should be quite low.

That said we are defeated by the fact de can only look up current class definition, and can't introspect the parent class. This solution is better but still best effort.

mhermier avatar Feb 16 '21 10:02 mhermier

Tht capitalization trick can be nice, but in practice it is a huge pain... I'll probably do a deviation on the syntax (at least for my personal use) to remove it and try to see where it leads me to...

mhermier avatar Feb 16 '21 10:02 mhermier

Personally, I don't find capitalization much of an issue in practice. I just try to follow two simple rules:

  1. Avoid using capitalized method names if I can.

  2. When defining a global use a capitalized name if I know or think it might be used from within a class.

Often the globals are just other classes (no problem) or constants. If you follow the C convention of using SCREAMING_SNAKE_CASE for such constants then again that's no problem at all.

PureFox48 avatar Feb 16 '21 10:02 PureFox48

Problems starts to really shows when you are trying to do something like what the original example do: having domain specific constants. But it also shows when you want to try to make functor like methods. You can name class/variable with lower case, but you can't reference them from within methods... Making top level variable unusable unless capitalized...

mhermier avatar Feb 16 '21 11:02 mhermier

Yes, but what's the problem with capitalized top level functions?

ChayimFriedman2 avatar Feb 16 '21 11:02 ChayimFriedman2

Admittedly, if you're defining getter methods within a class which just return constants (say an enum like class), then you can't use capitalized names but that's not the convention in Wren anyway where in the core library we have stuff such as Num.pi, Num.tau etc.

If you want to use capitalized names for such constants, you can just place them outside the class but within the same module.

PureFox48 avatar Feb 16 '21 12:02 PureFox48

The problem is that it is not pleasant to the eye (at least for me). Maybe the real problem is that the syntax somehow impose casing so it can be usable, but does not inforce casing by erroring.

mhermier avatar Feb 16 '21 13:02 mhermier

Just wanted to add my 2 cents here. But this seems related. I also find it odd that the semantics of these are different:

System.print(A) // (null)
var A 

and

System.print(a) // error
var a

I had to dig through the code to discover that at the top level it allows late binding for variables that start with a capital letter.

I’m actually fine with the solution, but if this rule exists, it should be in variable documentation, not classes.

Additionally, I think I think non-capitalized global scoped variables should warn. What purpose do they serve? My only guess is private scoped module exports?

GiffE avatar Apr 11 '21 12:04 GiffE

Additionally, I think I think non-capitalized global scoped variables should warn.

If they would, we would end up with js-style IIFE ((function() { ... })()).

ChayimFriedman2 avatar Apr 11 '21 12:04 ChayimFriedman2

Additionally, I think I think non-capitalized global scoped variables should warn.

If they would, we would end up with js-style IIFE ((function() { ... })()).

Why? Again, what is the purpose of a non-capitalized top level variable? Users love lowercase names so much they would make an IIFE just to name variables with a lower case? Why not just move down a scope then?

heck here’s another semantic difference:

{
  System.print(A) // error
  var A
} 

It’s just wildly unclear why these 3 pieces of code would have different results. And it’s not documented.

GiffE avatar Apr 11 '21 13:04 GiffE

There is a quick fix, that was not added to ease writing: forward declaration of class. It would solve a bunch of problems, related to variable lookup...

mhermier avatar Apr 11 '21 13:04 mhermier

Why? Again, what is the purpose of a non-capitalized top level variable? Users love lowercase names so much they would make an IIFE just to name variables with a lower case? Why not just move down a scope then?

No, but users like executing calculations that require variables when importing module.

ChayimFriedman2 avatar Apr 11 '21 13:04 ChayimFriedman2

There is a quick fix, that was not added to ease writing: forward declaration of class. It would solve a bunch of problems, related to variable lookup...

I get that this is the reason for late binding globals with a Capital letter. It’s a very good reason and why initially I stated that I’m fine with the “ruby solution”. The real difference is that Ruby has name semantics that persist regardless of scope. $myglobal is always a global. In Wren, rules of variable names change when the scope is pushed. It’s very quirky.

Why? Again, what is the purpose of a non-capitalized top level variable? Users love lowercase names so much they would make an IIFE just to name variables with a lower case? Why not just move down a scope then?

No, but users like executing calculations that require variables when importing module.

So my guess of creating “c-style” static/module variables that are inaccessible was correct. I’m also fine with this answer/solution but it’s not evident immediately or documented.

Modern js avoids this issue by using an export keyword that pairs nicely with import. But that’s a more lofty language change. So I won’t make that suggestion.

Perhaps a warning instead then on lower scope names using Capitals is more correct? What purpose do they serve?

GiffE avatar Apr 11 '21 13:04 GiffE

I write quite a lot of shortish scripts using Wren-cli.

Sometimes I don't use classes at all - just top-level functions which have no problem accessing global variables which begin with a lower-case letter. So they do have some uses.

PureFox48 avatar Apr 11 '21 18:04 PureFox48

My personal preference would be for straight lexical scoping, plus methods hoisted automatically to the top of the class definition. But I have no idea how much effort that would be :) .

One thing I can think of is permitting forward declarations of methods, and issuing compile-time errors for references to unknown names. E.g.,

class Test {
  method TEST  // <== forward
  construct new() {
    System.print(TEST)
  }
  TEST { "test" }
}

You could still use this.foo for a late-bound foo if you didn't want to say method foo up front.

(@mhermier I'm not sure if this is what you were referring to in https://github.com/wren-lang/wren/issues/928#issuecomment-817307034 or not. If it is, I agree with you :) )

cxw42 avatar Apr 24 '21 20:04 cxw42