carbon-lang syntax for declaring global variables in a namespace

Currently, our variable syntax is:

let PATTERN = INIT;

for constants and

var PATTERN = INIT;

for mutable variables, and our syntax for declaring an entity as a namespace member is:

namespace MyNS;
fn MyNS.Func();

It's not clear how to combine these features. We could consider permitting MyNS.Var: MyType as a binding pattern, but it's not clear whether permitting a . in that context will introduce ambiguities in the grammar. It certainly looks perilous!

It's also not clear how variable declarations that have more complex bindings should be handled. For example, if we permit namespace-qualified bindings, would we allow:

namespace NS1;
namespace NS2;
var (NS1.a: i32, NS2.b: i32) = (1, 2);

Feb 08 '23 18:02 zygoloid

To note another option, we could allow:

var MyNS.(a: i32, b: i32) = (1, 2);

This sort of mirrors the fn MyNS.Func() syntax in that it puts the namespace immediately after the introducer. Also, I think it can be parsed as var [QUALIFIER] PATTERN = INIT; without permitting namespace-qualified bindings inside the (...).

While it could also just not be supported, I suspect this should be supported in some form in order to allow destructuring of compile-time function calls.

I might also note that if you expect class statics, it might be worth mulling what the syntax for out-of-line initialization of those should be since then it may extend the issue past namespaces. However, it may be that out-of-line initialization excludes let/var and so might not be an issue. e.g.:

class Foo {
  static let x: i32;
  static let y: i32;
}
(Foo.x, Foo.y) = CompileTimeCall();

Feb 08 '23 19:02 jonmeow

Another option, at the cost of a little language consistency, would be to not allow arbitrary patterns in namespace-scope variable declarations at all, and instead only accept:

[let|var] [QUALIFIER] NAME : TYPE [= INIT];

Do we have a use case for pattern matching at the top level?

Feb 08 '23 21:02 zygoloid

As @jonmeow observes, destructuring a compile-time function call to initialize some constants seems like a very plausible use case. If we don't support it directly, people will have to choose between duplicating the function call, duplicating the data, or doing the structure traversal at every point of use. None of those options is disastrous, but all of them are annoying, and may even be bug-prone. I'm not sure how common that sort of use case will be, but my hunch is that it will be common enough that everyone stumbles across it sooner or later, particularly since it seems especially likely to come up in the sort of toy examples that get used in instructional materials, or written in places like Compiler Explorer.

Feb 09 '23 00:02 geoffromer

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time.

This issue is labeled inactive because the last activity was over 90 days ago.

May 10 '23 01:05 github-actions[bot]

More concerns arise with the handling of modifier keywords on general file-scope let declarations. For a modifier keyword, we generally want to know the target scope that the declaration is injecting names into so we can check the modifier validity, but if a declaration can introduce names in multiple scopes (or introduce no names), it's not clear how that would work.

Jan 30 '24 01:01 zygoloid

My initial leaning was towards tackling the potential grammar risk head on, and allowing patterns to (potentially) bind name-qualified names. So:

namespace NS;
var (NS.a: i32, NS.b: i32) = ...;

My goal here is that the thing before the : is a sequence of dotted names only.

At least in irrefutable patterns, I'm not immediately seeing a deal-breaker level of ambiguity. I can imagine plausible answers with modifiers (require all scopes to allow the modifier, etc).

And I do think there will be a desire to bind two things from the return of a function call.

But then I thought about another use case that we should probably evaluate at the same time as we're considering this: static member variables, and specifically out-of-line definition (and initialization) of them. With generic classes, inheritance, and other complexities, this seems to throw the idea of a name-qualifier-only syntax right out the window. And being able to move the definition and initialization of such constants out-of-line (and thus out of an API file and into an impl file) seems quite important. Maybe we should figure out the syntax for that first, and then look for an easy way to use it with namespacing as well?

Jan 30 '24 03:01 chandlerc

I think having different qualifiers on different bindings in a let or var is going to be problematic for unqualified name lookup:

namespace A;
namespace B;
let A.n: i32 = 1;
let B.n: i32 = 2;
// Ambiguous? (1, 2)? Something else?
let (A.x: i32, B.y: i32) = (n, n);

The name qualifier on a declaration sets the scope in which later portions of that declaration are parsed, so there can be only one.

We could handle this by saying that there can be qualifiers, but if so, all name bindings must use the same qualifier. But that would still lead to weirdness -- for example, the first qualifier would (presumably) change the scope in which the later qualifier is parsed. Fundamentally there seems to be a substantial difference between the semantics of declaration names (such as appear after a class / interface / namespace / fn introducer) and name bindings in patterns, and we're going to create problems for ourselves if we try to unify them.

I'm increasingly thinking that a very broad restriction (in particular, something like the one I described above) is the way to go. While that does make it harder to do destructuring of compile-time function calls, that isn't something that we've seen a pressing need for in C++, and it's still possible if needed:

// instead of this...
let (A.x: i32, {.b = B.y: i32, .c = C.z: i32}) = F();

// ... one could in general write something like ...
private let xyz: {.x: i32, .y: i32, .z: i32} = [] {
  let (x: i32, {.b = y: i32, .c = z: i32}) = F();
  return {.x = x, .y = y, .z = z};
}();
let A.x: i32 = xyz.x;
let B.y: i32 = xyz.y;
let C.z: i32 = xyz.z;

We can make this a bit nicer by saying that you either get a single qualified name in a let or var (following the rule in my earlier comment), or you get an arbitrary pattern (in which the general rule is that name bindings in patterns can't be qualified). Then you can write:

// instead of this...
let (A.x: i32, {.b = B.y: i32, .c = C.z: i32}) = F();

// ...write this:
private let (x: i32, {.b = y: i32, .c = z: i32}) = F();
let A.x: i32 = package.x;
let B.y: i32 = package.y;
let C.z: i32 = package.z;

Jan 30 '24 18:01 zygoloid

We can make this a bit nicer by saying that you either get a single qualified name in a let or var (following the rule in my earlier comment), or you get an arbitrary pattern (in which the general rule is that name bindings in patterns can't be qualified).

I think that makes sense as a good starting position. It seems to give enough flexibility that we'll discover useful idioms while keeping the functionality and behavior simple and easy to both implement and explain.

Mar 01 '24 01:03 chandlerc

Leads decision: for now, let and var either have an arbitrary pattern (which does not support qualifiers) or a single qualified name.

Mar 01 '24 01:03 zygoloid
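
For illustration, the decision above can be sketched using declaration forms that appear earlier in the thread. This is only a sketch of how the rule might read in practice, not a statement of what the toolchain accepts: F and A.counter are hypothetical names introduced here, and Carbon's syntax is still evolving.

```carbon
namespace A;
namespace B;

// Hypothetical compile-time function, declared only so the sketch is complete.
fn F() -> (i32, i32);

// Permitted: a single qualified name, with no surrounding pattern.
var A.counter: i32 = 0;

// Permitted: an arbitrary pattern, with every binding unqualified.
private let (x: i32, y: i32) = F();

// Qualified constants can then be initialized from the unqualified bindings.
let A.x: i32 = package.x;
let B.y: i32 = package.y;

// Not permitted under the decision: qualifiers on bindings inside a pattern.
// let (A.x: i32, B.y: i32) = F();
```

Because each qualified declaration names exactly one entity, there is a single target scope for name lookup and for checking any modifier keywords, which addresses the concerns raised earlier in the thread.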
