Why is `create-2d` an invalid identifier?

Open ghost opened this issue 2 years ago • 4 comments

(Copied from https://github.com/bytecodealliance/wit-bindgen/issues/578)

I am declaring a function that creates a two-dimensional texture:

default interface client-experimental-texture {
    create-2d: func() -> string
}

This fails with:

  Caused by:
      invalid character in identifier '2'
           --> wit/client-experimental-texture.wit:2:5
            |
          2 |     create-2d: func() -> string
            |     ^', crates/wasm/build.rs:49:58

ghost avatar May 17 '23 06:05 ghost

From the component model's perspective, the question here is

should create-2d be a legal kebab-case identifier?

The grammar as defined in Binary.md is as follows; it allows digits only after the first letter of each word.

label               ::= w:<word>                                           => w
                      | l:<label> '-' w:<word>                             => l-w
word                ::= w:[0x61-0x7a] x*:[0x30-0x39,0x61-0x7a]*            => char(w)char(x)*
                      | W:[0x41-0x5a] X*:[0x30-0x39,0x41-0x5a]*            => char(W)char(X)*

In order to make identifiers like create-2d legal without allowing identifiers to begin with numbers, we would need to add a different production for the first word than for subsequent words.
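As a rough illustration of the point above, here is a minimal Python sketch that checks names against the quoted grammar and against one possible relaxation. The `LATER_WORD` pattern and the function names are hypothetical, invented for this example; they are not taken from the component model spec or from any tool.

```python
import re

# One word per the quoted grammar: a letter first, then digits or letters,
# all lowercase or all uppercase.
WORD = r"(?:[a-z][0-9a-z]*|[A-Z][0-9A-Z]*)"

# ASSUMPTION: a hypothetical relaxed production for words after the first,
# which may also begin with a digit. Not part of the current grammar.
LATER_WORD = r"(?:[0-9a-z]+|[0-9A-Z]+)"

CURRENT_LABEL = re.compile(rf"{WORD}(?:-{WORD})*")
RELAXED_LABEL = re.compile(rf"{WORD}(?:-{LATER_WORD})*")


def is_label(name: str) -> bool:
    """True if `name` is a label under the grammar quoted above."""
    return CURRENT_LABEL.fullmatch(name) is not None


def is_relaxed_label(name: str) -> bool:
    """True if `name` is a label under the hypothetical relaxed grammar."""
    return RELAXED_LABEL.fullmatch(name) is not None
```

Under this sketch `create-2d` is rejected by `is_label` but accepted by `is_relaxed_label`, while `2d-create` is rejected by both because the first word still has to start with a letter.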

esoterra avatar May 17 '23 15:05 esoterra

> In order to make identifiers like create-2d legal without allowing identifiers to begin with numbers, we would need to add a different production for the first word than for subsequent words.

A reason not to change anything here is that it'd not be clear how to generate unambiguous bindings for many languages. For example, if the WIT contains both create-2d and create2d, what should the generated names look like in a language like JS or Java, which camel-cases identifiers? The most natural answer for both would be create2d (or Create2d if it's a class name), and I really can't see any good alternatives :/
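The collision is easy to reproduce with a naive kebab-to-camel conversion. The sketch below is a simplified stand-in written for this example; `to_camel_case` is a hypothetical helper, and real bindings generators may apply different rules.

```python
def to_camel_case(kebab: str) -> str:
    """Naive kebab-case to camelCase conversion (illustrative only)."""
    first, *rest = kebab.split("-")
    # Capitalizing a word that starts with a digit leaves it unchanged,
    # so the hyphen before "2d" disappears without a trace.
    return first.lower() + "".join(word.capitalize() for word in rest)


names = ["create-2d", "create2d"]
converted = {name: to_camel_case(name) for name in names}
```

Both entries in `converted` end up with the same value, because there is no letter after the hyphen to capitalize.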

tschneidereit avatar May 17 '23 15:05 tschneidereit

I think it's worth considering relaxing the grammar of kebab-names to allow leading numbers in words after the first as Kyle suggested. This would also be useful if we wanted to better support version numbers, as suggested in #134.

But Till has a good point. Separately, we've talked about requiring kebab names to be case-insensitively unique (so that, e.g., you can't have both create-XML and create-xml, as this would conflict in a casing scheme that mapped both to createXml). It might be a good idea to further tighten that requirement and require kebab names to be hyphen-insensitively unique (so that you can't have both ab and a-b), which may be independently valuable even outside of Till's example above.
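A minimal sketch of what such a combined check could look like, assuming names are compared after dropping hyphens and lowercasing. `normalize` and `find_conflicts` are hypothetical names for this example, not part of any existing validator.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Key under which names must be unique: no hyphens, lowercased."""
    return name.replace("-", "").lower()


def find_conflicts(names):
    """Return groups of names that collide under `normalize`."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, `find_conflicts(["create-2d", "create2d", "a-b", "ab", "get"])` would report `create-2d`/`create2d` and `a-b`/`ab` as conflicting pairs and leave `get` alone.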

lukewagner avatar May 17 '23 20:05 lukewagner

> It might be a good idea to further tighten that requirement and require kebab names to be hyphen-insensitively unique

Ah, I like that as a solution! And if there turns out to be a reason to not go quite that far, then applying it just to hyphens before numbers would be a good approach, too.
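The narrower variant could be sketched like this, assuming only a hyphen that directly precedes a digit is ignored when comparing names. `digit_hyphen_key` is a hypothetical helper for illustration.

```python
import re


def digit_hyphen_key(name: str) -> str:
    """Drop only hyphens that are immediately followed by a digit."""
    return re.sub(r"-(?=[0-9])", "", name)


def conflicts(a: str, b: str) -> bool:
    """True if two distinct names collide under the narrower rule."""
    return a != b and digit_hyphen_key(a) == digit_hyphen_key(b)
```

With this rule `create-2d` and `create2d` still conflict, but `a-b` and `ab` do not, since that hyphen precedes a letter.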

tschneidereit avatar May 17 '23 20:05 tschneidereit