
Feature: named syntactical objects with identifiers and labels.

Open stefjoosten opened this issue 3 years ago • 5 comments

Problem

We have "named syntactical objects" (nsos), such as interfaces, rules, and patterns. Each instance has a unique name. Currently, Ampersand uses that name for two different purposes: to identify the object within its scope and to label it in either the prototype or the documentation. In most cases, both purposes are served without any trouble. However, issues like #708 and #1212 show that these purposes can collide. For the arguments, please read the discussions on these issues.

Proposal: separate the two concerns (identification and labeling) syntactically

  1. For the purpose of identification we will use an identifier. To minimize discussions about equality, an identifier is a sequence of alphanumeric characters that starts with a letter. So, an identifier contains no ligatures, diacritics, spaces, or other “interpretable” characters (tab, CRLF, etc.). It is case-sensitive. These choices make it easy to implement equality of two identifiers correctly and consistently across the different components (prototype framework, compiler, RAP).


   alpha ::= <a..z> | <A..Z>

   alphanum ::= alpha | <0..9>

   identifier ::= alpha alphanum*
  2. For the purpose of labeling we will use a string, delimited by double quotes. The label is used in the documentation and on the screen. The restrictions on this string:
     1. A double quote is escaped with a backslash.
     2. Every string is on a single line. There is no CR(LF) in the string.
  3. The compiler invents an identifier for each nso that has neither an identifier nor a label.
  4. The compiler uses the identifier as a label for each nso with an identifier only.
  5. The compiler uses a function, h::Label->Identifier, to determine an identifier for each nso with a label only.
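To make the rules above concrete, here is a minimal Python sketch of the identifier grammar and of how the compiler could settle on an identifier and a label for one nso. It is an illustration only, not Ampersand's implementation: the names `is_identifier` and `resolve`, the `nso<number>` scheme for invented identifiers, and the choice to reuse an invented identifier as the label are all assumptions. The function `h` is passed in as a parameter because the proposal does not define it.

```python
import re
from itertools import count

# identifier ::= alpha alphanum*, with alpha limited to the ASCII letters a..z and A..Z.
# Explicit character classes are used because \w would also accept letters with diacritics.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Counter for invented identifiers (hypothetical naming scheme, not from the proposal).
_fresh = count(1)


def is_identifier(text):
    """Return True if text matches the identifier grammar of the proposal."""
    return IDENTIFIER.fullmatch(text) is not None


def resolve(identifier, label, h):
    """Return the (identifier, label) pair for one nso.

    identifier and label are the values written by the user, or None when absent.
    h is the proposal's Label -> Identifier function, supplied by the caller.
    """
    if identifier is not None and not is_identifier(identifier):
        raise ValueError(f"not a valid identifier: {identifier!r}")
    if identifier is None and label is None:
        # Neither given: invent an identifier.
        identifier = f"nso{next(_fresh)}"
    elif identifier is None:
        # Label only: derive the identifier from the label.
        identifier = h(label)
        if not is_identifier(identifier):
            raise ValueError(f"h produced an invalid identifier: {identifier!r}")
    if label is None:
        # Identifier only (or invented, by assumption): the identifier doubles as the label.
        label = identifier
    return identifier, label
```

For instance, `resolve("Foo", None, h)` yields `("Foo", "Foo")` for any `h`, and an nso that carries both an identifier and a label keeps both unchanged.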

Consequences for users

If a user uses an identifier, everything stays the same, e.g.

INTERFACE Foo : I[Bar] BOX [ bar : I ]

The Ampersand compiler interprets Foo as an identifier and it will create a label. So this interface will have Foo both as an identifier and as a label.

If a user uses a label that is syntactically equivalent to an identifier, everything stays the same, e.g.

INTERFACE "Foo" : I[Bar] BOX [ bar : I ]

The Ampersand compiler interprets "Foo" as a label and it will create an identifier. So this interface will have Foo both as an identifier and as a label.

Suppose a user uses a label that is not syntactically equivalent to an identifier, e.g.

INTERFACE "Fóo" : I[Bar] BOX [ bar : I ]

The compiler will use Fóo as a label and the identifier will be h("Fóo"). That may result in a compiler error if there is another label for which h produces the same identifier. The programmer can keep the label (e.g. for documentation purposes) by adding a unique identifier of their own, e.g.

INTERFACE myFoo "Fóo" : I[Bar] BOX [ bar : I ]

In this example, myFoo makes the nso identifiable, and "Fóo" is used as a label.
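The collision described above can be sketched in Python. The proposal leaves `h` undefined, so the version below is purely hypothetical: it strips diacritics via Unicode decomposition, drops every character that is not an ASCII letter or digit, and prefixes `x` when the result would not start with a letter. The helper `find_collisions` is likewise an assumed name, shown only to illustrate when the compiler would have to report an error.

```python
import unicodedata


def h(label):
    """Hypothetical Label -> Identifier mapping (an assumption, not Ampersand's h)."""
    # Decompose accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", label)
    # Keep ASCII letters and digits only; this drops combining marks, spaces, etc.
    kept = "".join(c for c in decomposed if c.isascii() and c.isalnum())
    # An identifier must start with a letter, so prefix one if needed.
    if not kept or not kept[0].isalpha():
        kept = "x" + kept
    return kept


def find_collisions(labels):
    """Return {identifier: [labels]} for every identifier derived from more than one label."""
    by_identifier = {}
    for label in labels:
        by_identifier.setdefault(h(label), []).append(label)
    return {ident: ls for ident, ls in by_identifier.items() if len(ls) > 1}
```

With this particular `h`, the labels "Foo" and "Fóo" both map to the identifier Foo, so two interfaces carrying those labels would clash. Giving one of them an explicit identifier such as myFoo removes that nso from the derivation and resolves the clash.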

The following is wrong because Fóo contains a diacritical mark:

INTERFACE Fóo : I[Bar] BOX [ bar : I ]

The programmer can drop the diacritical mark and use Foo as an identifier, let Ampersand derive an identifier by writing "Fóo", or keep the label and choose an identifier by specifying both explicitly.

In Gitbook we will document the identifier as standard, to limit the complexity for novices. In a section for advanced users, we can explain the distinction between labels and identifiers.

stefjoosten, Sep 13 '21 15:09