
Proposal for dot transformers

Open slaymaker1907 opened this issue 4 years ago • 6 comments

I was thinking that while it would be nice sometimes to have dot syntax for object properties as in most languages, Rhombus could greatly expand what it is capable of by allowing the left side to be a syntax transformer.

Under this proposal, it would be required that a.b.c is initially parsed into something like (dot (dot a b) c). Rules for handling this special dot macro would be as follows.

  1. (dot x y) first checks to see if there is a dot-transformer for x (sort of like a set!-transformer in Racket). If there is such a transformer, invoke it like a macro with the syntax (dot x y). Note that given parsing rules as above, the syntax object has exactly three elements: dot, the first argument tree, and the second argument tree.
  2. If there is no custom transformer for x, then expand to something like (dot-dyn x y) where the dot will be expanded at runtime. This will likely end up being either just looking up a field or custom property dynamically. It should also be possible to disable dot-dyn through the Rhombus equivalent of a module language for those who don't want to deal with misspelled field names.
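
The two rules above can be sketched in plain Racket. This is only an illustration under stated assumptions, not the Rhombus prototype's protocol: the `dot-transformer` struct, the hash-based `dot-dyn` fallback, and the `Posn` binding are all hypothetical names made up for this sketch, and a pair stands in for a real structure type.

```racket
#lang racket/base
(require (for-syntax racket/base))

(begin-for-syntax
  ;; Hypothetical compile-time record: wraps a procedure that
  ;; receives the whole (dot x y) syntax object and rewrites it.
  (struct dot-transformer (proc)))

;; Hypothetical runtime fallback (rule 2): treat the object as a
;; hash table and look the field up by symbol.
(define (dot-dyn obj field)
  (hash-ref obj field))

(define-syntax (dot stx)
  (syntax-case stx ()
    [(_ x y)
     ;; Rule 1: if x is an identifier bound to a dot-transformer,
     ;; hand it the full (dot x y) form at expansion time.
     (let ([v (and (identifier? #'x)
                   (syntax-local-value #'x (lambda () #f)))])
       (if (dot-transformer? v)
           ((dot-transformer-proc v) stx)
           ;; Rule 2: otherwise defer the lookup to run time.
           #'(dot-dyn x 'y)))]))

;; A pair stands in for a real structure type in this sketch.
(define (posn-x p) (car p))
(define (posn-y p) (cdr p))

;; Posn resolves .x and .y statically and rejects other names
;; with a syntax error instead of a runtime failure.
(define-syntax Posn
  (dot-transformer
   (lambda (stx)
     (syntax-case stx ()
       [(_ _ field)
        (case (syntax-e #'field)
          [(x) #'posn-x]
          [(y) #'posn-y]
          [else (raise-syntax-error #f "unknown field" stx #'field)])]))))

(define origin (hash 'label "origin"))

((dot Posn x) (cons 1 2)) ; static: expands to posn-x
(dot origin label)        ; dynamic: expands to (dot-dyn origin 'label)
```

With this shape, a misspelled `(dot Posn z)` fails during expansion, while `(dot origin lable)` would only fail when the code runs, which is the distinction the proposal is after.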

The big win in allowing for dot transformers would be the ability to have the convenience of the dot syntax without the overhead of resolving it at runtime as well as allowing better error checking of these names.

For example, I think this could clean up implementing a non-hygienic macro like define-generics. Instead of binding gen:id, id?, and id/c, bind id.gen, id.?, and id.c. This is actually hygienic since id.gen is not a valid identifier and thus the only identifier we are binding is id, which has been passed in as an argument.

Another example that is more typical would be having a more precise version of prefix-in as seen in many other languages. (require (prefix-in mod: mod)) can still break if some other imported module happens to export a binding with the same prefix. Using a dot transformer, you could instead just do something like (require (dotted mod)) which then only binds a single identifier mod.

slaymaker1907 avatar Feb 01 '21 08:02 slaymaker1907

See also #57, https://github.com/racket/rhombus-brainstorming/pull/114#issuecomment-527083622

If I understand correctly, you are proposing that https://docs.racket-lang.org/reference/reader.html#%28part._parse-cdot%29 is on by default, except that it probably should handle dot in numbers in the usual way, correct?

I'm not sure why "Rules for handling this special dot macro" are needed. It sounds like an unnecessary restriction on top of the existing #%dot protocol: the default #%dot in #lang rhombus can implement "Rules for handling this special dot macro", but I don't see why we need to force the rules on users. They should have the ability to override #%dot to make it do whatever they want.

Should a.b.c turn into (#%dot (#%dot a b) c) or (#%dot a b c)? It looks like in most cases, a will contain binding information, while b and c are just plain symbols. Because of that, the second form, which easily provides access to a, looks better to me. From (#%dot a b c), you can also turn it into (#%dot* (#%dot* a b) c) if you really want to. The opposite direction (turning (#%dot (#%dot a b) c) into (#%dot* a b c)) is also possible, but more awkward.
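
To illustrate the flat-to-nested direction, here is a minimal sketch. The names `dot-flat` and `dot2` are hypothetical stand-ins for the flat and binary forms discussed above, and the binary form is assumed to be a plain hash lookup just so the example runs.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical binary form: a hash lookup on a quoted field name.
(define-syntax-rule (dot2 obj field)
  (hash-ref obj 'field))

;; Hypothetical flat form: (dot-flat a b c ...) folds to the left,
;; producing (dot2 (dot2 a b) c) and so on.
(define-syntax (dot-flat stx)
  (syntax-case stx ()
    [(_ a) #'a]
    [(_ a b rest ...) #'(dot-flat (dot2 a b) rest ...)]))

(define cfg (hash 'server (hash 'port 8080)))

;; Expands to (hash-ref (hash-ref cfg 'server) 'port).
(dot-flat cfg server port)
```

Going from the flat form to the tree is a two-clause macro because the head `a` and every field name are directly visible in the pattern. Going the other way means taking nested forms apart again, which is the awkwardness mentioned above.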

sorawee avatar Feb 01 '21 12:02 sorawee

@slaymaker1907 The system you describe is very similar to how I implemented things in Remix

https://github.com/jeapostrophe/remix/blob/master/stx.rkt#L190-L235

https://github.com/jeapostrophe/remix/blob/master/tests/static-interface.rkt

jeapostrophe avatar Feb 01 '21 16:02 jeapostrophe

@jeapostrophe I was pretty sure it wasn't an original idea, just wanted to make an official proposal for Rhombus. Great news that there is some precedent for this.

@sorawee the reason why it would be a better idea to do a cooperative transformer is because then libraries can extend this functionality without conflicting with each other. Right now there wouldn't be a good way for both define-macro and define-struct to have special dot behavior. If these aren't cooperative, then you can't write hygienic macros using the dot syntax since you need to also provide a binding for #%dot which would conflict with other libraries' definitions of #%dot.

The reason why it should be parsed as a proper tree is so that a can decide how to handle b and then b can decide how to handle c. You could possibly combine this into a single macro, but then it doesn't seem as structured IMO. It should be possible to write a macro which binds a.b and a.c and delegates further handling of the dot to b and c. It also makes sense to have something like this to allow for resolving dots both as macros and at runtime. These are not just plain symbols since a.b resolves to an identifier which can in turn have its own dot transformer.

After some thought, I do think @sorawee does have a point that it is easier to write #%dot from #%dot*. I like the independence of the former, but it's not too bad to convert it to the full tree form.

Also, I recognize that these names are all terrible, just wanted to explain the concept. Continuing to use #%cdot or adding #%dot would be better.

slaymaker1907 avatar Feb 01 '21 18:02 slaymaker1907

> the reason why it would be a better idea to do a cooperative transformer is because then libraries can extend this functionality without conflicting with each other.

I'm not opposed to a cooperative transformer. But I think it's better to leave that for the default #%dot from #lang rhombus, and allow users to customize #%dot however they want.

> since you need to also provide a binding for #%dot which would conflict with other libraries' definitions of #%dot.

Allowing users to provide their own #%dot doesn't necessarily need to cause conflicts. Take a look at #%app as an example. When users provide customized #%app, most function applications still work as intended, because usually users' #%app just falls back to #lang racket's #%app. Same for #%module-begin, #%top-interaction, etc. I don't see how #%dot would be different.
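
As a concrete version of the #%app comparison, here is a small sketch of a customized application form that does some extra work and then falls back to the default one. The names `traced-app` and `trace-log` are hypothetical. The macro is defined in a module whose own #%app is still racket/base's, so the `(f arg ...)` in the template is an ordinary application.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Client modules that use this module as their language (or import
;; this binding) would see traced-app as their #%app.
(provide (rename-out [traced-app #%app]))

;; Records the operator expression of each traced application,
;; most recent first.
(define trace-log '())

(define-syntax (traced-app stx)
  (syntax-case stx ()
    [(_ f arg ...)
     #'(begin
         (set! trace-log (cons 'f trace-log))
         ;; Fallback: this application uses racket/base's #%app,
         ;; because the template's context is this module.
         (f arg ...))]))

(traced-app + 1 2)
trace-log
```

A customized #%dot could presumably follow the same pattern: handle the cases it cares about and hand everything else to the default #%dot of the language.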

> The reason why it should be parsed as a proper tree is so that a can decide how to handle b and then b can decide how to handle c

Note that #%dot is a macro, which is normally expanded outside-in. You can use local-expand to force the inside-out expansion, but that seems more inconvenient than the flat structure.

sorawee avatar Feb 03 '21 00:02 sorawee

@slaymaker1907 Sorting out a dot protocol is an open issue in the prototype at #163. So, unless that prototype seems like entirely the wrong direction for some other reason, you might have some thoughts to add there.

I particularly agree with your suggestion that . should avoid runtime resolution, and I like your ideas for how . can be used for compound names. For example, I adopted that strategy in the prototype for going from a structure-type name to an accessor: Posn.x instead of binding something like Posn_x.

mflatt avatar Aug 02 '21 23:08 mflatt

@mflatt I'll take a look at that proposal. I was unaware that there was already a prototype for this.

slaymaker1907 avatar Oct 13 '21 18:10 slaymaker1907