hy `#_`, `#*`, and `#**` should probably be reader macros

Open Kodiologist opened this issue 2 years ago • 3 comments

Right now they're hard-coded into HyReader.

Jun 10 '22 17:06 Kodiologist

In fact, as things stand, the way something like #_foo or #*foo is parsed depends on whether a reader macro named _foo or *foo happens to exist:

=> (setv foo [1 2 3])
=> (print #*foo)
1 2 3
=> (defreader *foo 5)
=> (print #*foo)
5

Jun 12 '22 19:06 Kodiologist

Hmm, I know one of the iterations of the reader macro dispatcher would prefer single-character readers over identifiers starting with that character, precisely so you could dispatch things like #*args or #%val. I like having that as a feature, but it might be a little unintuitive if you're not expecting it?

Jun 12 '22 20:06 scauligi

I think that would get confusing once you have something like one reader macro named a and another named all. Is #all-boots parsed as #a ll-boots or #all -boots?
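To make the ambiguity concrete, here is a minimal Python sketch, assuming a dispatcher that accepts any registered reader name that is a prefix of the text after `#`. The function name `candidate_splits` is hypothetical and is not part of Hy.

```python
def candidate_splits(reader_names, tag_text):
    """Return every (reader, remainder) split that prefix matching would allow."""
    return sorted(
        (name, tag_text[len(name):])
        for name in reader_names
        if tag_text.startswith(name)
    )


# Two readers, `a` and `all`, both match the start of "all-boots".
print(candidate_splits({"a", "all"}, "all-boots"))
# [('a', 'll-boots'), ('all', '-boots')]
```

Both splits are valid under prefix matching, so the dispatcher would need an extra rule (shortest match, longest match, or an error) to choose between them.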

Jun 12 '22 20:06 Kodiologist

This seems like the most relevant place to discuss single-character tag dispatch before I get back to #2426.

So an interesting (and likely unintentional) quirk of tag dispatch is that a user can never actually define one of these single-character readers:

=> (defreader w '2)
=> (setv eirdness 3)
=> (print #w eirdness)
2 3
=> (print #weirdness)
hy.reader.exceptions.LexException: reader macro '#weirdness' is not defined

(You can also test this with the #s and #% macros from hyrule)

This happens because of mangling quirks:

=> (eval-when-compile (print (-> hy.&reader.reader-table (.keys) list)))
[
    # ...other keys...
    '#_',
    '#*',
    'hyx_Xnumber_signXXpercent_signX',
    'hyx_Xnumber_signXs',
    'hyx_Xnumber_signXw',
    'hyx_Xnumber_signX_uhoh'
]

In HyReader.tag_dispatch(), we first check for the mangled full symbol, and failing that we check for literal "#" plus one unmangled character. So to get your own single-character reader, you would have to manually insert your reader function into the table, since defreader always mangles the tag name.
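The two-step lookup described above can be sketched in plain Python. This is a toy model built only from the description in this comment, not Hy's actual code: `toy_mangle` is a simplified stand-in for Hy's mangling, and `toy_tag_dispatch` and the string handlers are hypothetical names used for illustration.

```python
import unicodedata


def toy_mangle(name):
    """Simplified stand-in for Hy's mangling (an assumption, not hy.mangle)."""
    name = name.replace("-", "_")
    if name.isidentifier():
        return name
    parts = []
    for ch in name:
        if ch == "_" or ch.isalnum():
            parts.append(ch)
        else:
            # For example, "#" becomes "Xnumber_signX".
            parts.append("X" + unicodedata.name(ch).lower().replace(" ", "_") + "X")
    return "hyx_" + "".join(parts)


def toy_tag_dispatch(table, ident):
    """Look up the non-empty text after '#'. Returns (handler, leftover text)."""
    full_key = toy_mangle("#" + ident)
    if full_key in table:          # step 1: the mangled full symbol
        return table[full_key], ""
    short_key = "#" + ident[0]     # step 2: literal "#" plus one raw character
    if short_key in table:
        return table[short_key], ident[1:]
    raise KeyError(f"reader macro '#{ident}' is not defined")


table = {"#_": "built-in discard", "#*": "built-in unpack"}
print(toy_tag_dispatch(table, "*foo"))   # ('built-in unpack', 'foo')

# (defreader *foo ...) stores a mangled key, which now wins in step 1.
table[toy_mangle("#*foo")] = "user *foo"
print(toy_tag_dispatch(table, "*foo"))   # ('user *foo', '')

# (defreader w ...) is stored mangled too, so step 2 never finds a raw "#w" key.
table[toy_mangle("#w")] = "user w"
print(toy_mangle("#w"))                  # hyx_Xnumber_signXw
```

In this model, `toy_tag_dispatch(table, "weirdness")` raises `KeyError`: step 1 misses because no reader named `weirdness` exists, and step 2 misses because the table holds the mangled key rather than the raw `#w`.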

I like the aesthetics of #*args over #* args, but while typing out this comment I realized that (unless we start special-casing things) this would allow expressions like #_uhoh to actually produce something if eg (defreader _uhoh "badbadbad") has been defined, which seems like a big no-no.

So uh, I guess I actually agree with @Kodiologist even though I started this comment as a counterargument. I'll start a PR to remove the single-character dispatching.

May 22 '23 15:05 scauligi

Kek, yeah, I've had that experience of convincing myself the other way in the middle of trying to sort out this kind of issue.

I would be open to special-casing #*, #**, and #_, but I wouldn't really recommend it.

Arguably the keys of the reader table shouldn't include the #. The # is notionally the syntax that you use to call the reader macro, not part of the reader macro's name. Right? Like how in the POSIX shell, you assign variables like FOO=1 but get their values like $FOO; the dollar sign isn't part of the variable name.

May 22 '23 15:05 Kodiologist

Since regular macros and reader macros get separate namespaces as well as separate syntax for defining them, requiring them, etc., there shouldn't be issues with having both a regular macro and a reader macro named foo.

May 22 '23 15:05 Kodiologist

Nope, I'm going back on this again after having implemented it and updating all the tests. The spaces look awful and make it harder to visually parse. With something like (print a b c #**kwargs), it's much easier than (print a b c #** kwargs) to (immediately) see that kwargs is getting splatted. This also becomes pertinent for #^ annotations: (setv #^int x 4) is visually more representative of intent than (setv #^ int x 4), where it looks almost like the annotation is part of a separate assignment.

In any case, we already have to special-case the hash-sequences #(...) #{...} #[[...]] since they don't parse as identifiers, so we might as well make an explicit set of "single-char" reader macros that are separate from "identifier" tag macros. We can solve the shadowing issue by explicitly checking for shadowing; eg (defreader _foo '2) will raise an error if #_ is a single-char reader macro, and similarly trying to define eg #s explicitly as a single-char dispatching macro will fail if eg #slice was already a reader macro.

Just as a note, (defreader s ...) would still declare an "identifier" macro in my proposal; to define a single-char macro one would have to use either a new form (maybe (defreader! ...)?) or insert it manually.
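A rough Python sketch of the shadowing checks proposed above, assuming two separate tables. It is an illustration only and is not the prototype mentioned in this thread; the class `ToyReaderRegistry` and its method names are hypothetical.

```python
class ToyReaderRegistry:
    """Toy model: single-char readers and identifier readers with shadow checks."""

    def __init__(self):
        self.single_char = {}   # e.g. "_" or "*"
        self.identifier = {}    # e.g. "slice"

    def define_identifier(self, name, handler):
        # Refuse names that an existing single-char reader would swallow.
        if name[0] in self.single_char:
            raise ValueError(
                f"#{name} would be shadowed by single-char reader #{name[0]}"
            )
        self.identifier[name] = handler

    def define_single_char(self, char, handler):
        if len(char) != 1:
            raise ValueError("a single-char reader needs exactly one character")
        # Refuse characters that would swallow an existing identifier reader.
        clashes = sorted(n for n in self.identifier if n.startswith(char))
        if clashes:
            raise ValueError(f"#{char} would shadow existing readers: {clashes}")
        self.single_char[char] = handler


registry = ToyReaderRegistry()
registry.define_single_char("_", "discard")
registry.define_identifier("slice", "slice reader")

for attempt in (
    lambda: registry.define_identifier("_foo", 2),   # clashes with #_
    lambda: registry.define_single_char("s", 2),     # clashes with #slice
):
    try:
        attempt()
    except ValueError as err:
        print("rejected:", err)
```

The checks run at definition time, so the outcome depends on the order in which readers are registered.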

I have a prototype I can post as a PR after I clean it up more, if this is amenable.

May 23 '23 21:05 scauligi

Aw geez, we already have both regular macros and reader macros. I think adding a third type of macro solely to ease a perceived whitespace issue is going too far. Imagine requiring macros from two different modules and getting a surprise error when the reader macros of one conflict with the single-character-dispatch macros of the other.

Anyway, I prefer #** kwargs to #**kwargs and already use it thus. #**kwargs looks weird because * is an identifier character in Hy, so this should be one construct, not two. (Regarding annotations, I can't say whether #^int or #^ int is better considering that annotations are almost entirely useless in Hy anyway.)

So if you're dead-set on being able to say #**kwargs, let's special-case #** in the reader.

May 23 '23 21:05 Kodiologist

After all, it is Lisp; syntactic regularity over readability is a choice we should be used to making.

May 23 '23 21:05 Kodiologist

@scauligi This issue is fulfilled, right? Since HyReader.__init__ sets up the reader_for methods as reader macros.

Jun 03 '23 15:06 Kodiologist

Sure, we can mark it as complete; I wasn't sure if you wanted the Hy tag macros moved into defreaders in hy/core since these are still hardcoded in HyReader in some sense.

Jun 03 '23 19:06 scauligi

As I understand, the current setup should work just fine if we later add some ability to introspect or override core reader macros. Whether they're written in Hy or Python is less important. So I think we're good.
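As a loose illustration of that setup, here is a toy Python model in which decorated methods are collected into a reader-macro table at construction time and can then be inspected or overridden. Everything here is an assumption made for the sketch: `toy_reader_for`, `ToyReader`, and the table layout are hypothetical stand-ins and do not reproduce Hy's actual `reader_for` or `HyReader`.

```python
def toy_reader_for(tag):
    """Hypothetical decorator: tag a method with the syntax it handles."""
    def decorator(method):
        method.toy_tags = getattr(method, "toy_tags", []) + [tag]
        return method
    return decorator


class ToyReader:
    def __init__(self):
        self.reader_table = {}    # ordinary syntax such as "("
        self.reader_macros = {}   # reader macros, keyed without the "#"
        for name, attr in vars(type(self)).items():
            for tag in getattr(attr, "toy_tags", []):
                handler = getattr(self, name)  # bound method
                if tag.startswith("#") and len(tag) > 1:
                    self.reader_macros[tag[1:]] = handler
                else:
                    self.reader_table[tag] = handler

    @toy_reader_for("(")
    def read_paren(self, text):
        return ("expression", text)

    @toy_reader_for("#_")
    def read_discard(self, text):
        return None

    @toy_reader_for("#*")
    def read_unpack(self, text):
        return ("unpack-iterable", text)


reader = ToyReader()
print(sorted(reader.reader_macros))        # ['*', '_']

# Overriding a core reader macro is just a table assignment in this model.
reader.reader_macros["*"] = lambda text: ("custom", text)
print(reader.reader_macros["*"]("args"))   # ('custom', 'args')
```

Whether the handlers are written in Python or in Hy makes no difference to a table like this, which is the point made above.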

Jun 03 '23 19:06 Kodiologist
