`#_`, `#*`, and `#**` should probably be reader macros
Right now they're hard-coded into `HyReader`. In fact, as things stand, the way something like `#_foo` or `#*foo` is parsed depends on whether a reader macro named `_foo` or `*foo` happens to exist:
=> (setv foo [1 2 3])
=> (print #*foo)
1 2 3
=> (defreader *foo 5)
=> (print #*foo)
5
Hmm, I know one of the iterations of the reader macro dispatcher would prefer single-character readers over identifiers starting with that character, precisely so you could dispatch things like `#*args` or `#%val`. I like having that as a feature, but it might be a little unintuitive if you're not expecting it?
I think that would get confusing once you have something like one reader macro named `a` and another named `all`. Is `#all-boots` parsed as `#a ll-boots` or `#all -boots`?
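To make the ambiguity concrete, here is a minimal Python sketch that lists every way the text after `#` could be split into a defined macro name plus a remainder. The helper `candidate_splits` is hypothetical and is not part of Hy; it only illustrates that a prefix-based dispatcher has more than one valid answer here.

```python
def candidate_splits(text, macro_names):
    """Return every (macro, remainder) pair where a defined name is a prefix of text."""
    return [
        (name, text[len(name):])
        for name in sorted(macro_names, key=len)  # shortest candidate first
        if text.startswith(name)
    ]


# With reader macros named "a" and "all", "#all-boots" has two readings.
print(candidate_splits("all-boots", {"a", "all"}))
# [('a', 'll-boots'), ('all', '-boots')]
```

Any dispatcher that allows prefix matching has to pick one of these by some rule (shortest, longest, or a fixed priority), and that rule is what a user would have to keep in mind.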
This seems like the most relevant place to discuss single-character tag dispatch before I get back to #2426.
So an interesting (and likely unintentional) quirk of tag dispatch is that a user can never actually define one of these single-character readers:
=> (defreader w '2)
=> (setv eirdness 3)
=> (print #w eirdness)
2 3
=> (print #weirdness)
hy.reader.exceptions.LexException: reader macro '#weirdness' is not defined
(You can also test this with the `#s` and `#%` macros from hyrule.)
The reason it's like this is because of mangling quirks:
=> (eval-when-compile (print (-> hy.&reader.reader-table (.keys) list)))
[
# ...other keys...
'#_',
'#*',
'hyx_Xnumber_signXXpercent_signX',
'hyx_Xnumber_signXs',
'hyx_Xnumber_signXw',
'hyx_Xnumber_signX_uhoh'
]
In `HyReader.tag_dispatch()`, we first check for the mangled full symbol, and failing that we check for literal "#" plus one unmangled character. So to get your own single-character reader, you would have to manually insert your reader function into the table, since `defreader` always mangles the tag name.
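The lookup order described above can be sketched in Python roughly as follows. This is a simplified model, not the actual `HyReader.tag_dispatch()` code: `toy_mangle` and `toy_dispatch` are hypothetical names, and `toy_mangle` is a stand-in that only handles the characters used in this thread.

```python
# Hypothetical, simplified stand-in for Hy's name mangling.
CHAR_NAMES = {"#": "number_sign", "%": "percent_sign", "*": "asterisk"}


def toy_mangle(name):
    """Mangle a name into a Python-identifier-like key (simplified)."""
    if name.isidentifier():
        return name
    body = "".join(
        c if c.isalnum() or c == "_"
        else "X" + CHAR_NAMES.get(c, f"U{ord(c):x}") + "X"
        for c in name
    )
    return "hyx_" + body


def toy_dispatch(tag, table):
    """Resolve the text after '#'. Returns (table key, leftover text)."""
    # Step 1: look up the mangled full symbol.
    full_key = toy_mangle("#" + tag)
    if full_key in table:
        return full_key, ""
    # Step 2: fall back to a literal "#" plus one *unmangled* character.
    single_key = "#" + tag[:1]
    if single_key in table:
        return single_key, tag[1:]
    raise LookupError(f"reader macro '#{tag}' is not defined")


# "#_" and "#*" are stored unmangled; a user-defined "w" is stored mangled,
# mirroring what the thread says defreader does.
table = {"#_", "#*", toy_mangle("#w")}

print(toy_dispatch("*foo", table))  # falls through to the "#*" entry
print(toy_dispatch("w", table))     # matches the mangled full symbol
try:
    toy_dispatch("weirdness", table)
except LookupError as err:
    print(err)                      # "#w" (unmangled) is not a key
```

In this model the user-defined `w` can never act as a single-character reader, because step 2 looks for the unmangled key `#w` while the table only holds the mangled one.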
I like the aesthetics of `#*args` over `#* args`, but while typing out this comment I realized that (unless we start special-casing things) this would allow expressions like `#_uhoh` to actually produce something if e.g. `(defreader _uhoh "badbadbad")` has been defined, which seems like a big no-no.
So uh, I guess I actually agree with @Kodiologist even though I started this comment as a counterargument. I'll start a PR to remove the single-character dispatching.
Kek, yeah, I've had that experience of convincing myself the other way in the middle of trying to sort out this kind of issue.
I would be open to special-casing `#*`, `#**`, and `#_`, but I wouldn't really recommend it.
Arguably the keys of the reader table shouldn't include the `#`. The `#` is notionally the syntax that you use to call the reader macro, not part of the reader macro's name. Right? Like how in the POSIX shell, you assign variables like `FOO=1` but get their values like `$FOO`; the dollar sign isn't part of the variable name.
Since regular macros and reader macros get separate namespaces as well as separate syntax for defining them, `require`ing them, etc., there shouldn't be issues with having both a regular macro and a reader macro named `foo`.
Nope, I'm going back on this again after implementing it and updating all the tests. The spaces look awful and make the code harder to visually parse.
With something like `(print a b c #**kwargs)`, it's much easier than with `(print a b c #** kwargs)` to see immediately that `kwargs` is getting splatted.
This also becomes pertinent for `#^` annotations: `(setv #^int x 4)` is visually more representative of intent than `(setv #^ int x 4)`, where it looks almost like the annotation is part of a separate assignment.
In any case, we already have to special-case the hash-sequences `#(...)`, `#{...}`, and `#[[...]]`, since they don't parse as identifiers, so we might as well make an explicit set of "single-char" reader macros that are separate from "identifier" tag macros.
We can solve the shadowing issue by explicitly checking for shadowing: e.g. `(defreader _foo '2)` will raise an error if `#_` is a single-char reader macro, and similarly, trying to define e.g. `#s` explicitly as a single-char dispatching macro will fail if e.g. `#slice` was already a reader macro.
Just as a note, `(defreader s ...)` would still declare an "identifier" macro in my proposal; to define a single-char macro one would have to use either a new form (maybe `(defreader! ...)`?) or insert it manually.
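A rough Python sketch of the shadowing check in this proposal is below. It is only an illustration of the idea and not the prototype mentioned in this thread; `ToyReaderTable`, `define_identifier`, and `define_single_char` are hypothetical names, and handlers are plain placeholder values.

```python
class ToyReaderTable:
    """Hypothetical registry that keeps single-char and identifier macros apart."""

    def __init__(self):
        self.single_char = {}  # e.g. "_" -> handler
        self.identifier = {}   # e.g. "slice" -> handler

    def define_identifier(self, name, handler):
        """Models what (defreader name ...) would do under the proposal."""
        if name[:1] in self.single_char:
            raise ValueError(
                f"#{name} would be shadowed by single-char macro #{name[0]}"
            )
        self.identifier[name] = handler

    def define_single_char(self, char, handler):
        """Models the separate form (or manual insertion) for single-char macros."""
        clashes = sorted(n for n in self.identifier if n.startswith(char))
        if clashes:
            raise ValueError(
                f"single-char macro #{char} would shadow: {', '.join(clashes)}"
            )
        self.single_char[char] = handler


registry = ToyReaderTable()
registry.define_single_char("_", "discard")
registry.define_identifier("slice", "slice-handler")

for attempt in (
    lambda: registry.define_identifier("_foo", 2),          # clashes with #_
    lambda: registry.define_single_char("s", "s-handler"),  # clashes with #slice
):
    try:
        attempt()
    except ValueError as err:
        print(err)
```

Both checks run at definition time, so a conflict surfaces when the macro is declared (or required) and not when some later `#...` form is parsed.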
I have a prototype I can post as a PR after I clean it up more, if that's agreeable.
Aw geez, we already have both regular macros and reader macros. I think adding a third type of macro solely to ease a perceived whitespace issue is going too far. Imagine requiring macros from two different modules and getting a surprise error when the reader macros of one conflict with the single-character-dispatch macros of the other.
Anyway, I prefer `#** kwargs` to `#**kwargs` and already use it thus. `#**kwargs` looks weird because `*` is an identifier character in Hy, so this should be one construct, not two. (Regarding annotations, I can't say whether `#^int` or `#^ int` is better considering that annotations are almost entirely useless in Hy anyway.)
So if you're dead-set on being able to say `#**kwargs`, let's special-case `#**` in the reader.
After all, it is Lisp; syntactic regularity over readability is a choice we should be used to making.
@scauligi This issue is fulfilled, right? Since `HyReader.__init__` sets up the `reader_for` methods as reader macros.
Sure, we can mark it as complete; I wasn't sure if you wanted the Hy tag macros moved into `defreader`s in `hy/core`, since these are still hardcoded in `HyReader` in some sense.
As I understand it, the current setup should work just fine if we later add some ability to introspect or override core reader macros. Whether they're written in Hy or Python is less important. So I think we're good.