hy
hy copied to clipboard
A macro can't expand in the same module it's defined in if it uses a global from it.
This works at the repl.
=> (setv x "Spam!")
=> (defmacro foo [] `(print ~x))
<function <lambda> at 0x000001D58AE79F28>
=> (foo)
Spam!
But if you put it in a module,
;;; foo.hy
(setv x "Spam!")
(defmacro foo []
`(print ~x))
(foo)
Then you can't import it!
=> (import foo)
import foo
None
File "c:\Users\ME\Documents\GitHub\hy/foo.hy", line 6, column 1
(foo)
^---^
HyMacroExpansionError: b'expanding `foo\': NameError("name \'x\' is not defined",)'
The global x
doesn't exist in the macro function's __globals__
attr.
Possibly related to #1211 or #1134.
If you omit the last line from the module, then you can require
the macro from the repl, and it works.
I'm not sure what's causing this, but it seems like a flaw in our import process. If we could compile them incrementally--one toplevel form at a time, like the repl does--this should work.
Here's an interesting illustration.
=> (do (setv x "Spam!")(defmacro foo[] `(print ~x))(foo))
File "<input>", line 1, column 49
(do (setv x "Spam!")(defmacro foo[] `(print ~x))(foo))
^---^
HyMacroExpansionError: b'expanding `foo\': NameError("name \'x\' is not defined",)'
This version fails even at the repl.
Clojure can handle this though.
user=> (do (def x "Spam!") (defmacro foo [] `(println ~x)) (foo))
Spam!
nil
EDIT: looks like the do
isn't required.
=> (setv x "Spam!") (defmacro foo [] `(print ~x)) (foo)
File "<input>", line 1, column 48
(setv x "Spam!") (defmacro foo [] `(print ~x)) (foo)
^---^
HyMacroExpansionError: b'expanding `foo\': NameError("name \'x\' is not defined",)'
That's pretty weird. x
should indeed be in scope. By the way, running the file directly, instead of import
ing it, gives the same error.
looks like the
do
isn't required.
It shouldn't affect much of anything—import_buffer_to_hst
already wraps its argument in do
.
Wait a minute, I didn't think this through. x
is in fact not in scope in the examples that don't work. Take, for example,
(setv x "Spam!")
(defmacro foo []
`(print ~x))
(foo)
If the macro were defined to expand to '(print x)
, it would work fine. But the ~
causes x
to be evaluated at compile-time, not runtime, and x
isn't defined at compile-time, because (setv x "Spam!")
hasn't been run yet.
The reason the REPL example works (when the statements are entered separately) is that it forces (setv x "Spam!")
to be evaluated before (foo)
, contrary to the usual procedure of expanding macros before running any non-macro code.
So, everything seems to be correct here.
So how are we feeling about this?
I don't think this is correct at all, considering how other Lisps work, hence the "bug" label. If this was the intended design, it's insane. I think we simply didn't notice since we mostly develop small programs at the repl.
It's very common and useful to design macros that delegate part of the expansion process to helper functions. Macros don't compose as naturally as functions do, so if you want to reuse macro expansion code, it's often best to put it in a shared helper function. Where are we supposed to put those if not in the same module?
Delegating the expansion process to helper functions in the same module works fine, provided you only invoke the macro after the file is compiled, rather than inside the very same file:
$ echo >foo.hy '(defn f [] `3) (defmacro m [] (f))'
$ hy -c '(require foo) (print (foo.m))'
3
Here, f
exists by the time m
is called. By contrast, the only way (defn f [] `3) (defmacro m [] (f)) (m)
can work is if the defn
is executed before (m)
is expanded into (f)
. What semantics would you propose here? The way Hy currently works is that macro expansion happens during compilation, and only after a file is compiled is any regular code executed.
I've never figured out how these gory details work in other Lisps, so I can't speak for them.
Delegating the expansion process to helper functions in the same module works fine, provided you only invoke the macro after the file is compiled, rather than inside the very same file [...] I've never figured out how these gory details work in other Lisps, so I can't speak for them.
Good to know, but Clojure doesn't have that restriction.
$ echo '(defn f [] `3) (defmacro m [] (f)) (println (m))' > foo.clj
$ clojure foo.clj
3
I don't think Hy needs to have it either.
What semantics would you propose here? The way Hy currently works is that macro expansion happens during compilation, and only after a file is complied is any regular code executed.
At minimum, compile the files incrementally, one top-level form at a time. You don't macroexpand the whole file at once. It's almost like putting each top-level form in its own module, but we'd reuse the same module globals dict for each step. This is basically what the repl does.
I'm not sure how to handle the case when a defmacro
is not at top level, but I don't think you're supposed to do that in other Lisps anyway. They have macrolet
for that. #900.
At minimum, compile the files incrementally, one top-level form at a time.
That already happens. What you'd need is to execute the top-level forms incrementally, too, rather than compiling the whole file and then executing it.
In any case, this seems to be a design question rather than a bug, so I'm retagging to complaint / disgust.
Why do we even have _compile_time_hack
if we can't use globals from the same file?
You mean, what does it allow you to do? To call the macro in the same file it's defined in. The macro will work when called like this provided it doesn't rely on normal code in the same file (like defn
s) having already been executed.
By the way, I don't think the distinction between globals and locals is relevant here. You can't use what would be a local variable, because it isn't defined yet, either. For example, (defn f [] (setv x 1) (defmacro m [] x) (m))
raises NameError
.
I included an example in #1430 showing how to use eval-and-compile
or eval-when-compile
to make macro subroutines.
Of course macro expansion occurs at compile time instead of runtime, that's the point of having macros in the first place.
Indeed.
N.B. You can use eval-and-compile
for variables, too.
I'll have to try that. It might be enough to bring #1328 back.
Where are we with this; is it a desired feature? I think it's well worth having, so, if I put in a PR that does it, how will it go over?
We all agree that it's desirable, all things being equal, but it seems to entail deep changes to Hy's compilation process that I think would be worse than the original problem.
As things are now, macro expansion happens at compile-time, and a whole module is compiled before any of it is run. This means that you can translate a file that calls macros into Python, or compile it into bytecode, and all the macros have been resolved by the time you have your Python code or bytecode. The Python or bytecode doesn't have the macro calls; it has the actual expanded code. I think that's cool. It means you can move some long-running computation that only needs to happen once to compile-time. It also means you can do arbitrarily fancy things with macros and, if you're careful, still produce Python code that doesn't depend on Hy at all.
If we instead move to an incrementally compiling system, then the produced Python code or bytecode, in order to be equivalent to the original Hy, would need to keep all the macro calls, wouldn't it? I think that's what other Lisps do when you compile a file.
I don't think this functionality necessitates incremental compilation, but it would make it easy to implement (or automatically available) by allowing macros to behave like the basic Python functions with which they're implemented (e.g. https://github.com/hylang/hy/issues/1689).
In a sense, incremental compilation could be similar to an always-on eval-and-compile
, and, currently, eval-and-compile
doesn't appear to break any of the functionality you described.
Also, I don't see how incremental evaluation would change, or require changes to, the generated AST and resulting bytecode. The AST is evaluated, via Python, incrementally (I've been using the term "incrementally" to also mean "evaluated as Python would"), so, if anything, it would bring the compile and run-time behavior closer together. As a result, there would be fewer surprises and debugging would be drastically better.
Perhaps I'm missing something, though.
Wrapping the whole file in eval-and-compile
doesn't work; you'd need to put each top-level form in its own eval-and-compile
instead.
It may well be at this point that your understanding of Hy's compilation process is beyond mine (because you've been working with it so intensely recently, whereas I've cooled it with the PRs since I started at Mount Sinai). Let's consider an example to make some of the complexities here concrete. I'll call it sandwich.hy
:
(print "Top")
(setv x [1])
(defmacro m []
(import time)
(time.sleep 3)
(.append x 2)
x)
(print (m))
(print x)
(print "Bottom")
Currently, this program doesn't work (because m
can't see x
). What if we wrap each top-level form in an eval-and-compile
? This yields wrap.hy
:
(eval-and-compile (print "Top"))
(eval-and-compile (setv x [1]))
(eval-and-compile (defmacro m []
(import time)
(time.sleep 3)
(.append x 2)
x))
(eval-and-compile (print (m)))
(eval-and-compile (print x))
(eval-and-compile (print "Bottom"))
hy wrap.hy
prints:
Top
[1, 2]
[1, 2, 2]
Bottom
Top
[1, 2, 2]
[1]
Bottom
on its first run, including two sleeps of 3 seconds, and
Top
[1, 2, 2]
[1]
Bottom
on its second (using the bytecode), including no sleeps.
What do you think sandwich.hy
ought to do on each of its first and second runs?
Of course that wouldn't work; the code is evaluated too many times (and in different modules/namespaces) during the course of compilation and the superfluous eval
of the resulting AST!
The analogy with eval-and-compile
(of top-level forms) is only useful for demonstrating how evaluation at compile-time can be done and doesn't break functionality in any fundamental way; however, the incremental compiler we're talking about isn't as simple as that.
Are you saying that sandwich.hy
would still crash with your proposed changes? If so, how would it crash? I presume it wouldn't be the same as what currently happens, a NameError
for x
during the expansion of m
, because allowing that kind of reference is the purpose of the change.
No specific changes have been proposed, yet; I'm only discussing broad functionality and its implications.
Otherwise, relative to the subject of this issue, any changes would necessarily allow the macro in sandwich.hy
to resolve x
.
Also, I assume the expected output—in all cases—would be:
Top
[1 2]
[1 2]
Bottom
In order for (print x)
to print [1, 2]
, the macro would have to have been expanded recently enough for the (.append x 2)
to still be in effect. That's the sort of thing that (so far as I can see) can't happen if bytecode has already had macros expanded. Instead of (print (m))
, the effective code at that point will be (print [1 2])
, so the bytecode doesn't even contain (.append x 2)
.
Yes, currently, the compiler produces AST for (print (m))
reflecting (print [1 2])
and not something like (print (do (.append x 2) x))
.
I don't see why we couldn't make it produce the latter, if desired, but I also don't see the issue with the current result (relative to this topic). How would it affect our ability to resolve x
during macro expansion, or vice versa?
I don't see why we couldn't make it produce the later, if desired
I assume you mean produce something like (print (do (import time) (time.sleep 3) (.append x 2) x))
. That's why it's undesirable: the program needs to do the work of expanding macros (represented in this case by the sleep time) every time you run the program.
but I also don't see the issue with the current result (relative to this topic). How would it affect our ability to resolve
x
during macro expansion, or vice versa?
If m
can see x
, it can modify it. If m
can modify x
, later expressions, like (print x)
, can be affected by whether m
was expanded during the current run. If (print x)
can be affected by whether m
was expanded during the current run, then either (print x)
can print different things on the first run and the second run, or m
needs to be re-expanded on every run. Does that make sense?
That's why it's undesirable: the program needs to do the work of expanding macros (represented in this case by the sleep time) every time you run the program.
It seems like you're assuming that result is also necessary for this functionality, and I don't believe it is.
If m can see x, it can modify it. If m can modify x, later expressions, like (print x), can be affected by whether m was expanded during the current run.
Yes, this is true of all function calls with side effects—macros included. Is there something about accessing module-scoped variables in macros with side effects that's particularly problematic?
The subsequent sentence states the two problematic possibilities: "either (print x)
can print different things on the first run and the second run, or m
needs to be re-expanded on every run".
If you can think of a way for m
to see x
without such things happening, I'm all ears.
The subsequent sentence states the two problematic possibilities: "either
(print x)
can print different things on the first run and the second run, orm
needs to be re-expanded on every run".
I don't think it's reasonable to burden Hy with a responsibility to save people from the potential confusions resulting from side effects (in macros or otherwise), but we could make it possible to produce the same results.
For instance, a decorator and/or macro that directs the compiler to not produce AST of the expanded results. It could be made to only affect the AST generation for that one macro or location and still use the expanded results of any macros it calls.
To illustrate, let's call such a macro eval-always
. It would behave as follows:
;; Regular macro
(defmacro a [x]
(+ 1 1 x))
;; Tell the compiler that this macro shouldn't fully expand its AST results
(eval-always
(defmacro m []
(a 1)))
;; Generates AST like `(print 3)`
(print (a 1))
;; Generates AST like `(print (+ 1 1 1))`
(print (m))
;; Generates AST like `(print (+ 1 1 1))`
(eval-always
(print (a 1))
Huh. So, it sounds like, of the two "problematic possibilities", you're taking the first ("(print x)
can print different things on the first run and the second run"), and so you would have hy sandwich.hy
print
Top
[1, 2]
[1, 2]
Bottom
, with the sleep, on its first run, but
Top
[1, 2]
[1]
Bottom
, without the sleep, on its second run. (Taking possibility 2, expanding macros on every run, would print [1, 2]
twice on both the first and second run. Which is what you previously said should be expected.) Is that correct?