hy icon indicating copy to clipboard operation
hy copied to clipboard

A macro can't expand in the same module it's defined in if it uses a global from it.

Open gilch opened this issue 6 years ago • 35 comments

This works at the repl.

=> (setv x "Spam!")
=> (defmacro foo [] `(print ~x))
<function <lambda> at 0x000001D58AE79F28>
=> (foo)
Spam!

But if you put it in a module,

;;; foo.hy

(setv x "Spam!")

(defmacro foo []
  `(print ~x))

(foo)

Then you can't import it!

=> (import foo)
import foo
None
  File "c:\Users\ME\Documents\GitHub\hy/foo.hy", line 6, column 1

  (foo)
  ^---^
HyMacroExpansionError: b'expanding `foo\': NameError("name \'x\' is not defined",)'

The global x doesn't exist in the macro function's __globals__ attr.

Possibly related to #1211 or #1134.

If you omit the last line from the module, then you can require the macro from the repl, and it works.

I'm not sure what's causing this, but it seems like a flaw in our import process. If we could compile them incrementally--one toplevel form at a time, like the repl does--this should work.

gilch avatar Sep 06 '17 05:09 gilch

Here's an interesting illustration.

=> (do (setv x "Spam!")(defmacro foo[] `(print ~x))(foo))
  File "<input>", line 1, column 49

  (do (setv x "Spam!")(defmacro foo[] `(print ~x))(foo))
                                                  ^---^
HyMacroExpansionError: b'expanding `foo\': NameError("name \'x\' is not defined",)'

This version fails even at the repl.

Clojure can handle this though.

user=> (do (def x "Spam!") (defmacro foo [] `(println ~x)) (foo))
Spam!
nil

EDIT: looks like the do isn't required.

=> (setv x "Spam!") (defmacro foo [] `(print ~x)) (foo)
  File "<input>", line 1, column 48

  (setv x "Spam!") (defmacro foo [] `(print ~x)) (foo)
                                                 ^---^
HyMacroExpansionError: b'expanding `foo\': NameError("name \'x\' is not defined",)'

gilch avatar Sep 06 '17 06:09 gilch

That's pretty weird. x should indeed be in scope. By the way, running the file directly, instead of importing it, gives the same error.

looks like the do isn't required.

It shouldn't affect much of anything—import_buffer_to_hst already wraps its argument in do.

Kodiologist avatar Sep 06 '17 20:09 Kodiologist

Wait a minute, I didn't think this through. x is in fact not in scope in the examples that don't work. Take, for example,

(setv x "Spam!")

(defmacro foo []
  `(print ~x))

(foo)

If the macro were defined to expand to '(print x), it would work fine. But the ~ causes x to be evaluated at compile-time, not runtime, and x isn't defined at compile-time, because (setv x "Spam!") hasn't been run yet.

The reason the REPL example works (when the statements are entered separately) is that it forces (setv x "Spam!") to be evaluated before (foo), contrary to the usual procedure of expanding macros before running any non-macro code.

So, everything seems to be correct here.

Kodiologist avatar Sep 09 '17 15:09 Kodiologist

So how are we feeling about this?

Kodiologist avatar Sep 12 '17 00:09 Kodiologist

I don't think this is correct at all, considering how other Lisps work, hence the "bug" label. If this was the intended design, it's insane. I think we simply didn't notice since we mostly develop small programs at the repl.

It's very common and useful to design macros that delegate part of the expansion process to helper functions. Macros don't compose as naturally as functions do, so if you want to reuse macro expansion code, it's often best to put it in a shared helper function. Where are we supposed to put those if not in the same module?

gilch avatar Sep 12 '17 00:09 gilch

Delegating the expansion process to helper functions in the same module works fine, provided you only invoke the macro after the file is compiled, rather than inside the very same file:

$ echo >foo.hy '(defn f [] `3) (defmacro m [] (f))'
$ hy -c '(require foo) (print (foo.m))'      
3

Here, f exists by the time m is called. By contrast, the only way (defn f [] `3) (defmacro m [] (f)) (m) can work is if the defn is executed before (m) is expanded into (f). What semantics would you propose here? The way Hy currently works is that macro expansion happens during compilation, and only after a file is compiled is any regular code executed.

I've never figured out how these gory details work in other Lisps, so I can't speak for them.

Kodiologist avatar Sep 12 '17 00:09 Kodiologist

Delegating the expansion process to helper functions in the same module works fine, provided you only invoke the macro after the file is compiled, rather than inside the very same file [...] I've never figured out how these gory details work in other Lisps, so I can't speak for them.

Good to know, but Clojure doesn't have that restriction.

$ echo '(defn f [] `3) (defmacro m [] (f)) (println (m))' > foo.clj
$ clojure foo.clj
3

I don't think Hy needs to have it either.

What semantics would you propose here? The way Hy currently works is that macro expansion happens during compilation, and only after a file is complied is any regular code executed.

At minimum, compile the files incrementally, one top-level form at a time. You don't macroexpand the whole file at once. It's almost like putting each top-level form in its own module, but we'd reuse the same module globals dict for each step. This is basically what the repl does.

I'm not sure how to handle the case when a defmacro is not at top level, but I don't think you're supposed to do that in other Lisps anyway. They have macrolet for that. #900.

gilch avatar Sep 12 '17 02:09 gilch

At minimum, compile the files incrementally, one top-level form at a time.

That already happens. What you'd need is to execute the top-level forms incrementally, too, rather than compiling the whole file and then executing it.

Kodiologist avatar Sep 12 '17 02:09 Kodiologist

In any case, this seems to be a design question rather than a bug, so I'm retagging to complaint / disgust.

Kodiologist avatar Sep 13 '17 16:09 Kodiologist

Why do we even have _compile_time_hack if we can't use globals from the same file?

gilch avatar Sep 18 '17 21:09 gilch

You mean, what does it allow you to do? To call the macro in the same file it's defined in. The macro will work when called like this provided it doesn't rely on normal code in the same file (like defns) having already been executed.

By the way, I don't think the distinction between globals and locals is relevant here. You can't use what would be a local variable, because it isn't defined yet, either. For example, (defn f [] (setv x 1) (defmacro m [] x) (m)) raises NameError.

Kodiologist avatar Sep 19 '17 00:09 Kodiologist

I included an example in #1430 showing how to use eval-and-compile or eval-when-compile to make macro subroutines.

Kodiologist avatar Sep 20 '17 00:09 Kodiologist

Of course macro expansion occurs at compile time instead of runtime, that's the point of having macros in the first place.

gilch avatar Apr 20 '18 03:04 gilch

Indeed.

Kodiologist avatar Apr 20 '18 03:04 Kodiologist

N.B. You can use eval-and-compile for variables, too.

Kodiologist avatar Apr 20 '18 03:04 Kodiologist

I'll have to try that. It might be enough to bring #1328 back.

gilch avatar Apr 20 '18 03:04 gilch

Where are we with this; is it a desired feature? I think it's well worth having, so, if I put in a PR that does it, how will it go over?

brandonwillard avatar Oct 14 '18 17:10 brandonwillard

We all agree that it's desirable, all things being equal, but it seems to entail deep changes to Hy's compilation process that I think would be worse than the original problem.

As things are now, macro expansion happens at compile-time, and a whole module is compiled before any of it is run. This means that you can translate a file that calls macros into Python, or compile it into bytecode, and all the macros have been resolved by the time you have your Python code or bytecode. The Python or bytecode doesn't have the macro calls; it has the actual expanded code. I think that's cool. It means you can move some long-running computation that only needs to happen once to compile-time. It also means you can do arbitrarily fancy things with macros and, if you're careful, still produce Python code that doesn't depend on Hy at all.

If we instead move to an incrementally compiling system, then the produced Python code or bytecode, in order to be equivalent to the original Hy, would need to keep all the macro calls, wouldn't it? I think that's what other Lisps do when you compile a file.

Kodiologist avatar Oct 14 '18 17:10 Kodiologist

I don't think this functionality necessitates incremental compilation, but it would make it easy to implement (or automatically available) by allowing macros to behave like the basic Python functions with which they're implemented (e.g. https://github.com/hylang/hy/issues/1689).

In a sense, incremental compilation could be similar to an always-on eval-and-compile, and, currently, eval-and-compile doesn't appear to break any of the functionality you described.

Also, I don't see how incremental evaluation would change, or require changes to, the generated AST and resulting bytecode. The AST is evaluated, via Python, incrementally (I've been using the term "incrementally" to also mean "evaluated as Python would"), so, if anything, it would bring the compile and run-time behavior closer together. As a result, there would be fewer surprises and debugging would be drastically better.

Perhaps I'm missing something, though.

brandonwillard avatar Oct 14 '18 20:10 brandonwillard

Wrapping the whole file in eval-and-compile doesn't work; you'd need to put each top-level form in its own eval-and-compile instead.

It may well be at this point that your understanding of Hy's compilation process is beyond mine (because you've been working with it so intensely recently, whereas I've cooled it with the PRs since I started at Mount Sinai). Let's consider an example to make some of the complexities here concrete. I'll call it sandwich.hy:

(print "Top")
(setv x [1])

(defmacro m []
  (import time)
  (time.sleep 3)
  (.append x 2)
  x)

(print (m))
(print x)
(print "Bottom")

Currently, this program doesn't work (because m can't see x). What if we wrap each top-level form in an eval-and-compile? This yields wrap.hy:

(eval-and-compile (print "Top"))
(eval-and-compile (setv x [1]))

(eval-and-compile (defmacro m []
  (import time)
  (time.sleep 3)
  (.append x 2)
  x))

(eval-and-compile (print (m)))
(eval-and-compile (print x))
(eval-and-compile (print "Bottom"))

hy wrap.hy prints:

Top
[1, 2]
[1, 2, 2]
Bottom
Top
[1, 2, 2]
[1]
Bottom

on its first run, including two sleeps of 3 seconds, and

Top
[1, 2, 2]
[1]
Bottom

on its second (using the bytecode), including no sleeps.

What do you think sandwich.hy ought to do on each of its first and second runs?

Kodiologist avatar Oct 15 '18 01:10 Kodiologist

Of course that wouldn't work; the code is evaluated too many times (and in different modules/namespaces) during the course of compilation and the superfluous eval of the resulting AST!

The analogy with eval-and-compile (of top-level forms) is only useful for demonstrating how evaluation at compile-time can be done and doesn't break functionality in any fundamental way; however, the incremental compiler we're talking about isn't as simple as that.

brandonwillard avatar Oct 15 '18 16:10 brandonwillard

Are you saying that sandwich.hy would still crash with your proposed changes? If so, how would it crash? I presume it wouldn't be the same as what currently happens, a NameError for x during the expansion of m, because allowing that kind of reference is the purpose of the change.

Kodiologist avatar Oct 15 '18 17:10 Kodiologist

No specific changes have been proposed, yet; I'm only discussing broad functionality and its implications.

Otherwise, relative to the subject of this issue, any changes would necessarily allow the macro in sandwich.hy to resolve x.
Also, I assume the expected output—in all cases—would be:

Top
[1 2]
[1 2]
Bottom

brandonwillard avatar Oct 15 '18 17:10 brandonwillard

In order for (print x) to print [1, 2], the macro would have to have been expanded recently enough for the (.append x 2) to still be in effect. That's the sort of thing that (so far as I can see) can't happen if bytecode has already had macros expanded. Instead of (print (m)), the effective code at that point will be (print [1 2]), so the bytecode doesn't even contain (.append x 2).

Kodiologist avatar Oct 15 '18 17:10 Kodiologist

Yes, currently, the compiler produces AST for (print (m)) reflecting (print [1 2]) and not something like (print (do (.append x 2) x)).

I don't see why we couldn't make it produce the latter, if desired, but I also don't see the issue with the current result (relative to this topic). How would it affect our ability to resolve x during macro expansion, or vice versa?

brandonwillard avatar Oct 15 '18 18:10 brandonwillard

I don't see why we couldn't make it produce the later, if desired

I assume you mean produce something like (print (do (import time) (time.sleep 3) (.append x 2) x)). That's why it's undesirable: the program needs to do the work of expanding macros (represented in this case by the sleep time) every time you run the program.

but I also don't see the issue with the current result (relative to this topic). How would it affect our ability to resolve x during macro expansion, or vice versa?

If m can see x, it can modify it. If m can modify x, later expressions, like (print x), can be affected by whether m was expanded during the current run. If (print x) can be affected by whether m was expanded during the current run, then either (print x) can print different things on the first run and the second run, or m needs to be re-expanded on every run. Does that make sense?

Kodiologist avatar Oct 15 '18 18:10 Kodiologist

That's why it's undesirable: the program needs to do the work of expanding macros (represented in this case by the sleep time) every time you run the program.

It seems like you're assuming that result is also necessary for this functionality, and I don't believe it is.

If m can see x, it can modify it. If m can modify x, later expressions, like (print x), can be affected by whether m was expanded during the current run.

Yes, this is true of all function calls with side effects—macros included. Is there something about accessing module-scoped variables in macros with side effects that's particularly problematic?

brandonwillard avatar Oct 15 '18 18:10 brandonwillard

The subsequent sentence states the two problematic possibilities: "either (print x) can print different things on the first run and the second run, or m needs to be re-expanded on every run".

If you can think of a way for m to see x without such things happening, I'm all ears.

Kodiologist avatar Oct 15 '18 18:10 Kodiologist

The subsequent sentence states the two problematic possibilities: "either (print x) can print different things on the first run and the second run, or m needs to be re-expanded on every run".

I don't think it's reasonable to burden Hy with a responsibility to save people from the potential confusions resulting from side effects (in macros or otherwise), but we could make it possible to produce the same results.

For instance, a decorator and/or macro that directs the compiler to not produce AST of the expanded results. It could be made to only affect the AST generation for that one macro or location and still use the expanded results of any macros it calls.

To illustrate, let's call such a macro eval-always. It would behave as follows:

;; Regular macro
(defmacro a [x]
  (+ 1 1 x))

;; Tell the compiler that this macro shouldn't fully expand its AST results
(eval-always
  (defmacro m []
    (a 1)))

;; Generates AST like `(print 3)`
(print (a 1))

;; Generates AST like `(print (+ 1 1 1))`
(print (m))

;; Generates AST like `(print (+ 1 1 1))`
(eval-always
  (print (a 1))

brandonwillard avatar Oct 15 '18 21:10 brandonwillard

Huh. So, it sounds like, of the two "problematic possibilities", you're taking the first ("(print x) can print different things on the first run and the second run"), and so you would have hy sandwich.hy print

 Top
 [1, 2]
 [1, 2]
 Bottom

, with the sleep, on its first run, but

 Top
 [1, 2]
 [1]
 Bottom

, without the sleep, on its second run. (Taking possibility 2, expanding macros on every run, would print [1, 2] twice on both the first and second run. Which is what you previously said should be expected.) Is that correct?

Kodiologist avatar Oct 15 '18 21:10 Kodiologist