vadr
Macro hygiene
For the purposes of this package, a macro is a function that quotes all of its arguments, operates on the quoted expressions only, never evaluating their values, and finally evaluates an expression in the parent frame. For example, bquote can be viewed as a macro (as it could effectively work by constructing a single expression to eval in the parent frame), but transform is not (as it evaluates its data frame argument).
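To make the definition concrete, here is a minimal sketch of a function that fits it. The name unless is hypothetical and not part of this package; it only quotes its arguments, builds one expression, and evaluates that expression in the parent frame.

```r
# Hypothetical macro, for illustration only (not part of vadr).
unless <- function(cond, body) {
  # Quote both arguments; neither is evaluated here.
  cond_expr <- substitute(cond)
  body_expr <- substitute(body)
  # Build a single expression out of the quoted pieces...
  expansion <- bquote(if (!.(cond_expr)) .(body_expr))
  # ...and evaluate it in the frame the macro was called from.
  eval(expansion, parent.frame())
}

x <- 1
unless(x > 5, x <- x + 10)  # the assignment happens in the caller's frame
x
```

Because the body is evaluated in the caller's frame, the assignment to x is visible after the call, which an ordinary function could not do without extra work.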
Several macros (e.g. bind) would like to use temporary values. The problem is to avoid contaminating the namespace within which macros are called.
There is an example of a macro with a temp variable that's built into the R interpreter: <-. When the left argument to <- is a call rather than a simple name, for instance x[3:5] <- 13:15, the R interpreter expands this to:
`*tmp*` <- x
x <- "[<-"(`*tmp*`, 3:5, value=13:15)
rm(`*tmp*`)
(Note that R also manages to return the rvalue after deleting temps; (x[3:5] <- 13:15) evaluates to c(13, 14, 15); so this is not a complete description of <-.) ammoc in this package addresses this need.
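As a rough sketch of what "return the rvalue after deleting temps" asks for: the helper first_of below is a hypothetical stand-in, written on the assumption that what is needed is a "reversed comma" that evaluates every argument in order and returns the first. It is not claimed to be how ammoc is implemented.

```r
# Hypothetical stand-in: evaluate all arguments in order, return the first.
first_of <- function(...) list(...)[[1]]

x <- 1:10
value <- 13:15
result <- first_of(value, {
  # Hand-written version of the expansion, using an ordinary temp name.
  tmp <- x
  x <- `[<-`(tmp, 3:5, value = value)
  rm(tmp)  # clean up; the block itself does not yield the rvalue
})
result  # the rvalue, even though cleanup ran last
```

The cleanup step runs last, yet the value of the whole expression is still the rvalue, which is the behaviour <- has and the three-line expansion lacks.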
In this case *tmp* is an example of a macro-internal variable. The expansion is unhygienic; not only can you pause execution and see that there's a *tmp* variable floating around, it will clobber anything you had assigned to *tmp*. It also does not work in case of nested expansion; as the R Language Definition explains, a somewhat different expansion applies to x[[1]][10] <- y.
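The clobbering is easy to observe. A small sketch follows; the function name demo_clobber is made up for this note, and it relies on the behaviour the R Language Definition documents, namely that an existing `*tmp*` is overwritten and deleted by a complex assignment.

```r
demo_clobber <- function() {
  `*tmp*` <- "precious"   # a user variable that happens to share the name
  x <- 1:10
  x[3:5] <- 13:15         # complex assignment uses and then removes `*tmp*`
  # Is our variable still defined in this frame?
  exists("*tmp*", inherits = FALSE)
}

demo_clobber()
```

The call reports that the variable is gone, even though the code never touched it explicitly.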
In general, I would like to avoid namespace clobbering problems.
Is this possible? The gtools package has a defmacro that uses a gensym, which (a) is slow and (b) requires inspecting the calling environment to see what names are in use. Further, I think it can break in case of nested macros (gensym may generate a symbol, then the expanded macro may contain a macro that generates the same symbol if the first one hasn't been assigned to yet). The problem is that not all symbols that would potentially conflict are defined at the time of expansion.
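To illustrate the nested-macro worry, here is a deliberately naive gensym that picks a name by checking what is currently bound in the target environment. naive_gensym is a hypothetical helper written for this note; it is not the gtools implementation.

```r
# Hypothetical, deliberately naive gensym: pick the first unbound name.
naive_gensym <- function(env, prefix = "tmp") {
  i <- 0
  repeat {
    candidate <- paste0(prefix, i)
    if (!exists(candidate, envir = env, inherits = FALSE)) {
      return(as.name(candidate))
    }
    i <- i + 1
  }
}

target <- new.env()
outer_sym <- naive_gensym(target)  # chosen at expansion time of the outer macro
inner_sym <- naive_gensym(target)  # inner macro expands before anything is assigned
identical(outer_sym, inner_sym)    # both picked the same name

# Only once the first symbol is actually bound does the check see it.
assign(as.character(outer_sym), 1, envir = target)
later_sym <- naive_gensym(target)
```

Both expansions choose the same name because the conflict only becomes visible once the first symbol has been assigned, which happens at run time rather than at expansion time.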
There is also the problem where you might want to use a function in the expansion of a macro which is overridden in the target environment.
Perhaps one could look at how syntax-case works in Scheme.
Namespace problems with function calls can be avoided completely if you template in the literal function objects rather than their names. This makes tracebacks rather hard to read though.
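A small sketch of the difference, using two hypothetical macros (add_by_name and add_by_value are invented for this note). One templates in the name sum, the other templates in the function object base::sum.

```r
# Expansion refers to the function by name: looked up in the target env.
add_by_name <- function(a, b) {
  expansion <- call("sum", substitute(a), substitute(b))
  eval(expansion, parent.frame())
}

# Expansion contains the function object itself: no lookup needed.
add_by_value <- function(a, b) {
  expansion <- as.call(list(base::sum, substitute(a), substitute(b)))
  eval(expansion, parent.frame())
}

caller <- function() {
  sum <- function(...) "clobbered"  # the target environment overrides sum
  x <- 1
  y <- 2
  list(by_name = add_by_name(x, y), by_value = add_by_value(x, y))
}

res <- caller()
res$by_name   # picks up the caller's override
res$by_value  # uses base::sum regardless
```

The cost is the one mentioned above: the call in a traceback now shows the deparsed function object instead of a readable name.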
If you use the idiom that every macro creates a lambda to do its work in, then including a gensym facility in template for local macro vars could work for a lot of situations, and would memoize since it depends only on local stuff.
Or not even a gensym: put the things you want to evaluate in the target env in inline function arguments.
So a hygienic R macro
stuff + syntactic_transform(other_stuff)
expands to
stuff + (function (x, y, z) {
computation_in_macro_context(x, y, z)
})(x=<expression_in_target_env>, y=<expression_in_target_env>, z=<expression_in_target_env>)
This should cover most of the hygiene use cases (lazy evaluation helps a lot here), but it's a tricky idiom to write and needs explanation. Can it be abstracted?
N.B. you want the function to be in the namespace where the macro is defined, so the expansion above isn't quite right.
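One way the idiom might look in runnable R, as a sketch only. The macro sq_sum and its inner worker are hypothetical names. The worker closure is created inside the macro, so its enclosing environment is the macro's, and the function object itself (not its source text) is spliced into the expansion.

```r
# Hypothetical macro using the "inline function" idiom.
sq_sum <- function(a, b) {
  # The worker's temporaries live in its own frame, not in the caller's.
  worker <- function(x, y) {
    tmp <- x + y
    tmp * tmp
  }
  # The caller's expressions become arguments; they are evaluated lazily
  # in the frame where the expansion is evaluated (the target env).
  expansion <- as.call(list(worker, substitute(a), substitute(b)))
  eval(expansion, parent.frame())
}

caller_fn <- function() {
  tmp <- "mine"               # same name as the macro's temporary
  out <- sq_sum(1 + 1, 3)
  list(out = out, tmp = tmp)
}

res2 <- caller_fn()
res2$out  # (2 + 3)^2
res2$tmp  # untouched
```

Splicing in the closure rather than a function (...) expression is what puts the function in the macro's namespace instead of the target's, at the cost of a less readable expansion.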
Another possibility: wrap the text of every arg in a literal closure that preserves the environment. The macro result is evaled in the env of the macro, while wrapped args recover the env of the body.
Any syntax picking apart that you want to do to the args themselves might make them unhygienic though.
It would be super sweet to have a way to preserve the wrapping when manipulating syntax. Perhaps wrap with a closure cloaked as an S3 obj?
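A sketch of the closure-wrapping idea, under the assumption that "wrapping" means replacing each quoted argument with a call to a thunk that remembers where the argument came from. wrap_arg and twice are hypothetical names, not functions in this package.

```r
# Hypothetical helper: turn a quoted arg into a call to a closure that
# evaluates the arg in its original environment.
wrap_arg <- function(expr, env) {
  force(expr)
  force(env)
  thunk <- function() eval(expr, env)
  as.call(list(thunk))  # a call with the closure itself in function position
}

# Hypothetical macro: the expansion runs in the macro's own frame, so its
# temporary `tmp` never reaches the caller.
twice <- function(arg) {
  here <- environment()
  target <- parent.frame()
  wrapped <- wrap_arg(substitute(arg), target)
  expansion <- bquote({
    tmp <- .(wrapped)
    tmp + tmp
  })
  eval(expansion, here)
}

caller_env_demo <- function() {
  tmp <- 10
  list(result = twice(tmp + 1), tmp = tmp)
}

res3 <- caller_env_demo()
res3$result  # the arg saw the caller's tmp, not the macro's
res3$tmp     # unchanged
```

Once wrapped, the argument is opaque: the macro can place it in the expansion but can no longer take its syntax apart, which is the limitation noted above.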
Simple test case: https://gist.github.com/manuel/2009588