Improved Macro and Macro.Env functions for language servers
In order for language servers to mimic functionality found in the Elixir compiler, we want to expose more compiler facing functionality (as per https://github.com/elixir-lang/elixir/issues/12645#issuecomment-1954743398).
There are at least 5 functions necessary:
- [x] Macro expansion - for non-qualified calls, we need to consider imports or local macros. For remote calls, they must have been required before. Currently Macro.expand/2 can perform this, but it will also perform tracing and additional functionality, which may not be desired. Language servers may also want to add special annotations to the AST so it can distinguish generated nodes from non-generated ones. We still need to explore which will be the best approach here, either a function in Macro or Macro.Env.
- [ ] Reading and writing variables - whenever a variable is defined, we must store it in Macro.Env. Currently there is no API for adding variables to Macro.Env, only reading them. However, we also know that variables may be defined anywhere in a pattern. We could add a function that traverses a pattern and updates Macro.Env with all variables, but it is most likely that language servers need to traverse the patterns as well, to expand macros and collect variable definitions, so I'd say that providing a function such as Macro.Env.put_var/2 is enough. Another topic in relation to variables is the variable context. For a variable of shape {name, meta, context}, the context is actually Keyword.get(meta, :counter, context). Today Macro.Env already expects the actual context but we may want to encapsulate that. Another topic is in relation to the variable version. Macro.Env does not have a field so we can properly version variables, we may need to change that (as they will likely play a role in the type system).
- [x] Requires - requires are the simplest to implement. Macro.Env.store_require(env, meta, module) should be all that is necessary. The meta is required to handle the :defined annotation. Returns {:ok, env} or {:error, reason}. We use store_ instead of put_ since it returns ok/error.
- [x] Aliases - aliases are relatively straight-forward too. Two functions will be made available: Macro.Env.store_alias(env, meta, module), where the alias is inferred, and Macro.Env.store_alias(env, meta, module, alias). Return {:ok, env} or {:error, reason}.
- [ ] Imports - imports are done as Macro.Env.store_import(env, meta, module, opts \\ []). Returns {:ok, env} or {:error, reason}.
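The variable item in the list above can be sketched with existing public APIs only. This is a minimal illustration, not the compiler's implementation: PatternVars and collect/1 are hypothetical names, and the proposed Macro.Env.put_var/2 is deliberately not called because it is only a suggestion in this issue. The sketch walks a quoted pattern with Macro.prewalk/3 and computes each variable's effective context as Keyword.get(meta, :counter, context).

```elixir
defmodule PatternVars do
  # Hypothetical helper, not part of Elixir: returns {name, context} pairs
  # for every variable bound by a quoted pattern.
  def collect(pattern) do
    {_ast, vars} =
      Macro.prewalk(pattern, [], fn
        # A pinned variable (^x) is a read, not a definition: drop the subtree.
        {:^, _meta, [_pinned]}, acc ->
          {nil, acc}

        # The underscore never binds anything.
        {:_, _meta, context} = node, acc when is_atom(context) ->
          {node, acc}

        # A variable node has the shape {name, meta, context}, with an atom context.
        # The effective context is the :counter in meta, when present.
        {name, meta, context} = node, acc when is_atom(name) and is_atom(context) ->
          {node, [{name, Keyword.get(meta, :counter, context)} | acc]}

        node, acc ->
          {node, acc}
      end)

    vars |> Enum.reverse() |> Enum.uniq()
  end
end

pattern = quote do: {:ok, %{id: id, tags: [first | _]}, ^expected}

# The collected names are id and first; expected is pinned and _ is ignored.
vars = PatternVars.collect(pattern)
```

A real language server would also have to expand macros inside the pattern before collecting, and this sketch would misclassify special forms such as __MODULE__ as variables.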
I can work on this, unless you planned on doing it yourself.
Thanks for putting this together 💪.
Also, we might need a function for module attributes.
Module attributes are not part of the compiler. They are handled by the @ macro and it mostly uses public APIs from Module which you can leverage. The challenge will be exactly in compiling a module just enough so functionality like module attributes work, while storing AST information. One potential idea is to execute the code as usual, but replace some of the Kernel macros by LanguageServer.Kernel macros, which store additional information, and then fall back to @.
```elixir
defmodule LS.Kernel do
  defmacro defmodule(name, do: block) do
    # augmented version of defmodule
    # for example, you can augment the block by adding
    # some lines that fetch relevant information from the
    # module and then raises (so the module is not effectively
    # defined).
  end

  defmacro @(expr) do
    # preprocess and store relevant information
    quote do
      Kernel.@(unquote(expr))
    end
  end
end
```
The best way to collect this information is definitely open to debate and we may need new functionality. @lukaszsamson, how does elixir_sense handle this? Does it rely exclusively on pre-traversal of the AST? Or does it still execute the module body?
I can work on this, unless you planned on doing it yourself.
Thank you for the offer. I'd like to tackle this one, I will use it as an opportunity to refactor parts of the compiler. I will put it as high priority on my list. :)
One potential idea is to execute the code as usual, but replace some of the Kernel macros by LanguageServer.Kernel macros, which store additional information
That's a neat idea. Two questions here. Where should those augmented macros reside: the Elixir stdlib or LSPs?
how does elixir_sense handle this? Does it rely exclusively on pre-traversal of the AST? Or does it still execute the module body?
elixir_sense relies on AST traversal. It opens scopes on defmodule, def, etc. calls in the pre callback and closes them in the post callback.
The only place where some code is actually executed is in the use macro expander, and that currently introduces as many problems as it solves. It ties AST traversal to compilation of required modules, which leads to errors like:
(ArgumentError) could not call Module.get_attribute/2 because the module X is already compiled
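For reference, this class of error can be reproduced outside of elixir_sense with a few lines. This is a minimal sketch, where CompiledSample is a made-up module name: once defmodule returns, the module is closed, Module.open?/1 returns false, and Module.get_attribute/2 raises an ArgumentError.

```elixir
defmodule CompiledSample do
  @answer 42
  def answer, do: @answer
end

# After defmodule returns, the module is compiled and no longer open.
open? = Module.open?(CompiledSample)

result =
  try do
    Module.get_attribute(CompiledSample, :answer)
  rescue
    # The exact wording of the message may vary between Elixir versions.
    error in ArgumentError -> {:error, Exception.message(error)}
  end
```

A tool that reads attributes during traversal can check Module.open?/1 first and take a different path for modules that are already compiled.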
That's a neat idea. Two questions here. Where should those augmented macros reside: the Elixir stdlib or LSPs?
LSPs.
The only place where some code is actually executed is in the use macro expander, and that currently introduces as many problems as it solves. It ties AST traversal to compilation of required modules, which leads to errors like:
If you override @, you can implement your own reader and writer, and then they would work on use too.
Another option is to do something akin to this:
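```elixir
try do
  Process.put(:elixir_module_ast_to_expand, module_body)

  # Module.create/3 requires the location (or a Macro.Env) as its third argument
  Module.create(
    name,
    quote do
      LSP.expand_module()
    end,
    Macro.Env.location(__ENV__)
  )
catch
  :done ->
    Process.delete(:elixir_module_ast_to_expand)
    :ok
end
```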
where LSP.expand_module() will be something like this:
```elixir
def expand_module() do
  body = Process.delete(:elixir_module_ast_to_expand)
  # expand body as usual
  throw :done
end
```
The issue with this approach, however, is that if the module is being compiled at the same time, you will get errors. So you may need to add locks around compilation (which can slow everything down). Maybe we can add a feature to Elixir to allow some sort of module preview without conflicts. I will explore this a bit.
Second question related to the first approach. How would that monkey patching work with already compiled modules? Let's say we have a module with use Ecto.Schema and Ecto.Schema is already compiled. How would we make the macros called by schema dispatch to the overloaded ones?
Let's envision next steps as well. Suppose I wanted to build a dedicated LSP for phoenix. I'd need to override Router macros like scope et al.
Maybe a generic tracer that intercepts macro expansion would be an alternative, with an API like:
def on_macro_expand(ast, env, ast_expanded)
You are right. The overrides would only work for the immediate macros, so it only has limited use. We should probably scratch it for now.
Maybe a generic tracer that intercepts macro expansion would be an alternative, with an API like:
Unfortunately this can be used to introduce global modification of Elixir programs, so it is a no-no. For now, let's assume we will continue to perform manual expansion, but let's provide Module.draft(name, fn -> ... end) so you can mirror more module functionality, such as module attributes. WDYT?
Hi everyone, I have implemented almost all of the functionality here, except for variable handling. I will postpone this until we have more use cases, especially because variable handling may change in the future as we better integrate with types.
I have also expanded the scope of this feature: I believe it can also be used by those who want to build their own languages on top of Elixir. For example, Nx could be built on top of it instead of relying on the double-compile step it does today. To do this, however, we would need the variable handling AND an API to raise compiler errors with diagnostics (we already have one to emit warnings, via IO.warn/2).
This week I want to build a mini-compiler, which you can use to build the buffer environment and leave it here as a proof of concept. Then I will close this issue. :)
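As a rough illustration of the require, alias, and import building blocks, here is a minimal sketch. It assumes they are available as Macro.Env.define_require/4, Macro.Env.define_alias/4, and Macro.Env.define_import/4, which to my knowledge are the names used from Elixir 1.17 onwards; the store_* names at the top of this issue are the original proposal. Foo.Bar and Baz are placeholder names, and the line numbers in the metadata are arbitrary.

```elixir
# Assumption: Elixir 1.17 or later, where the Macro.Env.define_* functions exist.
env = __ENV__

# Require: afterwards, macros from Integer may be expanded in this env.
{:ok, env} = Macro.Env.define_require(env, [line: 1], Integer)
required? = Macro.Env.required?(env, Integer)

# Alias: Foo.Bar is a placeholder and does not need to exist.
{:ok, env} = Macro.Env.define_alias(env, [line: 2], Foo.Bar, as: Baz)
alias_result = Macro.Env.expand_alias(env, [], [:Baz])

# Import: List.flatten/1 becomes visible as a local call.
{:ok, env} = Macro.Env.define_import(env, [line: 3], List)
import_result = Macro.Env.lookup_import(env, {:flatten, 1})
```

Each call returns an updated env rather than mutating anything, so a tool can thread the environment through its own traversal and keep a separate env per scope.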
Amazing! I will try this out as soon as possible.
Here we go, this is a proof of concept showing how you can implement a mini-compiler for either Elixir (or even a sub-language) using the building blocks above: https://gist.github.com/josevalim/3007fdbc5d56d79f15adedf7821620f3
The example is focused on the language server use case and it has a lot of comments. There is a state variable which you can use to capture any AST information that you want. And it shows how you can intercept some macros (such as defmodule). You folks also know how to reach out to me if you have questions. Enjoy!
/cc @jonatanklosko @jackalcooper