refactor-nrepl
refactor-nrepl copied to clipboard
Outline work needed for cljs and cljc support
I'd like to use this issue to start a discussion on what needs to be done in order to add support for cljs and cljc.
Most of what I consider to be the important functionality involves the analyzer and is unavailable outside of clj files.
I'm positive others would like to improve this state too, but for someone who's not familiar with the project trying to tackle these issues seems a bit hopeless.
It would be helpful to start outlining what the current problems are, what solutions could work and lay a path for people who are interested in working on this but don't know where to start.
Some questions I currently have in mind:
-
Why does the current approach cannot work for cljs? This is not specific to refactor-nrepl, but some pointers on how exactly does the analysis process differ between clj and cljs would make a good resource.
-
In principle, are all the existing middlewares applicable to a cljs ast? It is unclear if there are platform-specific blockers that would prevent this from happening.
-
What are some potential solutions for implementing the existing functionality in cljs? Some high level on how the cljs runtime could integrate with the rest of the middleware.
-
Are there any potential problems with having to deal with two distinct analysis results?
Cheers!
awesome you opened this. let me collect my thoughts and previous discussions around this before answering in detail
- What strategies could be used for testing?
How do we prevent regressions and assert that any given feature works in all three contexts? There seems to be a strategy in place already with
rest/resources
but it's not clear whether this would scale? Also the current model could use some documentation if there is indeed a method to it.
Just after a bit of analysis the list of refactor-nrepl
functions using ASTs and the list of cljr features using these refactor-nrepl
functions.
-
refactor-nrepl.find.find-locals
- promote fn
- extract function
- inline symbol
- find usages/rename symbol
- create-fn from example
- change fn signature
-
refactor-nrepl.find.find-symbol
- find local and global sybmol
- rename local and global symbol
-
refactor-nrepl.find.find-used-publics
- replace refer all
and now your questions
Why does the current approach cannot work for cljs? This is not specific to refactor-nrepl, but some pointers on how exactly does the analysis process differ between clj and cljs would make a good resource.
Essentially the current approach does not work for cljs because the analyzer we've choosen (clojure contrib analyzer: org.clojure/tools.analyzer) does not have a working cljs/js implementation. The existing cljs/js implementation has been abandoned due to lack of time to keep up with cljs, see https://github.com/clojure/tools.analyzer.js#this-project-has-been-abandoned.
So as simple as that we decided to "buy" an analyzer solution and build features on top of it expecting the cljs part working too which is not the case unfortunately.
There is one exception, clean-ns which uses its own means to gather information about the namespace it works on. It does not use the AST and supports clj, cljs and cljc.
This answers your 2nd question too I suppose. In the context of clojure.contrib analyzer there is no such thing as cljs AST at the moment unfortunately.
What are some potential solutions for implementing the existing functionality in cljs? Some high level on how the cljs runtime could integrate with the rest of the middleware.
In short find an other solution the get the necessary information about the clj/cljs code we work with. Apart from the missing cljs story of the currently used analyzer there are other problems with it. A good discussion of these problems by some much brighter ppl than myself can be found here: #134.
The above discussion touches on a load of things so perhaps a very careful thinking over all those points raised could be benefitial. Might raise new questions etc.
(I guess it would be also possible to pick up the maintenance of org.clojre/tools.analyzer.js
No idea how much work is that but I am guessing quite a big amount. Also because of the other problems of the tools.analyzer in the context of tooling I would not prefer this way.)
A bit about options on potential solutions
One option is to run our own minimalistic analyzer. clean-ns
does this in a very very limited sense. It collects all symbols in a file using tools.reader
, see details in the refactor-nrepl.find.symbols-in-file
namespace. We could go down this route and build our own minimal ASTs collecting only the info we need. This is in a way what cursive does.
As the cljs story is really getting more and more interesting these days we might have a look on self hosted cljs repls like lumo and planck if they could help us out for cljs.
Also joker looks very interesting. If I understand the impliciations there it should support both clj and cljs. Also as implemented in go
it does not need a running REPL. On the flip side tho there are limitations and it is mentioned in the project's non-goals section that feature parity with clojure is not in scope of the project. this may bite us later if we go down this way. we should also consider the fact that going this way would break the architecture compatibility between cider and clj-refactor/refactor-nrepl
. this could be a good thing in the sense if this works out that might be a way for cider to go as well in the future (talking about a really huge rewrite now). but also it would very likely defeat my goal of the past year of merging cljr gradually into cider.
disclaimer: although i am very excited to do a POC with joker i have very limited time nowadays to hack on these things unfortunately.
disclaimer: this is more like a vision like thing than anything else and mainly mine. other people in the team and around all things clojure in emacs may not agree with this last section.
i appreciate i might not answered all your questions but hopefully gave some food for thoughts. let me know what you think and if this in any way helpful or just even more confusing...
@benedekfazekas thanks fot the great response.
Please allow me to not address your points right away, because I realize there are some big gaps in my understanding of the various analyzer toolchains.
I was under the impression we'd have no choice but to use https://github.com/clojure/clojurescript/blob/master/src/main/clojure/cljs/analyzer/api.cljc or maybe https://github.com/clojure/clojurescript/blob/master/src/main/clojure/cljs/analyzer.cljc
I found an old post that does clarify some of the goals of tools.analyzer here and puts thnigs in perspective.
It is not clear to me whether using the cljs analyzer directly would be possible. I guess this is a downside of the data-driven approach right here: I cannot look at the refactor-nrepl codebase and easily determine what analyzer functionality it depends on. There is just one or two calls into the library but the rest of the project depends on specific ast keys.
I think what would go a long way towards determining the right path forward would be to have an exhaustive reference of the ast data that's actually used throughout the project. Then we could look at the outputs of different alternatives and easily determine what's different and/or missing. Am I correct that this would help?
If so, maybe a test namespace where each middleware would have to assert against an analysis pass checking for its dependencies, or maybe better, a set of specs for the ast?
Edit: I'm going off another assumption here, which is that with some undetermined amount of shims, we'd be able to build upon the clojurescript analyzer directly and recreate the ast that refactor-nrepl currently requires. I now realize sounds a lot like rewriting a subset of tools.analyzer.js, and without familiarity with those tools it's unclear how much work this represents.
I think what would go a long way towards determining the right path forward would be to have an exhaustive reference of the ast data that's actually used throughout the project. Then we could look at the outputs of different alternatives and easily determine what's different and/or missing. Am I correct that this would help?
I don't think so. The AST is the API. That was part of the reason in choosing tools.analyzer to begin with, that we could get a general AST (whether the source was clj, cljs or cljc) and then do work on that AST.
If that's no longer viable, we should look into replicating the capabilities afforded by tools.analyzer. @benedekfazekas listed those above.
I'm not aware of another library that does analysis, like tools.analyzer, but we might be able to get the same capabilities by running the cljs compiler.
This takes us to point 4. in your OP:
Are there any potential problems with having to deal with two distinct analysis results?
No, there isn't. And this is starting to look like the only way forward. The lack of progress is mostly because neither @benedekfazekas or myself write cljs in our day to day job and nobody else has shown any interest in helping us out (until now? :))
I don't think so. The AST is the API. That was part of the reason in choosing tools.analyzer to begin with, that we could get a general AST (whether the source was clj, cljs or cljc) and then do work on that AST.
Right, the point I was trying to make is: if I run a cljc file through both tools.analyzer.jvm
and clojurescript.analyzer
I'll get different asts even if the file has no reader conditionals. How different is the big question here, and I haven't checked yet, but I'm guessing substantially different given how much code there is in tools.analyzer.js
.
So imagine we were to diff those two asts and make fixing those discrepancies our todo list, we potentially could ignore a bunch of those if we limited ourselves to what refactor-nrep uses, the API would then become a subset of the tools.analyzer AST.
Does that make any sense or am I missing something?
I think you're right that the ASTs will be nothing alike, and that's fine. Even if it was possible to find common ground in the two ASTs and normalize them into some more generalized thing, I don't think that would be a good idea. We'd have to track changes in both projects to keep our normalization step in sync.
I think a much better approach would be to just add a new code path to e.g. find-symbol
which uses clojurescript.analyzer
, or something else, to find symbol occurrences.
If you want to talk in real-time, you can find us both on gitter. I'm around on freenode and I think @benedekfazekas still hangs out on slack (in the clojure channel)
So now, what about: https://github.com/clojure/jvm.tools.analyzer#usage-clojurescript?
The landscape is even more confusing than I initially though, thinking out loud here in hopes that you guys can correct me if I'm missing something:
-
tools.analyzer
is an umbrella project that was aiming to provide a consistent, platform agnostic ast. What I didn't know is that it seems to re-implement the compilation process of the target. Which actually would explain why I sometimes get weird analysis error messages for code that seems to compile just fine. -
tools.analyzer.jvm
is the clojure backendrefactor-nrepl
currently uses. -
tools.analyzer.js
would have been our best option if it were maintained since it's another backend fortools.analyzer
. However it's no wonder it's been abandoned if they had to re-implement the compiler given how fast clojurescript changes. -
clojurescript.analyzer
is a possible integration target, though it will require multiplying code paths inrefactor-nrepl
and might be difficult to support across clojurescript version? -
jvm.tools.analyzer
is yet another analyzer project that we haven't mentioned yet. Despite its name it seems to also support cljs and uses both the regular clojure compiler andclojurescript.analyzer
.
At first glance jvm.tools.analyzer
seems interesting since it's the only project that could give us a unified api? There are various gotchas listed in the readme, like the analysis causing code to be eval'd, which I think relates to #134 mentioned earlier, and I'm not sure how different its output is from tools.analyzer.jvm
.
Your concerns for clojurescript.analyzer
would be true for jvm.tools.analyzer
as well, since that's just wrapping clojurescript.analyzer
in addition to the jvm clojure compiler.
For that reason, we might as well use jvm.tools.analyzer
to access clojurescript.analyzer
.
If you want to get involved @julienfantin, I suggest the following route forward:
- Write new code to replicate the current clj functionality for cljs, using
jvm.tools.analyzer
. - Once we're feature complete, investigate what it would take to use the newly written code to replace the old code (differences in the ASTs, caching / other performance concerns)
- Throw out
tools.analyzer
.
I don't think we should care about duplicating code while investigating. I'd rather have some extra stuff, than break the currently working clj features while adding cljs support.
It's also too early to worry about supporting multiple versions of cljs. Tracking the latest major release is miles better than no support at all. Most likely the cljs analyzer isn't moving very fast, even if cljs itself is!
Ok that sounds like a plan!
I do have a couple of remaining questions and a comment:
-
I'm not familiar with the project's architecture and design so when you talk about duplicating code, I'm not sure what you have in mind and where this duplication would start. Would it be entirely on the clojure side, say something like modifying an operation's nrepl message handler to branch between calling one, the other, or both analysis processes depending on the file-type? Or did you mean defining separate refactorings all the way to the emacs client side?
-
Something we haven't discussed yet, but that sounds like a good first step to me, would be to ensure that we can make a jvm analysis pass on cljc files while ignoring all the cljs reader conditionals. This is certainly not something we would want to release since it could cause all sorts of issues post-refactoring (like stale symbols in cljs files and so on), but it'd help make refactor-nrepl useful for people working in cljc codebases and willing to get involved with an in-progress fork/PR. Are there any reasons besides the one I mentioned why we wouldn't want to do that? If not could you expand on what would need to happen for this to work?
Lastly I'll say that I have very limited yak-shaving time over the next few months, so progress on this will be slow.
- Let's say you want to start with adding support for cljs for the refactorings that use
find-used-locals
. This would enable quite a few refactorings (see the list earlier in the thread). You'd basically branch here which checks if the file is a cljs file, and create two new functions:
- One containing the existing clojure code, which we're leaving completely unchanged
- A new function which contains all sorts of wonderful and magical new code to make this all work for cljs, using
jvm.tools.analyzer
.
- This sounds like a valuable step to do during research / early implementation. We've been using snapshots to dogfood new features, so as soon as something is in a reasonable state, with some benefit, releasing it to early adopters is cool.
@benedekfazekas is planning to cut a stable release sometime later this month, which makes it easy for people to opt-out of any alpha-quality features that are coming to cljs.
Lastly I'll say that I have very limited yak-shaving time over the next few months, so progress on this will be slow.
Slow is infinitely better than nothing!
And there's the option of writing a simplified portable parser - something that won't really try to expand macros and so on. On having more of the refactorings depend on just the loaded code instead of analyzing all the source code.
In general, there's also the bigger question of how to better support ClojureScript in CIDER as well. Most of the limitations come from the stagnation in nREPL's development - in theory if first call support for ClojureScript was added there instead of relying on piggieback a lot of things would become simpler. Also in theory - supporting some evaluation backend that nREPL would simplify the ClojureScript situation a lot. But that's also a lot of work that no one really wants/has the time to tackle.
I like that fact that there is a plan formulating how attack this. and agree with @expez that find-used-locals
is a nice candidate for POCs. are you guys sure about jvm.tools.analyzer
tho? it seems to be an inactive and out of date project to me.. last commit with some real content is 8 months old... am I missing something?
Guess you can always ping the maintainer about its state.
also there are several self hosted cljs options now. they might worth a check.
it seems to be an inactive and out of date project to me.. last commit with some real content is 8 months old... am I missing something?
It's just a wrapper around the clojure and cljs compiler, so unless the API of those two change, the wrapper doesn't have to be updated.
ah right. thanks for clarification @expez
Unfortunately haven't had time to work on this at all, but I did gain some insight on what will need to happen, so I'll leave some notes here, if not only for myself:
Add support for end-column
and end-line
in the ClojureScript analyzer
The ClojureScript analyzer currently doesn't include end-line
and end-column
in the AST's :env
, but refactor-nrepl
's own analyzer namespace relies on that information, notably for node-at-loc?.
I don't know that there is a way to replicate the existing logic around node-at-loc?
without being able to bound an ast node with a buffer position, and even if there were it would probably be non-trivial and prevent reusing analyzer logic between clj and cljs.
There is a simpler solution I think, which consist in writing a simple patch for the Clojurescript analyzer. This patch will need to replicate getting the line data from the environment for end-line
and end-column
and should take care of updating all the call sites (about a dozen or so) to propagate that information throughout the AST.
I mentioned this on the cljs-dev
slack channel and it seems like it would be fairly easy to get this merged.
Use clojure.tools.reader.reader-types/indexing-push-back-reader
jvm.tools.analyzer
uses a vanilla pusback reader, which I don't think would get us the end-line
and end-column
metadata that the patch described above would use. Fortunately the analyze-ns
function is quite small and easy to change for that purpose.
It is not clear at this point whether there is anything else in that project that we would use and would warrant patching the project rather than rolling our own wrapper for the cljs analyzer.
ClojureScript patch submitted: https://dev.clojure.org/jira/browse/CLJS-2051
awesome!
also there are several self hosted cljs options now. they might worth a check.
I have to say that when I was wondering how to do this as well I basically ended up considering the above the easiest approach. In particular, Antonio Monteiro has already ported the ClojureScript analyzer to self-host (node) and this would be a very valuable library to spin off.
I might be the only one here, but I think, and I am really trying hard in orchard
, that a better approach would be to have something that works with or without the JVM so that eventually it could be used for static analysis and therefore compete with Cursive in terms of functionality like jump to def and other things without waiting n seconds for a REPL. Purely personal battle I know :smile:
Spinning off a new analyzer library has so many upsides that I am considering doing it myself...maybe merging code from tools.analyzer.js
as well. However, who knows when I will have time :clock1:
I have also looked into https://github.com/thunknyc/crntl and I had an old idea of using universal-ctags
for static analysis -> https://github.com/thunknyc/crntl/issues/9. All well and good but no POC so far :disappointed:.
By the way this is a great discussion so thanks @julienfantin!
The conforming work of the ClojureScript analyzer AST to the tools.analyzer spec which shipped as part of ClojureScript 1.10.439 doesn't help here?
https://dev.clojure.org/jira/browse/CLJS-1461 https://github.com/clojure/clojurescript/blob/master/changes.md#110439
The conforming work of the ClojureScript analyzer AST to the tools.analyzer spec which shipped as part of ClojureScript 1.10.439 doesn't help here?
Think that patch was news to all of us :) It should be largely plug and play, I think!
yeah sounds like it. good shout @kommen. we discussed this with @vemv in the #cider room on clj slack.
while playing with this, realised that https://dev.clojure.org/jira/browse/CLJS-2051 actually never got merged
after working around the missing end-line
, end-column
information the next hurdle seems to be that the stages of macro expansion are not persisted in the cljs AST, there is no raw-forms
tag. the code expects this tag to be present to be able to look at the original form in case macro expansion happened.
see POC for find-locals
support cljs on cljs-support
branch. thoughts?
I like the POC :smile:! Also that piece of functionality could be a very good candidate for the orchard
!
@arichiardi Although maybe we can achieve pretty much the same result using rewrite-clj
and just ignoring some of the complexity that goes alongside tools.jvm.analyzer
and ClojureScript's own analyzer. Looking at projects like clj-kondo it seems you can go pretty far with rewrite-clj
and without having to evaluate any code to build the AST.