Clojure(Script) support

Open emlyn opened this issue 6 years ago • 13 comments

Any possibility of adding Clojure/ClojureScript support?

emlyn avatar Aug 03 '19 00:08 emlyn

Clojure & ClojureScript aren’t at the top of our list right now, but we’re working on making it easier for third parties to add support for new languages to semantic. A good place to start would be by defining a tree-sitter parser for Clojure/ClojureScript; see for example the docs on adding support for new languages and the grammar development guide.

robrix avatar Aug 03 '19 15:08 robrix

Thanks for the response. I've had a look around and found a couple of implementations here and here. I suppose the next step would be to add one of these to haskell-tree-sitter?

emlyn avatar Aug 05 '19 13:08 emlyn

Yep!

robrix avatar Aug 09 '19 13:08 robrix

Since Clojure has macros, how will this work? It seems to me that for any language with macros that allow creation of new syntactic constructs (any Lisp dialect, Julia, Rust (?)) it would be far easier to use a client / server architecture (like I believe Kythe has) so that the AST generators that have already been written for those languages can be leveraged, rather than trying to write an incomplete version in Haskell.

Am I wrong here? I'm kind of curious because as a Dylan enthusiast I think it would be awesome to have this feature for our GitHub-hosted code. I decided to piggyback on this bug rather than ask separately for Dylan.

cgay avatar Nov 08 '19 20:11 cgay

@cgay It is still an open question how we’ll deal with languages that support truly arbitrary rewritings of their syntax (e.g. C with the preprocessor). Whatever we do, we’ll end up aiming for the happy path: we may not be able to understand every possible macro invocation, but we should at the very least be able to handle most real-world code, especially given that tree-sitter parsers continue parsing even in the presence of syntax errors. Note that many languages with macros don’t require special knowledge in their parser, e.g. tree-sitter-rust.
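To illustrate the two points above (macro calls need no special parser support, and parsing can continue past a syntax error), here is a minimal sketch in Python. It is a toy s-expression reader, not tree-sitter's algorithm and not semantic's code; the `Node` type and the `parse` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # "list", "atom", or "ERROR"
    text: str = ""                 # token text for atoms and error markers
    children: List["Node"] = field(default_factory=list)


def tokenize(src: str) -> List[str]:
    # Pad parentheses with spaces so a plain split() separates them.
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(src: str) -> List[Node]:
    """Parse every top-level form, recording errors instead of aborting."""
    tokens = tokenize(src)
    pos = 0

    def parse_form() -> Node:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == ")":
            # Stray closing paren: emit an ERROR node and keep going.
            return Node("ERROR", ")")
        if tok != "(":
            return Node("atom", tok)
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            children.append(parse_form())
        if pos == len(tokens):
            # Input ended before the list was closed.
            return Node("ERROR", "(", children)
        pos += 1  # consume the matching ")"
        return Node("list", "", children)

    forms = []
    while pos < len(tokens):
        forms.append(parse_form())
    return forms


# A macro call such as (when-let ...) parses as an ordinary list, and the
# stray ")" in the middle does not stop the following defn from parsing.
forms = parse("(when-let x (f)) ) (defn g (h))")
kinds = [form.kind for form in forms]
```

The reader never needs to know that `when-let` is a macro. Working out what the macro expands to is a separate problem, which is what the rest of this thread is about.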

Our roadmap going forward is not to write Haskell code to parse language grammars, but to generate per-language Haskell AST declarations from tree-sitter grammars. A client-server model a la Kythe is not desirable for architectural and legacy reasons: an approach with tree-sitter as the lingua franca is a lower maintenance burden than trying to corral N different language servers into a common vocabulary. See here and here for more discussion on why we’ve chosen a monolithic architecture.

patrickt avatar Nov 09 '19 05:11 patrickt

For Common Lisp I believe it's literally impossible to do a perfect job finding cross references unless you are the Common Lisp compiler, since arbitrary code can be run to generate new code at macro-expansion time.

Dylan (and I believe Scheme) is a little more amenable since its macro system is just a pattern matcher, but I think for Dylan you'd need to basically re-implement the macro expander [edit: and the module system] in Haskell to figure out what definitions are being referenced in the generated code.

cgay avatar Nov 09 '19 05:11 cgay

Can https://github.com/Engelberg/instaparse help?

dijonkitchen avatar Nov 12 '19 17:11 dijonkitchen

> For Common Lisp I believe it's literally impossible to do a perfect job finding cross references unless you are the Common Lisp compiler, since arbitrary code can be run to generate new code at macro-expansion time.

Yes, this is (an example of) why semantic does program analysis.

robrix avatar Dec 06 '19 17:12 robrix

> Yes, this is (an example of) why semantic does program analysis.

?

[edit] To clarify, I don't see how that addresses the problem I'm describing.

cgay avatar Dec 07 '19 17:12 cgay

We can do the same things as the compiler, including evaluating macros (subject to certain approximations).

robrix avatar Dec 09 '19 21:12 robrix

And when the macro uses data that's only available at run-time in order to generate the code, or generates different code for different Common Lisp implementations? :-)

Anyway, hopefully your approach will work for more popular languages like Rust, Clojure, and Julia, which also have robust syntax extension mechanisms.

cgay avatar Dec 10 '19 01:12 cgay

> And when the macro uses data that's only available at run-time in order to generate the code, or generates different code for different Common Lisp implementations? :-)

That’s when the approximations become relevant: we’ll generate a set of results.
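As a rough illustration of what "a set of results" could mean, here is a minimal sketch in Python. It is not semantic's implementation. The macro, the implementation names passed in, and the `analyze` function are all hypothetical, and a real analysis would have to approximate the unknown value rather than enumerate it.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set


def expand_define_reader(impl: str) -> Dict[str, str]:
    """Hypothetical macro: the body it generates depends on the host
    implementation, which the analyzer cannot know ahead of time."""
    if impl == "sbcl":
        return {"read-data": "sbcl-specific body"}
    return {"read-data": "portable body"}


def analyze(expand: Callable[[str], Dict[str, str]],
            candidates: Iterable[str]) -> Dict[str, Set[str]]:
    """Expand once per candidate value and union the outcomes, so each
    defined name maps to every body it might have."""
    results: Dict[str, Set[str]] = defaultdict(set)
    for value in candidates:
        for name, body in expand(value).items():
            results[name].add(body)
    return dict(results)


possible = analyze(expand_define_reader, ["sbcl", "ccl", "ecl"])
```

Here a reference to `read-data` resolves to two candidate definitions rather than one, which trades precision for coverage.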

robrix avatar Dec 10 '19 21:12 robrix

Note that meanwhile a static analyzer for Clojure appeared on the scene:

https://github.com/clj-kondo/clj-kondo/blob/master/analysis/README.md

This analyzer is used by tools like clojure-lsp to provide navigation and refactoring. This tool is available as a standalone command line utility as well (or can be compiled into a native library if that is useful from Haskell, but you might as well just shell out to it). It's also available as a JVM library.
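If you do shell out, consuming the result could look roughly like the Python sketch below. It assumes the analysis is emitted as JSON with `var-definitions` and `var-usages` lists carrying `filename`, `row`, `col`, `ns`/`to`, and `name` fields. Those key names, the `SAMPLE` data, and `build_xrefs` are assumptions for illustration, so check the README linked above for the actual invocation and output shape.

```python
import json
from collections import defaultdict

# Hand-written sample shaped like the assumed analysis output; this is
# not real tool output.
SAMPLE = """
{"analysis": {
  "var-definitions": [
    {"filename": "src/app/core.clj", "row": 3, "col": 1,
     "ns": "app.core", "name": "greet"}
  ],
  "var-usages": [
    {"filename": "src/app/main.clj", "row": 10, "col": 5,
     "from": "app.main", "to": "app.core", "name": "greet"},
    {"filename": "src/app/main.clj", "row": 12, "col": 3,
     "from": "app.main", "to": "clojure.core", "name": "println"}
  ]
}}
"""


def build_xrefs(analysis_json: str):
    """Index definitions by (namespace, name) and attach usage sites."""
    analysis = json.loads(analysis_json)["analysis"]
    definitions = {
        (d["ns"], d["name"]): (d["filename"], d["row"], d["col"])
        for d in analysis.get("var-definitions", [])
    }
    references = defaultdict(list)
    for usage in analysis.get("var-usages", []):
        key = (usage["to"], usage["name"])
        # Keep only usages of vars defined in the analyzed code.
        if key in definitions:
            references[key].append(
                (usage["filename"], usage["row"], usage["col"])
            )
    return definitions, dict(references)


definitions, references = build_xrefs(SAMPLE)
```

With the sample above, `definitions` holds the location of `app.core/greet` and `references` lists its single call site in `src/app/main.clj`. The `println` usage is dropped because its definition is outside the analyzed files.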

borkdude avatar May 02 '21 16:05 borkdude