
new code generation

Open drieks opened this issue 6 years ago • 8 comments

Hello Arrow Team,

I started a new AST-Parsing Library kotlinx.ast (https://github.com/kotlinx/ast) that can be used to read Kotlin source files into easy to use data classes (using antlr-kotlin and the official Kotlin language grammar).

I'm already using this library for code generation, in combination with kotlinpoet (https://github.com/square/kotlinpoet) to create the generated files.

Maybe it can be used in arrow-meta. Please let me know your requirements for code generation if you are interested in discussing this.

kotlinx.ast is currently JVM-only, but support for JavaScript and Native is planned. It should be easy to create a shared code generation module for arrow-meta that can be used inside a gradle plugin, in unit tests and even in a native code generation tool (using kotlin native).

@rachelcarmena, @raulraja: A starting point for discussion may be replacing kastree with kotlinx.ast (https://github.com/arrow-kt/arrow/tree/rr-meta-prototype-integration/modules/meta/arrow-meta-prototype/compiler-plugin/src/main/kotlin/kastree/ast)

drieks avatar Oct 21 '19 23:10 drieks

@drieks great library! I was not aware that it existed. We are currently using a modified version of kastree internally in Meta because its mutable visitors were not designed for node replacement. We find an isomorphism between a PSI element and the kastree tree so that we can mutate the kastree node with new sources that should be used instead of those of the node. That is our current usage of kastree.

What are the advantages of using kotlinx-ast or how do you envision it could be integrated in Arrow Meta?

For compatibility reasons, including support for IDE plugins, Arrow Meta uses PSI trees so we can benefit from the entire set of compiler tools and IDE APIs, such as the OpenApi for plugins, which works in terms of PSI.

We are not opposed to replacing anything in Meta, but we'd like to understand what the benefits of replacing our current AST are. Our current AST is multiplatform because it works in the analysis phase prior to descriptor resolution, and Meta also has access to IR and all codegen phases via a DSL to intercept and replace nodes in those trees.

raulraja avatar Oct 22 '19 11:10 raulraja

Hi @raulraja,

kotlinx.ast is a replacement for kastree. It is a new library and currently only parsing is supported. The difference is that kotlinx.ast uses ANTLR for parsing, not PSI. Therefore kotlinx.ast can also be used in Native and JavaScript projects. But because you already benefit from PSI, I think there is no reason to use kotlinx.ast.

How do you implement multiplatform support? As far as I know, kastree.psi and the Kotlin compiler itself are JVM-only. Or do you mean that you can parse multiplatform projects?

drieks avatar Oct 22 '19 20:10 drieks

The current Kotlin compiler has an intermediate IR format that is a tree structure for the different backends (JVM, JS, Native, etc.). We are betting on it to support multiplatform, but it's not complete yet. Arrow Meta works at the PSI level because all existing tooling is based on PSI, including the IDE, with which our compiler plugins share functions.

We currently expect this to happen at some point in the near future when IR is completed:

Source -> PSI -> Descriptors -> IrCodeGen -> [JVM, JS, Native]

I'm happy to open a path for kotlinx.ast if there is an isomorphism between a node and a KtElement. The entire set of IDE services works on KtElements and the PSI tree, so our public API should resemble the model that the Kotlin compiler and IDEA use, in order to reuse the vast set of tooling that already exists for PSI. One example is the PsiViewer IDEA plugin, which makes developing Meta plugins extremely easy: you can visually find the root node as the compiler sees it, and then the kind of node you are transforming it into on a tree rewrite.

What I find interesting about having control of parsing is being able to introduce changes to the grammar that are parsed as valid and that I can then rewrite into a PSI tree of KtElements known to the compiler, so I can alter the language syntax without changes to the compiler.

For example I could introduce this currently invalid Kotlin:

remote fun distributed(): IO<Unit> = TODO()

and rewrite it to what the valid Kotlin could be:

fun distributed(): IO<Unit> = clustered { ... }

Can we do that with kotlinx.ast? Is there an isomorphism between what it parses and a PSI element, like the one I found in kastree in the current implementation at:

https://github.com/arrow-kt/arrow/blob/rr-meta-prototype-integration/modules/meta/arrow-meta-prototype/compiler-plugin/src/main/kotlin/arrow/meta/quotes/Quote.kt#L246-L271

If we have the power to alter parsing at this level, so that we can turn new syntax into valid Kotlin before feeding it to the compiler, then that would be huge for Arrow Meta.
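
A toy, text-level sketch of the rewrite described above. It is only meant to show the shape of the transformation: it uses a regular expression instead of a parser or a PSI tree, `remote` and `clustered` are the hypothetical names from the example, and `desugarRemote` and `desugarSource` are names invented here. It only handles single-line functions with an expression body and no default parameter values.

```kotlin
// Matches: <indent>remote fun <signature> = <body>
// Text-based on purpose; a real implementation would work on a parse tree.
private val remoteFun = Regex("""^(\s*)remote\s+(fun\s+[^=]+?)\s*=\s*(.+)$""")

// Rewrites one line that uses the hypothetical `remote` modifier into plain
// Kotlin that wraps the original body in a hypothetical `clustered { ... }` call.
// Lines that do not match are returned unchanged.
fun desugarRemote(line: String): String {
    val match = remoteFun.find(line) ?: return line
    val (indent, signature, body) = match.destructured
    return "${indent}${signature} = clustered { $body }"
}

// Applies the rewrite to every line of a source text.
fun desugarSource(source: String): String =
    source.lines().joinToString("\n") { desugarRemote(it) }
```

Doing the same on a tree instead of on text is where an isomorphism between the parsed nodes and PSI elements would be needed.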

raulraja avatar Oct 23 '19 11:10 raulraja

It is possible to implement an isomorphism between PSI and kotlinx.ast. The library is written in a language-independent way, so it would also be possible to implement, for example, a Java AST parser. You can copy or extend the existing Kotlin implementation and add other keywords such as function modifiers.

There are currently two modules in kotlinx.ast:

  • common contains language independent code
  • kotlin is currently the only implemented language

If you want to implement the remote modifier from your example, you have to extend the ANTLR grammar. The modifier rule at line 755

modifier
    : (classModifier
    | memberModifier
    | visibilityModifier
    | functionModifier
    | propertyModifier
    | inheritanceModifier
    | parameterModifier
    | platformModifier) NL*
    ;

can be extended with a new line containing for example | arrowModifier.

But I'm not sure if this is the best way. Modifying the syntax of a language can prevent other existing tools from working on these files. A well-known example from the C++ world is Qt, which uses "signal" and "slot". As far as I know, there is some preprocessor magic that allows the C++ compiler to compile the code, and another (external) tool called moc generates additional code. Of course, Arrow could go a similar way, using a compiler plugin.

Lombok does exactly this for Java: you can add an @Getter annotation and the compiler plugin will generate the corresponding getter function.

Instead of using a new modifier, you can try to use an annotation to trigger the compiler plugin:

@remote
fun distributed(): IO<Unit> = TODO()

Or even a keyword-like function:

fun distributed(): IO<Unit> = generated()
// implemented for external tools as
fun <T> generated(): T = throw Exception()

Also passing arguments is possible:

fun distributed(): IO<Unit> = generated {
 // some code
}
// implemented for external tools as
fun <T> generated(callback: () -> T): T = throw Exception()

External tools can still parse the file because it is 100% Kotlin, and the compiler plugin can still apply its magic transformations.
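
A minimal compilable sketch of the two variants above. The `Remote` annotation and the `generated` placeholders are the hypothetical names from this comment, not an existing Arrow Meta API, and `String` stands in for `IO<Unit>` so the snippet has no dependencies. Without a plugin rewriting the call sites, the placeholders simply throw.

```kotlin
// Hypothetical marker annotation that a compiler plugin could look for.
@Target(AnnotationTarget.FUNCTION)
@Retention(AnnotationRetention.SOURCE)
annotation class Remote

// Placeholder bodies: valid Kotlin for external tools, intended to be replaced
// by the (hypothetical) compiler plugin before the code ever runs.
fun <T> generated(): T =
    throw NotImplementedError("generated() should have been replaced by the compiler plugin")

fun <T> generated(callback: () -> T): T =
    throw NotImplementedError("generated { ... } should have been replaced by the compiler plugin")

// Variant 1: an annotation triggers the plugin.
@Remote
fun distributed(): String = generated()

// Variant 2: the placeholder call itself is the trigger and carries user code.
fun distributedWithBody(): String = generated { "some code" }
```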

Back to kotlinx.ast: it contains

  • functionality to parse a file/string into an AST
  • a very simple AST representation (an Ast can be an AstNode (with children) or an AstTerminal containing text) (see Ast.kt)
  • a way to transform an AST into another AST, basically the following operations, defined for the tree structure stored in Ast:
    • map
    • flatMap
    • flatten
    • filter
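
To make the list above concrete, here is a small sketch of such a tree with the four operations. It is not the kotlinx.ast API: the type names mirror the ones mentioned above, but every field and function signature (`description`, `mapTree`, `flatMapTree`, `filterTree`) is an assumption made for illustration.

```kotlin
// Hypothetical tree model: a node with children, or a terminal holding text.
sealed class Ast { abstract val description: String }

data class AstTerminal(override val description: String, val text: String) : Ast()
data class AstNode(override val description: String, val children: List<Ast>) : Ast()

// flatten: every node of the tree in depth-first pre-order.
fun Ast.flatten(): List<Ast> = when (this) {
    is AstTerminal -> listOf(this)
    is AstNode -> listOf(this) + children.flatMap { it.flatten() }
}

// map: replace each node by exactly one node, children first.
fun Ast.mapTree(transform: (Ast) -> Ast): Ast = when (this) {
    is AstTerminal -> transform(this)
    is AstNode -> transform(copy(children = children.map { it.mapTree(transform) }))
}

// flatMap: replace each node by zero or more nodes, children first.
fun Ast.flatMapTree(transform: (Ast) -> List<Ast>): List<Ast> = when (this) {
    is AstTerminal -> transform(this)
    is AstNode -> transform(copy(children = children.flatMap { it.flatMapTree(transform) }))
}

// filter: drop descendants that fail the predicate (the root is always kept).
fun Ast.filterTree(predicate: (Ast) -> Boolean): Ast = when (this) {
    is AstTerminal -> this
    is AstNode -> copy(children = children.filter(predicate).map { it.filterTree(predicate) })
}
```

A typical use would be dropping whitespace terminals with `filterTree` before looking for declarations.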

The official Kotlin grammar produces a very large and hard-to-use AST. If you want to have a look, testdata contains:

  • files named *.kt.txt, which contain the Kotlin code to parse
  • files named *.raw.txt, which contain the raw AST

For example, Class.kt.txt contains 23 lines and Class.raw.txt contains 533 lines. The generic Ast in the file Class.summary.txt is much easier to handle and contains 37 lines. Klass is a small collection of language-independent data classes (extending the Ast interface). The most important type is KlassDeclaration. I would suggest implementing an isomorphism between PSI and KlassDeclaration.
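
What "isomorphism" means here can be sketched as a pair of conversions that undo each other. Everything in this snippet is hypothetical: `PsiLikeClass` and `KlassLike` are tiny stand-ins, not the real PSI classes or the real KlassDeclaration, and `Iso` is defined locally rather than taken from a library.

```kotlin
// A pair of total conversions between A and B.
class Iso<A, B>(val forward: (A) -> B, val backward: (B) -> A)

// Hypothetical stand-ins for a PSI class element and a Klass-style declaration.
data class PsiLikeClass(val name: String, val modifierList: List<String>)
data class KlassLike(val identifier: String, val modifiers: List<String>)

val classIso = Iso<PsiLikeClass, KlassLike>(
    forward = { psi -> KlassLike(identifier = psi.name, modifiers = psi.modifierList) },
    backward = { klass -> PsiLikeClass(name = klass.identifier, modifierList = klass.modifiers) }
)

// The two round-trip laws: converting there and back must lose nothing.
fun <A, B> Iso<A, B>.roundTripsFromA(a: A): Boolean = backward(forward(a)) == a
fun <A, B> Iso<A, B>.roundTripsFromB(b: B): Boolean = forward(backward(b)) == b
```

If either side stores information the other cannot represent (comments, whitespace, source offsets), the round trip fails and the mapping is only a one-way conversion.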

I think a very good way to implement arrow.meta can be a generic AST-transformation tool with "Plugins" containing rewriting rules. This would allow other users of arrow.meta to do powerful meta programming in their own projects by just adding a custom rule to the compiler classpath. The rewriting rules for @optics can be in a separate jar:

dependencies {
    compile "io.arrow-kt:arrow-optics:$arrow_version"
    compile "io.arrow-kt:arrow-syntax:$arrow_version"
    kapt    "io.arrow-kt:arrow-meta:$arrow_version" // depends on arrow-meta-framework and arrow-meta-rules
    //kapt    "io.arrow-kt:arrow-meta-framework:$arrow_version" // contains the generic compiler plugin
    //kapt    "io.arrow-kt:arrow-meta-rules:$arrow_version" // contains the rules used by arrow
}

If somebody just wants to use meta programming:

dependencies {
    compile "io.arrow-kt:arrow-optics:$arrow_version"
    compile "io.arrow-kt:arrow-syntax:$arrow_version"
    kapt    "io.arrow-kt:arrow-meta-framework:$arrow_version" // contains the generic compiler plugin
    kapt    "com.example:custom-meta-rules:1.2.3" // contains third party transformation rules
}

What do you think about such a framework? It would at least match my code generation requirements perfectly. Please let me know if I can help in any way.
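
A rough sketch of what such a rule-based framework could look like at its core. All names (`Node`, `RewriteRule`, `rewriteTree`, `generatedToClustered`) are invented for illustration, and the tree is deliberately minimal; a real framework would operate on a full AST and load rules from the compiler classpath.

```kotlin
// Hypothetical minimal tree node.
data class Node(val kind: String, val text: String = "", val children: List<Node> = emptyList())

// A rewriting rule returns a replacement node, or null when it does not apply.
typealias RewriteRule = (Node) -> Node?

// Applies the rules bottom-up; for each node the first matching rule wins.
fun rewriteTree(root: Node, rules: List<RewriteRule>): Node {
    val candidate = root.copy(children = root.children.map { rewriteTree(it, rules) })
    return rules.asSequence().mapNotNull { rule -> rule(candidate) }.firstOrNull() ?: candidate
}

// Example of a rule that could live in a separate jar: it swaps the
// `generated` placeholder call for a `clustered` call.
val generatedToClustered: RewriteRule = { node ->
    if (node.kind == "call" && node.text == "generated") node.copy(text = "clustered") else null
}
```

With this shape, the generic plugin only needs `rewriteTree`, while each rules artifact contributes a list of `RewriteRule` values.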

drieks avatar Oct 23 '19 20:10 drieks

I just found KEEP-87 / https://github.com/arrow-kt/kotlin/pull/6.

It is possible to extend kotlinx.ast to parse this syntax:

// Kotlin + KEEP-87
extension class WrapperSemigroup<A>(with val semigroup: Semigroup<A>) : Semigroup<Wrapper<A>> {
  override fun Wrapper<A>.combine(b: Wrapper<A>): Wrapper<A> = 
      Wrapper(this.value.combine(b.value))
}

and then convert it into this:

// Regular Kotlin 
class WrapperSemigroup<A>(val semigroup: Semigroup<A>) : Semigroup<Wrapper<A>> {
  override fun Wrapper<A>.combine(b: Wrapper<A>): Wrapper<A> = 
      with(semigroup) { Wrapper(this@combine.value.combine(b.value)) }
}

drieks avatar Oct 23 '19 21:10 drieks

@drieks If we are able to convert this AST to PSI, we could replace the default KotlinParserDefinition with a custom one, extending the grammar as we need while staying compatible with the rest of the compiler.

ShikaSD avatar Oct 23 '19 23:10 ShikaSD

Makes sense. Yes, a feature like this would be great. We would then need to figure out what needs to be changed in the IDE part to treat those as valid PSI elements.

raulraja avatar Oct 27 '19 16:10 raulraja

Are you aware that Kotlin 1.4 is changing its intermediary representation? (in order to unify it for cross platform) Keywords: Klib, FIR https://blog.jetbrains.com/kotlin/2019/12/what-to-expect-in-kotlin-1-4-and-beyond/

LifeIsStrange avatar Mar 04 '20 18:03 LifeIsStrange