gradle-clojure icon indicating copy to clipboard operation
gradle-clojure copied to clipboard

Added support for multiple source sets (issue #11) and basic support for Gradle incremental tasks (issue #10)

Open pbzdyl opened this issue 8 years ago • 17 comments

When Gradle detects there were no changes to the input files, it won't call Clojure compile task at all.

pbzdyl avatar Nov 24 '16 10:11 pbzdyl

Some more details regarding the implementation:

Gradle Java plugin by default defines main and test source sets and configures their defaults with the following values:

main {
  java.srcDirs = ['src/main/java']
  output {
    classes = "$buildDir/classes/main"
    resources = "$buildDir/resources/main"
  }
  compileClasspath = configuration.compile
  runtimeClasspath = compileClasspath + main.output
}

test {
  java.srcDirs = ['src/test/java']
  output {
    classes = "$buildDir/classes/test"
    resources = "$buildDir/resources/test"
  }
  compileClasspath = configurations.testCompile + configurations.compile + main.output
  runtimeClasspath = compileClasspath + test.output
}

Clojure plugin enhances the convention by adding following defaults:

main {
  clojure.srcDirs = ['src/main/clojure']
}

test {
  clojure.srcDirs = ['src/test/clojure']
}

Clojure plugin also uses classpaths configured by Java plugin convention for compilation and runnings test:

  • compileClojure will use main.compileClasspath + main.clojure.srcDirs to setup Clojure compile process classpath
  • compileTestClojure will use test.compileClasspath + test.clojure.srcDirs to setup Clojure compile process classpath
  • testClojure will use test.runtimeClasspath for running tests (it will include all compile and testCompile dependencies as well as main and test output files)

This setup has the following consequences:

  • incremental changes detection will work correctly (both in Java and Clojure plugins)
  • with the standard set of source sets (just main and test) Clojure code under src/(main|test)/clojure won't see Java classes from src/(main|test)/java
  • to access Java classes in Clojure code (or vice versa) user has to put them into separate source sets and configure their classpaths accordingly (I will add examples to the README once we agree on this solution)

pbzdyl avatar Nov 28 '16 10:11 pbzdyl

I haven't tested this thoroughly yet, but wanted to get back to you--sorry it has taken so long!

I believe this goes a long way towards solving #11. My basic experiments are showing that it's working, but I need to try something a little more complicated before saying it's working well.

jszakmeister avatar Jan 10 '17 12:01 jszakmeister

Oh, and the incremental compilation is nice--in that it doesn't re-compile if it doesn't have to. However, it's not clear to me whether it's actually doing it incrementally (only compiling the files that changed). Is there a way to see what Gradle is doing under the hood?

jszakmeister avatar Jan 10 '17 12:01 jszakmeister

@jszakmeister Thank you for taking time to test my changes.

Incremental compilation in gradle-clojure plugin works in a very basic mode: if anything changed in input sources (e.g. dependencies, source set's source files etc.) all source set's sources will be recompiled (as it would without my change). But if there are no changes detected by Gradle, then it won't call Clojure compile task at all.

Implementing smart incremental recompilation would be much more complicated (detecting which generated class files should be deleted, which source files should be recompiled due to their changes as well as inputs they depend on etc.) and if really needed can be added later.

pbzdyl avatar Jan 10 '17 13:01 pbzdyl

@pbzdyl That makes sense. Though don't we have the class file problem now? Or is the area snapshotted and then compared to after compilation occurs to know what to delete before re-compiling? Or does it just re-compile the files and there may be cruft in the class files area? Let's say I removed a .clj file and re-compiled (without cleaning), would the .class file associated with the .clj be in the class files area still or is Gradle doing something here to know that it needed to be removed?

Either way, having it not recompile Clojure files when they haven't changed is useful.

jszakmeister avatar Jan 10 '17 13:01 jszakmeister

@jszakmeister When a source file is removed Gradle should detect it as a change in input parameters to a compile task and call it. Clojure compile task always cleans its output directory before compilation so it will always generate consistent set of class files.

pbzdyl avatar Jan 10 '17 13:01 pbzdyl

@pbzdyl Yeah, so that's my concern. If I'm compiling both Java and Clojure in the same source set, the Clojure compilation task is going to delete the class files from the Java compilation. I don't think this a change you introduced--I think the issue has been there all along.

So here's some results of my testing.

Without your changes

  • Java class files are removed when compiling Clojure sources. This seems bad. You end up with a ClassNotFoundException when compiling a Clojure file that actually depends on one of those Java files though, and there's no way to break the cycle.
  • Removing a Java source file results in the class file being removed. I'm not sure how this happens, but it does (which is really nice!)

With your changes

  • gradle compileJava followed by gradle compileClojure from a clean build still shows the Java files being removed. Doing the same sequence again (without cleaning) will finally result in the a full set of files. You end up with a ClassNotFoundException when compiling a Clojure file that actually depends on one of those Java files though, and there's no way to break the cycle.
  • Removing a Java source files causes the resultant class file to be removed.
  • Removing a Clojure source causes all the Java class files to be removed when re-compiling the Clojure code.

Summary

I think gradle-clojure has some real issues here (with or without your changes) and I'm not sure we can ignore the problem. :-( I definitely goes beyond your new additions though.

jszakmeister avatar Jan 10 '17 20:01 jszakmeister

Thank you @jszakmeister for so thorough testing.

Output directory cleaning

The code cleaning the output directory has been added in this PR and can be removed. I wasn't 100% sure if this is a good idea but I thought it's better to start with a clean state during compilation but as the output directory is shared by many compilation tasks supporting source sets it turns out it is not safe. I will remove cleaning from the compile task. If one would like to make sure there are no obsolete class files generated from removed source files (e.g. *.java or *.clj), then one has to run clean task first.

May I ask you to test again with destinationDir.deleteRecursively() removed from the compile task?

Mixed Java and Clojure source sets

I am not sure what your source set configuration looks like but this PR requires to have Java and Clojure sources to be in separate source sets if they depend on the other.

Current version of Clojure compile task adds the output directory to its input classpath for compilation but it breaks the incremental compilation as each compilation automatically invalidates inputs of the Clojure compile task and will never work and thus I had to remove the Clojure compile task's output directory from its very own input classpath.

Unfortunately, Clojure compiler (clojure.core/compile) doesn't support joint Java+Clojure compilation (unlike groovyc or scalac; I think a similar issue exists for Kotlin) and sources of each language need to be compiled separately in Gradle.

So in order to get support for both mixed Java and Clojure code Gradle projects and incremental compilation you need to separate your Java and Clojure code into separate source sets if one depends on another.

Example configuration if your Clojure code depends on Java classes:

sourceSets {
    ss1 {
        java { src = "src/ss1/java" }
    }

    ss2 {
        clojure {
            src = "src/ss2/clojure"
            compileClasspath += ss1.output
            runtimeClasspath += ss1.output
        }
    }
}

Another one for Java sources depending on classes generated from Clojure sources:

sourceSets {
    ss1 {
        clojure { src = "src/ss1/clojure" }
    }

    ss2 {
        clojure {
            src = "src/ss2/java"
            compileClasspath += ss1.output
            runtimeClasspath += ss1.output
        }
    }
}

It will also by design forbid source dependency cycles.

pbzdyl avatar Jan 10 '17 21:01 pbzdyl

Thank you @jszakmeister for so thorough testing.

No problem. I'm happy to help where I can. :-)

Output directory cleaning

The code cleaning the output directory has been added in this PR and can be removed. I wasn't 100% sure if this is a good idea but I thought it's better to start with a clean state during compilation but as the output directory is shared by many compilation tasks supporting source sets it turns out it is not safe. I will remove cleaning from the compile task. If one would like to make sure there are no obsolete class files generated from removed source files (e.g. *.java or *.clj), then one has to run clean task first.

May I ask you to test again with destinationDir.deleteRecursively() removed from the compile task?

So the issue with doing this is that when you delete a source the class file may still be hanging around, and what's in the build tree is not the same as what you'd get if you had started with a clean build. It leads to the Ant development cycle where you always run a clean step before the build and you're building everything all the time. I don't want to be in that place. :-)

Mixed Java and Clojure source sets

I am not sure what your source set configuration looks like but this PR requires to have Java and Clojure sources to be in separate source sets if they depend on the other.

I guess the issue here is two-fold:

  1. Whether or not you depend on the Java sources, the Clojure compiler will currently delete all the class files from the Java build.

  2. There's no enforcement of the policy, which will put users in a situation where they're confused about what is happening and how.

Do we know if there is a template Java-based language plugin in Gradle somewhere that we could draw from to get all these features in place? It'd be nice to look at something more stripped down and has the hooks to support all the necessary bits.

One strategy would be to compile each Clojure file in a temporary directory and record what was built for later use, and then move the files over to the actual build tree. Then when the source file disappears, we could clean the appropriate class files from the disk--assuming we can persist this information in the build tree somewhere for later consumption in the next build run. This would allow the two to coincide nicely.

I really wish I understood more about Gradle's model because I feel like there's probably a reasonable solution, but I don't know what to aim for. :-(

BTW, don't take any of this as a knock on you. I really appreciate you taking the time and effort to come up with a reasonable solution. It's more a lack of knowledge on my part, and my frustration that it's not easier to find what I'm looking for. :-)

jszakmeister avatar Jan 10 '17 23:01 jszakmeister

@jszakmeister Thanks for your great feedback and suggestions. I really appreciate it 👍

I have implemented a bit smarter clean up of files produced by Clojure compile task (please, take a look at 2c5e8f359f959ca1102753b06fc34f14e7e6a09f and code which handles changed input files). I hope they won't have the issues you are concerned about as they should always remove class files matching only modified or removed Clojure source files.

I have also made an attempt to implement at least some integration tests to check if the basic features work - it should be easier to introduce changes in future.

I didn't have time to cover multiple source sets or mixed Java/Clojure projects but I will work on them later this week.

pbzdyl avatar Jan 11 '17 14:01 pbzdyl

@pbzdyl You're very welcome! I will try and take a look at this over the next couple of days. I'm pretty swamped right now, but I'd love to see the situation improved. Thanks for taking the time to make things better!

jszakmeister avatar Jan 12 '17 12:01 jszakmeister

I ended up with a few cycles today. This is a really good improvement! The only issue I found (and I'm not sure how to solve) is that when I have a Clojure file that imports a Java class and I remove the class, the Clojure compile doesn't realize it went missing and leaves the compiled Clojure class in place. I'm not what Gradle is really designed to do in that department though--it's definitely a bit of an edge case. Either way, this is a really good and useful step in the right direction.

+1 from me.

jszakmeister avatar Jan 12 '17 20:01 jszakmeister

@jszakmeister I have changed the mechanism for tracking obsolete class files (ba4d4731b4743e8a72d8db75d1bfbe1104f0b9d6).

I have also added a test case for your scenario where you modify/delete Java source file that Clojure source set depends on so it is always correctly recompiled.

I think it is now correct and complete. I will add documentation when @cursive-ide gives a green light for this solution.

pbzdyl avatar Jan 13 '17 08:01 pbzdyl

So I finally had some time to look at this, my apologies for the very long wait.

This looks like a good fix for #11, and it's similar to what the Kotlin plugin does.

The incremental stuff is trickier. I like the idea of compiling Clojure to a separate output directory, and that's what Kotlin does as well. They actually create the output syncing as a separate task and add it as a dependency - I'm not sure if there are advantages to doing so. They create the temporary output dir here, it's probably worth creating ours in the same way. In particular, they separate the output by source set name and also use kotlin-classes rather than just classes, which seems like a good idea.

Is my understanding correct, that the latest version here does not require Clojure and Java to be in a separate source set? Kotlin, for example, does not require this and I think it would be strange to require users to do so. Organising the compilation order is tricky though, and generally the assumption is that dependencies only go one way (i.e. Java can use classes from Clojure or vice versa, but not both). See here for some discussion.

True incremental compilation (i.e. detecting when a namespace needs to be recompiled) is harder but I believe not impossible. My plan was to create a dependency between the namespace initialisation class (my_ns.core__init.class) and the namespace source, or in the case where the user is not using AOT to create the dependency between the source file and its copy in the output. This would at least allow recompilation when the source files have been touched, but I'm not sure how to handle dependencies like the one that John pointed out - Clojure code should depend in some way on the Java classes it uses, and I don't know how that should work.

cursive-ide avatar Feb 15 '17 04:02 cursive-ide

Thank you for taking time to review.

Separate temp directories per source set and their name

I think this is a great idea and I will implement it.

gradle sync task to synchronise compiled classes in temp directories with source set's output/destination dir

My code is doing logically the same:

  • before compilation for each file in a temp directory it tries to delete its copy in the destination dir
  • after compilation it copies all files from the temp directory to the destination directory

I could change it to setting up a gradle task but I think those existing few lines of code are easier to understand and reason about.

Mixed Java/Clojure source sets

The latest version does not require Clojure and Java to be in separate source sets as long as they don't depend on each other 😄

Clojure and Java in one source sets with dependencies between is tricky them as Clojure compiler doesn't support joint compilation natively. I also think much of Kotlin plugin's complexity is caused by this feature.

One issue is incremental compilation. If Clojure depends on Java compiled classes (compiled to ${project.buildDir}/classes/${sourceSet}) it would need to be added to Clojure compile classpath. But it's also the destination directory for Clojure compiled classes. This means that Clojure compile will always invalidate UP-TO-DATE state for itself: Gradle will take a snapshot of Clojure compile input properties (including its destination directory as it's on its classpath and initiall contains only Java compiled classes) and during next snapshot won't be up to date with the new state of the destination directory now containing also Clojure compile classes.

Another major issue is the complexity of solving it by creating internally additional tasks and dependencies between them (I think this is how Kotlin plugin is tackling this problem). I would say it's not impossible but it's not easy too (at least for me). Thus I would like to start with something simple (e.g. support for projects with Clojure/Java mixed sources in separate source sets if they have dependencies between them and incremental compilation working) and then incrementally add support for example for a true incremental compilation or mixed sources in a single source set.

Or maybe someone else would contribute a more sophisticated solution without separate source sets limitation?

pbzdyl avatar Feb 17 '17 13:02 pbzdyl

@pbzdyl Just to let you know that I haven't forgotten about this. I'm hoping to be able to speak to some of the Gradle developers soon, and hopefully they'll be able to give us some guidance on the separate source sets issue. Once we have a clear plan for that we can adapt this change according to their recommendations.

cursive-ide avatar Apr 08 '17 02:04 cursive-ide

@cursive-ide Sounds good! Thank you.

pbzdyl avatar Apr 13 '17 08:04 pbzdyl