closure-compiler icon indicating copy to clipboard operation
closure-compiler copied to clipboard

unshaded jar contains Apache Ant, GSON, etc.

Open frigus02 opened this issue 2 years ago • 5 comments

Reported by Dominic in https://groups.google.com/g/closure-compiler-discuss/c/5ofdcVdWgQs

The unshaded jar should not contain copies of dependencies. Instead it should use dependencies from Maven. Copying the important details here:

These commands will show you some of the copies included in the latest release:

jar tf ~/.m2/repository/com/google/javascript/closure-compiler-unshaded/v20211107/closure-compiler-unshaded-v20211107.jar | grep 'apache' jar tf ~/.m2/repository/com/google/javascript/closure-compiler-unshaded/v20211107/closure-compiler-unshaded-v20211107.jar | grep 'gson'

closure-compiler-unshaded-v20200927.jar was the first release that includes the apache dependencies in the unshaded jar. The gson dependencies have existed since even before that.

The commits in between v20200920 and v20200927 are https://github.com/google/closure-compiler/compare/v20200920...v20200927

Perhaps commit https://github.com/google/closure-compiler/commit/c9f3d327be6146a7d4de1f3d7c03feb36f49524a introduced this problem?

frigus02 avatar Dec 03 '21 09:12 frigus02

It looks like you're right, c9f3d32 caused this in two ways:

  • The unshaded pom used to (correctly) inherit dependencies from pom-main.xml, but now it ends up with no dependencies at all. This is not obviously broken since...

  • The resulting "unshaded" jar actually seems to be the _deploy.jar from the compiler_unshaded java_binary rule, and unless I'm very much mistaken, _deploy.jar output is always shaded. This is explicitly documented at https://docs.bazel.build/versions/main/be/java.html#java_binary:

    The deploy jar contains all the classes that would be found by a classloader that searched the classpath from the binary's wrapper script from beginning to end. It also contains the native libraries needed for dependencies. These are automatically loaded into the JVM at runtime.

    This in turn means that the jar_jar rule to shade the _deploy.jar is just relocating already-shaded jars, rather than combining them. (It is possible that this could be hiding bugs by allowing multiple jars to overwrite each other's content, it might make sense to apply the jar_jar rule on the external dependencies directly instead of the _deploy.jar, but I'm not actually aware of such a bug).

Proposed fix would first be to first restore the removed dependencies, though instead of putting them in a parent pom, but them directly in the unshaded pom, and then replace the java_binary rule with something else that produces a jar which only contains closure-compiler contents. The pom contents as of that version are no longer correct, at least guava and protobuf-java are out of date it appears.

My bazel knowledge is weak, but I can try to find/propose a rule that would only merge the types/resources declared (or generated) by this project. It looks like oss_java_library might be built for this sort of a thing, though the only usage of it appears to produce a jar that is missing at least typedast resources, polyfills.txt, runtime_type_check.js.

niloc132 avatar Mar 31 '22 16:03 niloc132

Here's a draft patch that produces a single un-shaded, un-ubered jar that only includes the java content of closure-compiler (and its various resources, like the externs zip and runtime_libs typedast) https://github.com/google/closure-compiler/compare/master...niloc132:3896-unshaded-jar

The basic idea is to use rules_jvm_external's java_export rule to mark a given java_library as an artifact for export to maven. This rule will find any dependencies that were originally fetched from maven (by looking at the maven_coordinates tags attached to their java_library), and exclude them from the output jar, plus also populate a pom.xml file template with those dependencies.

There are a few caveats for the patch as it stands:

  • The java_export for compiler_unshaded must have its maven_coordinates specified directly, as it wants to take the generated java_library rule it emits with a version. The idea seems to be that a project could specify multiple java_export rules, and those should depend on each other correctly (which includes knowing the version). This approach is somewhat incompatible with closure-compiler's existing sonatype_artifact_bundle plus deploy_sonatype_snapshot_bundles.js, which expects to be able to template the poms separately, after the jars are created. I believe that this code could be simplified, but hesitate to suggest it since the current system clearly works.
  • The java_export rule depends on one or more java_librarys, which means that the MANIFEST.MF does not have its Main-Class attribute, so the generated jar cannot be launched directly. With that said, I'm not sure this is actually a problem - given that the dependencies are not supposed to be shaded into the jar, it won't be possible anyway to call java -jar /path/to/closure-compiler-unshaded.jar` - the rest of the classpath will need to be specified, which prevents the Main-Class from being read.
  • In order to correctly mark protobuf-java as a maven dependency, it was necessary to swap out the @com_google_protobuf//:protobuf_java dependency for @maven//:com_google_protobuf_protobuf_java. Finding the original source of @com_google_protobuf was difficult, but it appears to come from google/bazel_common's workspace_defs.bzl, which confusingly (to me) declares an artifact for protobuf-java from maven with version 3.7.0, but then sets up an http_archive for 3.19.3. (Actually, it sets up two, with different names, but the exact same contents.) That http_archive rule has no maven_coordinates, so java_export doesn't understand that the contents should be treated as external. To fix this, I made the dependency change described above, so that the generated jar and templated pom depend on the expected protobuf-java version.

I've done some minimal testing on this from a project which depends on closure-compiler as a library, but still should do more testing (to verify that the protobuf-java-util dependency isn't required, but I suspect it is...). I've taken this time to pause my work to seek at least an initial review, if this approach would be of interest as a contribution.

--

Following the documentation of https://github.com/bazelbuild/rules_jvm_external/#publishing-to-external-repositories, it seemed that java_export was the correct rule for this job, but having gone through the sources of the project, it might make more sense to use maven_project_jar and possibly swap in maven_publish to actually perform the uploads. With that said, I can say that the current approach seems to work already, so I won't increase the scope of the patch unless it seems like a good idea to the project members.

niloc132 avatar Apr 07 '22 15:04 niloc132

@niloc132 I've just read through your draft change diff.

My primary concern is that we had a design goal to stop depending on maven for building, but this seems to be putting that dependency back in.

Still, this may be OK. To some extent our main goals with the OSS build are (off the top of my head, so likely not exhaustive):

  1. It works.
  2. It creates as little friction as possible with the build system inside Google, which is what we all work with on a daily basis. (By this I mean that we never have to give it a thought when we are making normal code changes.)
  3. Understanding it demands as little extra effort of us as possible.

Most of us do not work with tools like Maven in any other context, and so we will never be really familiar with its details. Bazel is at least the same basic design as our internal build system, though there are still some important differences. Other concepts, like shaded and unshaded jars, are something we definitely do not deal with anywhere else.

All of that is to paint a clear picture for you of what our concerns are, and to make it clear that you likely have developed a better understanding of our OSS build while creating your draft change than any of us has in our heads.

If you would like to create a PR, then I will be happy to review it and test it.

Most likely, if it appears to work and my more in-depth review later doesn't make make me more concerned than I am now, I'll end up tweaking it a bit, then merging it.

Thanks for the effort you put into this. We could really use more such help from the OSS community.

brad4d avatar Apr 11 '22 18:04 brad4d

I think your concerns are reasonable, which is why I didn't want to polish and submit it until someone else had a look. I haven't actively consulted any other bazel users (... mostly because I've never seen it used in the wild), but this appears to be the approach that bazelbuild/rules_jvm_external advises: using the @maven//:... rule to get the protobuf-java on the classpath instead of somehow explaining to java_export that the protobuf-java version brought in by @com_google_protobuf can be satisfied, for maven users, by this specific version. As my third bullet above suggested, this might just be a bug in google/bazel_common, I'm not sure how else to read those conflicting versions.

I think that the two outputs, shaded and unshaded are reasonable, though as someone consuming closure-compiler as a library, I'm mostly interested in the unshaded output. Shaded output makes more sense I suspect when consuming the output as a binary (and as such, it is the java_binary rule that emits the jar, plus the jarjar rule to rename some packages defensively).

Two last thoughts before I put together a PR: There are already lots of maven_import rules, so I don't feel terrible in adding one more, but perhaps I can find a different way to reference it. Second, I tried to make sure that the rules I touched/added only fed maven specific details into those sonatype bundled. The main place I failed here was the compiler_tests rule - perhaps I should revert that dependency change, so as to be clear that tests must pass first and foremost by consuming "non-maven" dependencies?

niloc132 avatar Apr 11 '22 20:04 niloc132

Thanks for bringing up those other concerns.

I think it would be fine for you to create a PR with whatever you have right now.

Once a PR exists I can import it and then test whether it breaks anything. For example, I should be able to tell whether any of your changes breaks our GitHub actions-based CI. I'll also be able to try a dry-run external release to see if anything breaks there.

It'll also be easier for me to really look at the changes in context once it is in PR form.

brad4d avatar Apr 11 '22 20:04 brad4d

hey, is there any info what version doesn't have those extra packages ? Currently I'm facing the issue with javax package...

oleksiimiroshnyk avatar Aug 10 '23 14:08 oleksiimiroshnyk

@oleksiimiroshnyk which dependency are you using? the unshaded jar doesn't contain any direct/transitive dependencies, so shouldn't have this issue. On the other hand, you're then responsible for making sure you have compatible dependencies, but if you're using maven/gradle/ivy, that should more or less be resolved for you.

That is, use groupid=com.google.javascript, artifactId=closure-compiler-unshaded, and whichever version of release you want to.

niloc132 avatar Aug 10 '23 15:08 niloc132

@niloc132 thanks for your answer. Currently I'm having a project that is using quite old closure-compiler-unshaded v20210505 , and it has javax package that makes that project fail on startup. Before the upgrade of that dependency(I believe that was version v20180506) that project was starting and that very old version doesn't contain extra packages. I tried the latest closure-compiler-unshaded and it doesn't contain javax package any more which is good, but seems API has changed and I will need to update some code on our end. That is fine, but I want to try to minimize and wondering what is the earliest version after v20210505 that doesn't contain javax package maybe I can simply upgrade without any changes in the code required..

oleksiimiroshnyk avatar Aug 11 '23 07:08 oleksiimiroshnyk

@oleksiimiroshnyk according to my PR linked to this, the fix for this was merged in https://github.com/google/closure-compiler/commit/1291137a9a5227a3f771c6d8d13131d595a421e5, which can be found in versions v20221102 to v20230802, so start with v20221102?

My company also maintains a closure-compiler fork with a few specific fixes that might be suitable for your use case. We backported the fix and released it on top of closure-compiler's v20220502 release, if you want to try something slightly older (with a few customized changes to it). This can be found in Maven central as com.vertispan.javascript:closure-compiler-unshaded:v20220502-1. Our customizations are listed in our readme, but should be nearly compatible for most purposes.

niloc132 avatar Aug 14 '23 14:08 niloc132