frontend-gradle-plugin

Reduce unnecessary downloads and reinstalls

Open kris7t opened this issue 3 years ago • 2 comments

Hi,

First of all, thank you for the awesome plugin!

I am using version 6.0.0 of the plugin with yarn 3.1.0. I noticed that some tasks run and download tools at every build, which makes builds somewhat slow. I came up with some workarounds, shown below. I am unsure how useful they would be for the plugin in general, but if any of them are interesting, I am willing to try to create a PR with similar functionality (although I'm a beginner in gradle plugin writing, so that may take some time).

Nodejs is downloaded every time the classpath of the installNode task changes

I am using precompiled script plugins from the buildSrc directory to enforce some project-wide conventions. Every time I change a convention .gradle file, the buildSrc project is recompiled, which changes the classpath of the installNode task (since it comes from the implementation classpath of my buildSrc project, transitively). This results in node being redownloaded even if the desired node version wasn't changed.

I couldn't really come up with a clean workaround for this, short of storing the installed node version explicitly in a .properties file:

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L10-L62
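The linked file is not reproduced here. As a rough sketch of the idea in Gradle Kotlin DSL (not the linked code), the installed version can be recorded in a marker file and installNode skipped when it already matches. The marker path, the property name and the hard-coded version are assumptions made for illustration.

```kotlin
import java.util.Properties

tasks.named("installNode") {
    // Hypothetical values: adjust to the node version and install directory you use.
    val desiredVersion = "14.17.5"
    val marker = layout.projectDirectory.file("node/.installed-node.properties").asFile

    // Skip the task entirely when the marker already records the desired version.
    onlyIf {
        val installed = Properties()
        if (marker.isFile) marker.inputStream().use { installed.load(it) }
        installed.getProperty("nodeVersion") != desiredVersion
    }

    // After a successful install, record which version is now on disk.
    doLast {
        val props = Properties()
        props.setProperty("nodeVersion", desiredVersion)
        marker.parentFile.mkdirs()
        marker.outputStream().use { props.store(it, "written by installNode") }
    }
}
```

Because onlyIf is evaluated before Gradle's up-to-date check, a classpath change alone no longer triggers a download. The trade-off is that deleting the node directory by hand also removes the marker, which forces a reinstall on the next build.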

Yarn 1 is downloaded at every build

Running the npm install -g yarn command at every build seems to trigger an installation even if yarn 1 was already installed. Moreover, the repeated installs mean that yarn 1 is updated to the latest version every time, which may hurt reproducibility.

As a workaround, we may pin a specific yarn 1 version and store it in a .properties file to see whether we need to reinstall:

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L6

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L64-L72

I guess both this and the previous workaround are generally useful unless the user is using a global node installation (when no node or yarn 1 should be downloaded automatically).

We download and switch to yarn berry at every build

As far as I could determine, switching to yarn berry is required before switching to a specific version if there is no .yarn/releases/yarn-{version}.cjs in the repository, because yarn 1 will just refuse to switch to a specific version otherwise.

However, yarn 3 recommends storing the yarn-{version}.cjs in the repository anyways, so we can get away with disabling the enableYarnBerry task:

https://github.com/kris7t/refinery/blob/releng-docs/buildSrc/src/main/groovy/refinery-frontend-conventions.gradle#L12-L14
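A minimal sketch of that workaround in Gradle Kotlin DSL, assuming the yarn-{version}.cjs release file is already committed to the repository (this is an illustration, not necessarily the linked code):

```kotlin
// Only safe when the yarn release file is checked in;
// otherwise the later switch to a specific yarn version may fail.
tasks.named("enableYarnBerry") {
    enabled = false
}
```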

I am unsure how generally useful this workaround is. Maybe we should explicitly recommend storing yarn 3 in the repository (as upstream does), or at least provide a configuration option whether the user wants to re-create .yarn at every build (as opposed to upstream recommendation) or not?

We download and switch to the desired yarn version at every build

Since yarn 3 is usually stored in the repository, there is no need to re-download it at every build. The exception is when the desired yarn 3 version changes compared to what's in the repository.

Luckily, the yarn 3 version is in its filename, so there's no need to cache it in a .properties file. Instead, we can just mark it as an output of the installYarn task:

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L74-L76

We can further reduce re-downloads (for example, if installYarn is forced to re-run due to classpath changes) by using yarn set version --only-if-needed:

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L7

We reinstall npm packages at every build

This doesn't cause superfluous downloads, because yarn caches all packages, but installFrontend is still a bit slow when it re-calculates package resolutions even if no package.json or yarn.lock file changed.

Thanks to pnp resolution, the installed state of packages is fully described by the .pnp.cjs and .pnp.loader.mjs files. Compared to traversing the node_modules directory (which was the norm with npm and yarn 1), checking the dates on these files can be done quickly so we can mark them as inputs and outputs:

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L78-L81
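As an illustration of this idea (a sketch, not necessarily the linked code), the declaration could look like this in Gradle Kotlin DSL. It assumes pnp resolution is in use and that all four files sit in the project directory.

```kotlin
tasks.named("installFrontend") {
    // Re-run the install only when the dependency description changes...
    inputs.files("package.json", "yarn.lock")
    // ...or when the pnp files describing the installed state are missing or modified.
    outputs.files(".pnp.cjs", ".pnp.loader.mjs")
}
```

With node_modules-based installs these outputs do not exist, so this declaration should not be applied there.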

This workaround is only useful if pnp resolution is used. Perhaps we could add a configuration option for that, too?

This introduced a lot of caching. I wanted to provide an easy escape hatch if something breaks (although hopefully nothing will), so I added a clobberFrontend task that nukes the node and .yarn directories:

https://github.com/kris7t/refinery/blob/a530fe054944ca28bb49985c4334de20bca9c24c/buildSrc/src/main/groovy/refinery-frontend-worktree.gradle#L83-L91
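A hedged sketch of such an escape hatch in Gradle Kotlin DSL follows. The exact list of paths is an assumption: it removes the local node installation plus the yarn artifacts that are commonly left out of version control, and keeps .yarn/releases so the pinned yarn version survives. The linked task may delete a different set.

```kotlin
tasks.register<Delete>("clobberFrontend") {
    // Local Node.js installation created by the plugin (assumed to live in "node").
    delete("node")
    // Regenerable yarn state; .yarn/releases is intentionally left untouched.
    delete(".yarn/cache", ".yarn/unplugged", ".yarn/install-state.gz", ".yarn/build-state.yml")
    // pnp resolution files, recreated by the next install.
    delete(".pnp.cjs", ".pnp.loader.mjs")
}
```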

What to keep / not keep was based on the recommendations in the yarn documentation.

kris7t, Nov 21 '21 15:11

Hi,

Thank you for these interesting questions/suggestions. It is extremely difficult to deal efficiently with all these suggestions with a single issue. Please can you:

  • Open a dedicated issue for each of the 5 topics you mentioned so that we can focus on each one independently?
  • Fill the form requested to report an issue?

Thanks in advance, and thank you for using the plugin 😊. BR

v1nc3n4, Dec 05 '21 14:12

Just for documentation, I added the following to my build.gradle.kts:

frontend {
    nodeVersion.set("14.17.5")
    yarnEnabled.set(true)
    yarnVersion.set("3.1.1")
    assembleScript.set("run codegen")
}

tasks.installYarnGlobally {
    outputs.dir("node/lib/node_modules/yarn")
}

tasks.enableYarnBerry {
    inputs.property("yarnVersion", frontend.yarnVersion)
    outputs.file(frontend.yarnVersion.map { ".yarn/releases/yarn-$it.cjs" })
}

tasks.installYarn {
    inputs.property("yarnVersion", frontend.yarnVersion)
    outputs.file(frontend.yarnVersion.map { ".yarn/releases/yarn-$it.cjs" })
}

tasks.installFrontend {
    inputs.file("package.json")
    inputs.file("yarn.lock")
    outputs.dir("node_modules")
}

tasks.assembleFrontend {
    inputs.file("codegen.yml") // project dependent
    inputs.file("package.json")
    inputs.file("yarn.lock")
    inputs.files(fileTree("src/main/resources/") { // project dependent
        include("**/*.graphqls")
    })
    outputs.dir("build/graphql-code-generator/kotlin") // project dependent, required
}

I use the plugin to run yarn run codegen, to run the graphql-code-generator in my Kotlin/Java project.

tristanlins, Jan 14 '22 06:01

@tristanlins I'm not sure this could work with yarn.

As a reminder, Gradle can cache a task only if its inputs and outputs are properly declared and its outputs do not overlap with those of other tasks. A task is cacheable if it is annotated with @CacheableTask or opted in via the API with outputs.cacheIf { true }.
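To make the two opt-in routes concrete, here is a small sketch for a build.gradle.kts. The task class, task names and file paths are hypothetical and unrelated to the plugin's own tasks.

```kotlin
import org.gradle.api.DefaultTask
import org.gradle.api.file.RegularFileProperty
import org.gradle.api.tasks.*

// Route 1: annotate a task type and declare every input and output.
@CacheableTask
abstract class LockfileStamp : DefaultTask() {
    @get:InputFile
    @get:PathSensitive(PathSensitivity.RELATIVE)
    abstract val lockFile: RegularFileProperty

    @get:OutputFile
    abstract val stampFile: RegularFileProperty

    @TaskAction
    fun writeStamp() {
        // The output depends only on the declared input, so it can be reused safely.
        val content = lockFile.get().asFile.readText()
        stampFile.get().asFile.writeText(content.hashCode().toString())
    }
}

tasks.register<LockfileStamp>("lockfileStamp") {
    lockFile.set(layout.projectDirectory.file("yarn.lock"))
    stampFile.set(layout.buildDirectory.file("lockfile-stamp.txt"))
}

// Route 2: opt an existing task in through the runtime API.
tasks.register("copyLockfile") {
    val source = layout.projectDirectory.file("yarn.lock")
    val target = layout.buildDirectory.file("yarn.lock.copy")
    inputs.file(source).withPathSensitivity(PathSensitivity.RELATIVE)
    outputs.file(target)
    outputs.cacheIf { true }
    doLast { source.asFile.copyTo(target.get().asFile, overwrite = true) }
}
```

In both cases each task owns its own output file. Two tasks writing into the same directory is the overlap that prevents caching.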

The installYarnGlobally task runs npm install -g yarn without a version, so one may end up with a newer version. And depending on where the destination is set, the task might not be cacheable.

And when enableYarnBerry runs, the version from the global install lands in these files in the current directory:

.yarn/releases/yarn-3.3.0.cjs
.yarnrc.yml         (yarnPath: .yarn/releases/yarn-3.3.0.cjs)

But when the installYarn task runs, it uses the version defined in the frontend section, which may not be the same as the version coming from the global install. E.g. this will create a file

.yarn/releases/yarn-3.1.0.cjs

But this will "pollute" the folder, and on the next run Gradle will notice that the output has changed (overlap), which makes the task run again.

bric3, Dec 08 '22 12:12

Hello @kris7t,

Below you will find an update on each of your questions.

Nodejs is downloaded every time the classpath of the installNode task changes

The Node.js distribution is downloaded again because some content in the buildSrc directory is modified, whatever it is. As stated in Gradle docs: "A change in buildSrc causes the whole project to become out-of-date.". This behaviour is by design and the plugin will not work around it. I understand this may be frustrating on your side, but what about refactoring your build and moving the parts you update frequently outside the buildSrc directory?

Below you will find some articles I found, explaining why the buildSrc should be replaced by composite builds, for the exact same problem (build cache invalidation):

  • https://proandroiddev.com/stop-using-gradle-buildsrc-use-composite-builds-instead-3c38ac7a2ab3
  • https://medium.com/bumble-tech/how-to-use-composite-builds-as-a-replacement-of-buildsrc-in-gradle-64ff99344b58

Gradle docs here also mention an interesting improvement: "Making the implicit buildSrc project an included build.", which, as I understand it, could solve this problem in your case.
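For reference, a minimal sketch of the composite-build approach described in those articles, assuming the convention plugins are moved from buildSrc into a separate build in a directory named build-logic (the directory name is hypothetical):

```kotlin
// settings.gradle.kts of the main build
pluginManagement {
    // Plugins defined in the included build become resolvable by id in the plugins {} block.
    includeBuild("build-logic")
}

rootProject.name = "my-project" // hypothetical name
```

The included build needs its own settings.gradle.kts and a build script that applies the plugin used to write the convention plugins, as buildSrc did before.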

Yarn 1 is downloaded at every build

We download and switch to yarn berry at every build

We download and switch to the desired yarn version at every build

Starting with release 7, the plugin uses the Node.js Corepack utility, so this optimization is now implemented.

We reinstall npm packages at every build

This has been discussed in issues #180, #60, and in the official documentation of task installFrontend. The plugin abstracts installation and use of supported package managers. Moreover, depending on the package manager, inputs and outputs are not the same, and may vary depending on the project and the state of the file system. I understand it can be frustrating not to have a built-in optimization during install. With my current knowledge, it seems too complex to abstract this optimization so that it works in all possible configurations.

However, since this point has been questioned many times:

  • The documentation of task installFrontend now explains what the inputs and outputs could be for each package manager, to help configure optimization depending on the project and the build case. But this does not replace the official documentation of each package manager.
  • Examples were refactored and now include a proposed configuration of a basic and naive optimization for each package manager. These are probably not ideal solutions and I will be happy to read suggestions for all-purpose improvements.

Thanks to pnp resolution, the installed state of packages is fully described by the .pnp.cjs and .pnp.loader.mjs files

I didn't find any documentation about the .pnp.loader.mjs file. I'll be happy to learn more if you can share some official notes.


Given your questions and these clarifications, I am closing the issue. But the discussion may continue if needed or if I'm mistaken. Thank you for your help! BR

v1nc3n4, Aug 24 '23 13:08