gradle-node-plugin
gradle-node-plugin copied to clipboard
OutputFiles should only look at package.json files to match yarn
Currently, the YarnInstallTask
outputs contain the whole node_modules
directory - https://github.com/node-gradle/gradle-node-plugin/blob/master/src/main/groovy/com/moowork/gradle/node/yarn/YarnInstallTask.groovy#L55
I wonder if anyone really gets benefit out of this - it takes a really long time for Gradle to scan through all the files in node_modules
to do its up-to-date check (at least on Windows). From what I understand, yarn
checks for up-to-date by looking at package.json
files only, not every single source file. Would it make sense that by default, this plugin also only adds package.json
to the output files?
Out of curiosity, how many files do you have in node_modules
and how long does the scanning take?
Running find .//. ! -name . -print | grep -c //
I get 89809
files. The package.json
is here. The Gradle up-to-date check takes about a minute. yarn
itself finishes an up-to-date check in a couple of seconds.
This is one of the smaller projects I work on, I would imagine many have an order of magnitude more number of files.
I can work around by excluding files like here so personally am fine with it
https://github.com/line/armeria/pull/2280/files#diff-d5a4015ce1eb6ef9e5374a6cfc4d28a7
But I'm wondering if the current behavior is a reasonable default - I'd expect large node_modules
to be the norm, not exception and people should be running into this. Might even consider disabling up-to-date on YarnInstallTask
completely - in general yarn
will know best how to handle up-to-date for what it does, and it always runs very fast.
The project I am working on contains about 70 000 files (according to the comment in your command above). The up-to-date check is really fast on my laptop (less than 1 second when nothing changed when reusing a Gradle daemon, about 30 seconds with a fresh Gradle instance).
I wonder why I don't get the same behavior as you. I use NPM and not Yarn but it should not change anything since the up-to-date detection does not depend on the task's behavior.
Is the up-to-date detection really fast with your workaround?
Even if Yarn handles the up-to-date checking faster, it still makes sense to declare the node_modules
directory as an output since it is true. If we don't do it, if you manually remove the node_modules
directory, Yarn will not run, whereas it should.
Note that we introduced the option you use in your workaround to workaround another issue described in #38.
Did you try on Windows? I suspect it only affects that - Windows file systems don't have mtime
so Gradle needs to hash all files.
Up-to-date detection is quite fast (~1s) when restricted to package.json
files. While it could make sense to do such a restriction only on Windows, given yarn does this on all OS's, I don't see a reason not to match its behavior.
The alternative idea for not declaring node_modules
is to not declare any inputs/outputs or upToDateWhen { false }
, so removal of node_modules
would also still allow it to run. This is because it's likely yarn would always do a better check than Gradle as it has domain-specific knowledge about its task (in this case that checking only package.json tends to be "good enough"). But I don't think this has huge advantages over tweaking the current Gradle outputs to restrict to just looking at package.json.
I don't use Windows, it is probably the reason why I don't get this behavior.
As far as I understand, you suggest to disable the up-to-date detection. This will cause the task to run always, but it is not a big deal since yarn is quite fast (a few seconds according to you).
Should we do that always or only for Windows? Should that be the default behavior or configurable?
My recommendation is to disable up-to-date detection by default on all systems, to have standard behavior on them, but make it configurable in case users do want to tweak it.
My alternative recommendation if this doesn't sound great is to updated the current OutputFiles
output to filter to package.json
only which is similar to what yarn does and should be reasonable too.
Hope that makes sense :)
@anuraaga which version of the plugin do you use?
The version 2.2.0 introduced a big performance drop due to up-to-date checking which is due to the fact we now declare a @OutputFiles
file tree instead of an @OutputDirectory
file.
If you are using the version 2.2.0, could you please test the 2.1.1 to see whether it is better or not?
I don't know about the limitations regarding up-to-date detection on Windows (due to the file system). Maybe this will not change anything because Gradle has the same behavior on Windows regarding @OutputFiles
and @OutputDirectory
?