heroku-buildpack-nodejs icon indicating copy to clipboard operation
heroku-buildpack-nodejs copied to clipboard

Yarn rebuilds native dependencies during pruning

Open edmorley opened this issue 6 years ago • 5 comments

Hi!

For one of our projects, the durations of the nodejs buildpack part of our compile are as follows (all times are with zero changes to yarn.lock since the last build):

  • Prior to #519: 16 seconds (log excerpt)
  • After #519 (and after an interim build to warm the cache with devDeps): 42 seconds (log excerpt)
  • After #519 and with setting YARN_PRODUCTION=true + interim build: 45 seconds (log excerpt)

From both reading the last of the linked logs and from looking back over the PR, It seems that even with YARN_PRODUCTION=true, the pruning step still happens (and in our case takes quite a while - likely in part due to yarnpkg/yarn#932).

Is this intended? I would have thought that for those installing only dependencies (via YARN_PRODUCTION=true), the prune step would not need to run at all?

Many thanks :-)

edmorley avatar Mar 01 '18 22:03 edmorley

Thanks for all your feedback @edmorley :)

Is most / all the extra time is coming from the prune step compiling native dependencies?

p95 and p99 seem unaffected, but we have seen a spike in median build time (change pushed around 20:30):

lang buildpack librato 2018-03-01 14-54-57

I was unaware that yarn would rebuild native dependencies in this case, but that does make sense. I'll see what the options are for fixing this.

jmorrell avatar Mar 01 '18 23:03 jmorrell

Is most / all the extra time is coming from the prune step compiling native dependencies?

The breakdown is as follows (using the timestamps in the log)...

Pre-#519 (log excerpt):

  • "Restoring cache": 2s
  • "Building dependencies": 2s (no-op)
  • "Caching build": 7s

With #519 and no environment variable overrides (log excerpt):

  • "Restoring cache": 3s
  • "Building dependencies": 2s (no-op)
  • "Caching build": 8s
  • "Pruning devDependencies": 25s

With #519 and YARN_PRODUCTION=true (log excerpt):

  • "Restoring cache": 2s
  • "Building dependencies": 2s (no-op)
  • "Caching build": 7s
  • "Pruning devDependencies": 29s

I was unaware that yarn would rebuild native dependencies in this case, but that does make sense. I'll see what the options are for fixing this.

Ah so it seems to be due to the use of --ignore-scripts in the prune...

STR:

  1. git clone https://github.com/mozilla/treeherder && cd treeherder
  2. export NODE_ENV=production
  3. yarn install --production=true --frozen-lockfile --ignore-engines (this is what's run when YARN_PRODUCTION=true)
  4. Repeat step 3 (to confirm it's now a no-op)
  5. cp node_modules/.yarn-integrity ./.yarn-integrity-original
  6. yarn install --frozen-lockfile --ignore-engines --ignore-scripts --prefer-offline (what's run for the prune)
  7. diff -U3 .yarn-integrity-original node_modules/.yarn-integrity

Expected:

  • Step 6 is a no-op like step 4.
  • The yarn integrity file is identical at step 7.

Actual:

  • Step 6 is not a no-op.
  • Step 7 shows the integrity file has changed (which causes yarn to re-check dependencies):
--- .yarn-integrity-original
+++ node_modules/.yarn-integrity
@@ -4,6 +4,7 @@
     "node_modules"
   ],
   "flags": [
+    "ignoreScripts",
     "production"
   ],
   "linkedModules": [],

It does seem reasonable for yarn to store the flags used in the integrity file -- since in the reverse case (ie: yarn install --ignore-scripts && yarn install) the second install definitely should run again.

As such I think there are two things to consider tweaking:

  1. Should the --ignore-scripts flag be removed from the prune? (I can see why it's used since packages are just being removed, so why waste time on the scripts - but if it causes complete invalidation of the integrity file, it may be a net loss in many cases?)
  2. For people using YARN_PRODUCTION=true, can we skip the prune step entirely? (Since it should just be a no-op anyway)

edmorley avatar Mar 02 '18 12:03 edmorley

We are seeing much higher pruning times for yarn than npm, and I suspect that this is why. There are binary dependencies lurking in many build trees.

lang node js librato 2018-03-02 10-06-12
  1. Removing --ignore-scripts is difficult since many kick off their build step in postinstall, something that we have (I think wrongly) recommended as a best practice.

  2. Absolutely, I'll tee up that change.

jmorrell avatar Mar 02 '18 18:03 jmorrell

Removing --ignore-scripts is difficult since many kick off their build step in postinstall

Ah of course - I was thinking the lifecycle scripts of the deps themselves, but there's also the root package.json's scripts which are likely more invasive, so I agree --ignore-scripts probably has to stay.

That said with (2) fixed, the only people running prune will be those who actually have something to prune - so some kind of invalidation would be happening regardless of the flags change to .yarn-integrity - so perhaps it's a non-issue.

edmorley avatar Mar 02 '18 18:03 edmorley

At the very least, this only results in a longer install and not a broken install, and it's something the yarn team is aware of https://github.com/yarnpkg/yarn/issues/932

jmorrell avatar Mar 02 '18 19:03 jmorrell