code-server
code-server copied to clipboard
fix: Generate shrinkwraps for the bundled vscode
Fixes #5056
Why are we removing
synpafter all?
Because it only takes care of putting in the package-lock/shrinkwrap file the first level of dependencies, not the nested dependencies which we have from the yarn lockfile.
See the contents of https://unpkg.com/browse/[email protected]/npm-shrinkwrap.json, and how for example, @mapbox/node-pre-gyp is listed as a dependency of argon2 - but is not present anywhere in the shrinkwrap file.
Why are we adding packages from
resolutionsindevDependenciestoo?
What initially drove me to use synp was the fact that the npm shrinkwrap command would generate a lockfile that wasn't usable. Trying to install the package with that lockfile would fail with the following exception:
Installing Code dependencies...
[..]
npm ERR! Cannot read property 'requires' of undefined
This was actually due to npm shrinkwrap (in NPM 6) choking with some packages not being anywhere in the lockfile but being present in the package.json.
So I simply tracked down their dependency tree, saw they were coming from devDependencies, and added them there - which now prevents the npm shrinkwrap-generated lockfile from breaking, and allows us to remove synp. This doesn't change the size of the dependency tree, and should be able to be removed when there are higher versions of NPM used.
Why aren't we using
npm cito replaceyarn --frozen-lockfile?
NPM 6's npm ci doesn't deal very well with optional dependencies, which the bundled vscode needs for some Windows dependencies (namely, windows-process-tree and @vscode/windows-registry):
- The
npm shrinkwrapcommand doesn't add them to the lockfile, because the CI runs from Linux, so it doesn't install them - and the commands generates the lockfile based on what's innode_modules. - If we hacked the lockfile and added them there manually (with
"optional": true), the install on a Linux machine would fail - because NPM 6'snpm cidoesn't respect them from annpm-shrinkwrap.jsonfile. - If we don't add them and use
npm ci, they are not installed - even on Windows machines.
What I settled for was using npm install --production (for now), that will always respect the lockfile version if an entry is there, and will try to get versions not listed there. This effectively means that those two packages, on windows, won't be following the yarn.lock that is being used in the development package, but that seemed less worse.
Note: Another idea was to hackily add the packages to the lockfile + enhance the postinstall.sh script to conditionally use --no-optional if we're running on a non-Windows machine - which skips those optional dependencies. I wasn't convinced by either the hacking around of the lockfile nor this conditional... But happy to hear thoughts?
Why don't we need the "Remove the
devDependencies" hack anymore?
Because we are now generating the shrinkwrap file after stuff has been installed with yarn --production, which means npm shrinkwrap will only list the production dependencies.
Why are we removing
jsonafter all? Because we don't need to remove thedevDependenciesanymore - which needed some more complex JSON mangling. Now we simply need to remove a line, which can be done withsed.
But also, because doing it with json gets a bit tricky, as json would need to be listed as an actual dependency (rather than devDependency) of code-server and vscode too to be able to be used... Not worth it.
Switching the jq commands to json might still be worth it, but no need to conflate the commits.
Why no more
yarn packartifact? Realized I was trying to fix/conflate too many things into one PR, so will likely create a different issue to discuss this one.
Codecov Report
Merging #5071 (b1382c8) into main (33ee184) will not change coverage. The diff coverage is
n/a.
:exclamation: Current head b1382c8 differs from pull request most recent head 5c0cfdc. Consider uploading reports for the commit 5c0cfdc to get more accurate results
@@ Coverage Diff @@
## main #5071 +/- ##
=======================================
Coverage 72.44% 72.44%
=======================================
Files 30 30
Lines 1673 1673
Branches 366 366
=======================================
Hits 1212 1212
Misses 398 398
Partials 63 63
Continue to review full report at Codecov.
Legend - Click here to learn more
Ξ = absolute <relative> (impact),ΓΈ = not affected,? = missing dataPowered by Codecov. Last update 33ee184...5c0cfdc. Read the comment docs.
So the failure seems to be because of some dependencies missing in the node_modules from the original install... Which my guess is because something is failing to be installed. Will be playing around with it (expect a bunch of updates to this PR to "test" the CI) and try to find the culprit.
GitHub seems to have ignored my email response so I will paste it below, apologies if it does eventually come in and the comment gets duplicated:
I love the detailed description!
What is used to install is completely meaningless to the end-user. If only, by not using yarn even when invoked with yarn, we can guarantee the deterministic versions for the bundled vscode. Thoughts?
My first thought when I mentioned this was that the user might have
yarn installed but not npm so switching the package manager from
under the user could cause installation failures (npm: command not found
or something like that).
Buuuut...is it likely someone has Node installed but not npm? I know some distros package them separately but it still seems unlikely.
The second thought I had was that using the same package manager throughout the entire installation process /feels/ more correct to me. I doubt it would cause issues in practice but it seems like it would be less of a surprise ("Wait why is it calling npm when I ran yarn?").
Ok new tentative out, with:
- Using
npmoryarn, depending on what was initially used. - Generating the shrinkwraps when generating the release, rather than initially.
- Removing
@parcel/node-addon-apifrom the shrinkwrap, as it's a folder innode_moduleswhich is actually not a package and was making the whole process choke.
(1) and (2) together (should) also help the other issue I was trying to track down, which happens because the tests were failing for the standalone release as it wasn't installing some of the devDependencies that we were removing too early from the shrinkwrap file.
FWIW, I'm playing around in https://github.com/edvincent/code-server/pull/1#partial-pull-merging to be able to run the CI/CD and pinpoint some of the culprits. To avoid the flood and needing approvals for the workflows to run. Sorry it's taking so long, but trying to do it the best possible way this time πͺ
Ok, I think I got it this time! Moved to generate the shrinkwraps in the release:standalone command, which basically allows not having to know which package.json is being copied etc...
Also updated the overview of this PR to cover some of the reasoning.
Thoughts? I think I still need to add something so that the NPM publish CI step calls this release:standalone too?
Awesome work! I love your comments; very thorough and made it super easy to understand the rationale.
I think I still need to add something so that the NPM publish CI step calls this release:standalone too?
I think leaving it in build-release.sh is the right move. We use the result of that script in both the NPM publish step and standalone step.
Ohh I see in your PR description you are doing it there because it runs after yarn --production. Hmm...
Yep.. We need the node_modules to be populated to be able to generate the shrinkwrap file...
The other solution would be to still do it in build-release.sh, which means we'd need to yarn install in build-release.sh. That would address this TODO https://github.com/coder/code-server/blob/main/.github/workflows/ci.yaml#L240-L242 - and the NPM release would move to be a yarn pack (rather than a raw tar command). But I'm not sure how would having the node_modules folder from one arch would affect another arch - for the cross-compile sections...
Unless we do it in build-release.sh, with a yarn install, but then generate two type of artifacts - the current one (for the other platforms) and the current one...
Thoughts?
Maybe we could just add yarn --production to build-release.sh and it would not take too much longer since it only needs to remove files (but I have no idea how yarn works internally so maybe this takes longer for some reason).
We need the node_modules to be populated
Oh right we have no Node modules at all at that stage, we would need to remove that whole KEEP_MODULES behavior and just keep the modules (in addition to then running yarn --production at the end if I understand correctly).
But I'm not sure how would having the node_modules folder from one arch would affect another arch
This is exactly why I have avoided doing that todo so far lol
Today there is never a yarn in build-release.sh - so a yarn --production won't be removing files, but also installing stuff that was never installed.
The only yarn that gets done is at the root of the repo, not inside the release folder no?
EDIT: Yep, what you just said now. Happy to try and take a stab at that if it seems to be the best.
Today there is never a yarn in build-release.sh
Ah yup it is done in a previous step and then this step has an option (KEEP_MODULES) for whether to also copy the Node modules or skip them.
Yeah I say we go ahead with this plan.
Damn it's actually a bit more nuanced than just removing the KEEP_MODULES behavior... If we run yarn --production, it would also install some of the things we were excluding in the rsync like the node executable, the node_modules.ara and the lib/coder-cloud-agent. Which means the tar -czf command in the ci.yaml would need to replicate/hard-code some of those excludes...
The other solution would be to come back to the original idea of generating the shrinkwrap files from the code-server repo itself, and copy them into release (after removing the devDependencies from it). i.e. we'd need to cd into lib/vscode/remote/ - rather than relying on the fact we copied the right package.json.
BTW - Right now, that KEEP_MODULES never seems used?
KEEP_MODULES never seems used
Yup, from what I recall initially we never kept the modules then later KEEP_MODULES was added to make building locally faster (so you would not have to yarn twice if you wanted to test a full build locally). So current the var is only ever used during local development.
it would also install some of the things we were excluding
I think it will be OK to let these get packaged into the tar although I suppose they will just end up getting overwritten later. We could create a new var KEEP_BINARIES or something which acts like the old var except we are keeping Node modules by default now?
Er wait I might have misunderstood. We do already run yarn --production so those things exist in the standalone already but now the npm package will include them when it did not before maybe? In that case maybe we add to the npmignore.
https://github.com/coder/code-server/blob/c35bf1311e85d031e40ca42fe0342a47c2c254a8/ci/steps/publish-npm.sh#L86
No, I haven't forgotten about this/you folks β€οΈ life got in the way...
I'm back trying to fix this, sent a new approach PR (which gets a bit easier thanks to NodeJS 16!). Still a WIP and needs some tweaks, just sharing for awareness. Testing in https://github.com/edvincent/code-server/actions/runs/2883629793
HOWEVER, something that is a bit of a concern to me now seems to be a bug in NPM itself... Despite the shrinkwrap file, installing with the global flag still doesn't seem to respect them π
FYI - opened https://github.com/npm/cli/issues/5325 to track that nasty bug about the shrinkwrap file not being respected.
I'll still continue with the PR here (need to clean-up some of the code/options, I used --no-default-rc everywhere but supposed to only be for one of the folders) and we likely will be able to merge it - as the result will be the same as the current state: every install installs the latest version. But as soon as the bug upstream will be fixed, stuff will start working with no action :p
Any thoughts on the approach proposed in the latest code?
Ok... This is what it looks like when doing it from the release script... Which is indeed where it's needed, as the tgz we use to publish are the artifacts from it - not from release standalone. Confirmed from the artifacts the shrinkwrap files and their content from downloading the build artifacts.
It looks a bit worse because shrinkwrapping tries to alter the yarn.lock - so we have to keep it and replace it again. But seems to be the least-worst way - the other being doing a release:standalone before publishing, which would need to install again all the development
dependencies etc...
Everything looks good to me so I am going to merge it in. Thanks a million!
@edvincent really can't thank you enough - it's rare someone from the community takes on a hard problem, takes a break because life, then comes back to finish it πͺπΌ We appreciate you a ton!
Anytime! I like giving back to the community too, especially for stuff I use daily! Plus I got to learn a LOT about NPM's dependency management, which I enjoyed. And thank to both of you for bearing with me during the (long) process.
You can also thank @bpmct and the nice t-shirt he sent me couple of months ago, it definitely played a role in reminding me I still had this PR pending every time I wore it. π
Will still send a follow-up PR (likely next week) for the documentation to remind folks to use npm rather than yarn.
Also, whilst confirming the fix of https://github.com/coder/code-server/issues/5174#event-7234295102, realized that the bug https://github.com/npm/cli/issues/5325 is actually less of an issue here - because it doesn't seem to be buggy when installing it from a remote repo!
Which means... There now is a way to install code-server with fully-deterministic dependencies, which will be the same whether npm or a packaged version is used. ππ
Awesome news!!
I love the detailed description!
What is used to install is completely meaningless to the end-user. If only, by not using yarn even when invoked with yarn, we can guarantee the deterministic versions for the bundled vscode. Thoughts?
My first thought when I mentioned this was that the user might have
yarn installed but not npm so switching the package manager from
under the user could cause installation failures (npm: command not found or something like that).
Buuuut...is it likely someone has Node installed but not npm? I know some distros package them separately but it still seems unlikely.
The second thought I had was that using the same package manager throughout the entire installation process /feels/ more correct to me. I doubt it would cause issues in practice but it seems like it would be less of a surprise ("Wait why is it calling npm when I ran yarn?").
Ignore this lol GitHub seems to finally be posting comments I made via email months ago