
git: support some form of integrity

Open RoboPhred opened this issue 6 years ago • 3 comments

Currently, pacote has disabled integrity for git repos, which causes npm to re-download then uninstall every git dependency on specific-package installs. This makes npm@5 extremely painful for in-house development.

This was introduced as "fix(git): stop generating integrity for git" in https://github.com/zkat/pacote/commit/d45363b605589dae735f126a3ea19b2bf0165caf, as part of https://github.com/zkat/pacote/pull/127

I have experimentally allowed pacote to generate integrity checksums on a local copy of npm. So far it fixes the issue without apparent side effects, but I am pretty sure I must be missing something. The pull request mentions ongoing issues, but I have been unable to find what this may be in reference to.

The only indication I have found so far is in a comment about shrinkwrap explaining that integrity for tarballs is not consistent, but so far I have not seen this in practice and do not know how to verify it either way.

Can you help me understand why this change was made? I am not sure whether the solution to npm's issue is "pacote should support git integrity" or "npm should allow packages without integrity".

Perhaps an alternative is to use the specific commit reference as the integrity string? Is that string contracted to be a checksum of the files, or will any consistently applied value work?

RoboPhred, Jan 18 '18 14:01

@iarna and I are in the process of figuring out a way forward for git dependencies. This whole thing's a bit tricky, but the crux of it is that, unlike with registry deps, we can't use integrity-based verification for git deps: the integrity string is generated from the packed tarball after a git dependency has been built, and there's literally no guarantee that two different builds off the same git SHA would result in the same integrity string.
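
To illustrate why a rebuilt tarball need not hash the same way, here is a minimal Python sketch. It is not pacote's packing code: the `pack` and `integrity` helpers and the file contents are made up for the example, and file modification time is only one of several things that can differ between builds. It packs identical file contents twice with different timestamps and computes an SRI-style `sha512-<base64>` string for each archive.

```python
import base64
import hashlib
import io
import tarfile


def pack(files: dict, mtime: int) -> bytes:
    """Pack files into an uncompressed in-memory tarball (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime  # the only input that varies between "builds" here
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def integrity(data: bytes) -> str:
    """Return an SRI-style string: algorithm name, a dash, then the base64 digest."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


# Same source files, as if built twice from the same git SHA.
files = {"package/index.js": b"module.exports = 1\n"}

first = integrity(pack(files, mtime=1516284060))
second = integrity(pack(files, mtime=1516284061))  # one second later
repeat = integrity(pack(files, mtime=1516284060))  # byte-identical rebuild

print(first == second)  # differing metadata changes the hash
print(first == repeat)  # identical bytes give an identical hash
```

The commit SHA identifies the source tree, while the integrity string identifies the exact bytes of the packed archive, so the two can diverge whenever the build or packing step is not byte-for-byte reproducible.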

The thing we actually need is a better way to detect whether git deps are acceptably up to date, and that mostly has to do with figuring out the right layers to put the right markers into. @iarna can probably go into more detail than I can, since I think she went off to try to work on this last we spoke about it.

The tl;dr is:

  • package.json has foo/bar#my-branch
  • package-lock.json has git://github.com/foo/bar#deadbeef (resolved SHA)
  • We need a good way to look at node_modules (and package-lock.json) and go "ah, bar was installed using foo/bar#my-branch, and resolved to ...#deadbeef, and that's indeed what's installed, so I don't need to update (package-lock.json or node_modules/bar)".

It sounds easier than it is, I think especially so because npm has to make sure both the node_modules/ version and the package-lock.json version actually correspond, and hopefully do so without hitting the network.
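
The check described in the tl;dr could be sketched roughly as follows. This is only an illustration in Python, not how npm or pacote implement it: the `GitDepRecord` type, the idea that node_modules carries its own record of the requested spec and resolved URL, and the `is_up_to_date` function are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GitDepRecord:
    """Hypothetical marker describing how a git dependency was installed."""
    requested: str  # spec from package.json, e.g. "foo/bar#my-branch"
    resolved: str   # URL pinned to a commit, e.g. "git://github.com/foo/bar#deadbeef"


def commit_of(resolved: str) -> str:
    """Return the part after the last '#', taken here to be the commit SHA."""
    return resolved.rpartition("#")[2]


def is_up_to_date(
    wanted_spec: str,
    lock: Optional[GitDepRecord],
    installed: Optional[GitDepRecord],
) -> bool:
    """Decide, without touching the network, whether a reinstall can be skipped."""
    if lock is None or installed is None:
        return False  # nothing recorded, so the dependency must be fetched
    if lock.requested != wanted_spec or installed.requested != wanted_spec:
        return False  # package.json now asks for something else
    # The lockfile and node_modules must agree on the resolved commit.
    return commit_of(lock.resolved) == commit_of(installed.resolved)


lock = GitDepRecord("foo/bar#my-branch", "git://github.com/foo/bar#deadbeef")
installed = GitDepRecord("foo/bar#my-branch", "git://github.com/foo/bar#deadbeef")
stale = GitDepRecord("foo/bar#my-branch", "git://github.com/foo/bar#cafebabe")

print(is_up_to_date("foo/bar#my-branch", lock, installed))
print(is_up_to_date("foo/bar#my-branch", lock, stale))
print(is_up_to_date("foo/bar#other-branch", lock, installed))
```

Note that this sketch only answers "does what is installed match what was locked"; it deliberately does not answer whether `my-branch` has moved upstream, which would require hitting the network.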

zkat, Jan 19 '18 10:01

While I do understand some problems are harder than they seem, generating and tracking a proper cache for git dependencies is not the main issue here. A cache miss should not result in packages being removed; it should result in packages being fetched 100% of the time.

This is fundamentally broken in a way that makes it extremely painful to work with git dependencies. Why is such behavior preferred rather than making sure every dependency is installed correctly, even if it takes longer?

saboya, Feb 14 '18 15:02

@saboya the cache miss isn't what's causing the removal: it's a bug in the way npm itself recognizes and handles git dependencies.

zkat, Feb 14 '18 16:02