cli icon indicating copy to clipboard operation
cli copied to clipboard

[BUG] `npm pack` doesn't match directory prefiex (`dist-*` doesn't match `dist-cjs/index.js`)

Open everett1992 opened this issue 1 year ago • 20 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

This issue exists in the latest npm version

  • [X] I am using the latest npm

Current Behavior

With npm 10 files: ["dist-*"] does not include dist-es or dist-es/index.js. This is changed from npm 8, not clearly documented, and inconsistent with other patterns.

Expected Behavior

I expect the same behavior as npm 8. I don't see anything in the change logs that would explain this difference.

https://docs.npmjs.com/cli/v9/using-npm/changelog#%EF%B8%8F-breaking-changes-2

npm pack now follows a strict order of operations when applying ignore rules. If a files array is present in the package.json, then rules in .gitignore and .npmignore files from the root will be ignored.

This bug occurs when there's only a files array, there's no .gitignore or .npmignore to ignore (or respect in npm 8)

Steps To Reproduce

Setup a test project

.
├── dist-cjs
│   └── index.js
├── dist-es
│   └── index.js
├── dist-other.js
└── package.json
# package.json
{ "name": "test", "version": "1.0.0" }

This test script will print the differences between files included by npm 8 and 10

#!/bin/bash
set -euo pipefail
patterns=('null'  '[]' '["*"]' '["dist-*"]' '["dist-**"]' '["dist-*/*.js"]')
for files in "${patterns[@]}"; do
  npm pkg set files="$files" --json
  echo "8 ($files)	10 ($files)	both"
  comm \
    <(npx npm@8 pack --dry-run --json | jq '.[].files[].path' -r) \
    <(npx npm@10 pack --dry-run --json | jq '.[].files[].path' -r)
done

Results

8 (null)            | 10 (null)            | both
                    |                      | dist-cjs/index.js
                    |                      | dist-es/index.js
                    |                      | dist-other.js
                    |                      | package.json
8 ([])              | 10 ([])              | both
                    |                      | package.json
8 (["*"])           | 10 (["*"])           | both
                    |                      | dist-cjs/index.js
                    |                      | dist-es/index.js
                    |                      | dist-other.js
                    |                      | package.json
8 (["dist-*"])      | 10 (["dist-*"])      | both
dist-cjs/index.js   |                      |
dist-es/index.js    |                      |
                    |                      | dist-other.js
                    |                      | package.json
8 (["dist-*/"])     | 10 (["dist-*/"])     | both
dist-cjs/index.js   |                      |
dist-es/index.js    |                      |
                    |                      | package.json
8 (["dist-**"])     | 10 (["dist-**"])     | both
dist-cjs/index.js   |                      |
dist-es/index.js    |                      |
                    |                      | dist-other.js
                    |                      | package.json
8 (["dist-**/"])    | 10 (["dist-**/"])    | both
dist-cjs/index.js   |                      |
dist-es/index.js    |                      |
                    |                      | package.json
8 (["dist-*/*.js"]) | 10 (["dist-*/*.js"]) | both
                    |                      | dist-cjs/index.js
                    |                      | dist-es/index.js
                    |                      | package.json

Environment

  • npm: 10.7.0 (vs 8.19.4)
  • Node.js: 22.00
  • OS Name: Ubuntu
  • System Model Name: N/A
  • npm config:
; node bin location =~/.local/share/rtx/installs/node/22.0.0/bin/node
; node version = v22.0.0
; npm local prefix = /tmp/npm-pack-test
; npm version = 10.5.1
; cwd = /tmp/npm-pack-test
; HOME = ~
; Run `npm config ls -l` to show all defaults.

everett1992 avatar May 13 '24 01:05 everett1992

It seems like npm 10 will only treat exactly * as a directory wildcard, so * or lib/* will match directories, but dist-* or *.ts will only match files.

everett1992 avatar May 13 '24 01:05 everett1992

Okay, there's a closed npm/cli issue saying it's fixed https://github.com/npm/cli/issues/6164, and a npm-packlist issue tracking other packlist changes https://github.com/npm/npm-packlist/issues/152

The fix they reference https://github.com/npm/npm-packlist/pull/147 suggests that packlist should treat * the same as **, however dist-**/ doesn't match dist-es/index.js either.

everett1992 avatar May 13 '24 01:05 everett1992

interestingly dist-*/* does match and has the same packlist between 8 and 10.

everett1992 avatar May 13 '24 01:05 everett1992

Yes it's not very well documented, but the way "globstar" is treated got aligned more with standard glob implementations quite some time ago. Not sure what the long term solution will be here, obviously trying to "patch" back the old behavior has been buggy. Your best bet is to fix your globs.

wraithgar avatar May 13 '24 14:05 wraithgar

Added a note to our v11 roadmap to unwind all of these hacks.

wraithgar avatar May 13 '24 14:05 wraithgar

I'm in the position of upgrading the company wide npm version, this is one of the few changes that's causing wide spread differences. It's mainly a matter of documenting the change and explaining how to write equivalent npm 10 patterns.

I thin npm 8's ["dist-*"] needs to become ["dist-*", "dist-*/**"]

everett1992 avatar May 13 '24 15:05 everett1992

What does 'unwind all of these hacks' mean? Make it more like npm 8, or make it ' aligned more with standard glob implementations'?

IMO npm 8's behavior is not unaligned with standard globs. dist-* should include dist-cjs/index.js for the same reason that dist-cjs includes dist-cjs/index.js in npm 10 - it matched dist-cjs folder and recursively included everything in it.

An analogy, echo shows that dist-* only matches direct children

$ echo dist-*
dist-cjs dist-es dist-other.js

But when you give those arguments to find it recurses and finds the descendants.

$ find dist-*
dist-cjs
dist-cjs/index.js
dist-cjs/lib
dist-cjs/lib/index.js
dist-es
dist-es/index.js
dist-other.js

Here's another example comparing npm 8 and 10, this time without using * patterns. The directory pattern dist-es matches the files in the dir in both versions. But passing the brace pattern dist-{es,cjs} which expands to "dist-es", "dist-cjs" only matches files in the directories in npm 8.

8 (["dist-es"])       | 10 (["dist-es"])       | both
                      |                        | dist-es/index.js
                      |                        | package.json
8 (["dist-{es,cjs}"]) | 10 (["dist-{es,cjs}"]) | both
dist-cjs/index.js     |                        |
dist-cjs/lib/index.js |                        |
dist-es/index.js      |                        |
                      |                        | package.json

everett1992 avatar May 13 '24 17:05 everett1992

What does 'unwind all of these hacks' mean?

It means no longer work around the default node-glob behavior. The decision was made at the time to try to work around them because the changes weren't well documented at the time. In hindsight this was probably less ideal than adding notes to the changelog to let folks know about the new behavior.

wraithgar avatar May 13 '24 18:05 wraithgar

For what it's worth find does not use "glob".

ls does. Compare the difference between ls * and ls */*

wraithgar avatar May 13 '24 18:05 wraithgar

That is my point, dist-* expands to dist-es and dist-cjs, then npm checks those directories recursively, like find does. The expansion is flat and doesn't rely on globstar behavior.

files: ["dist-es", "dist-cjs"] will include dist-es/index.js, but files: ["dist-{es,cjs}"] won't, even tho it expands to the same thing.

I think npm 8's pattern matching was good. Requiring explicit recursive patterns would be okay. dist-cjs being recursive while dist-* is not, is bad.

  • dist-* and dist-es both matching dist-es/index.js is good (npm 8)
  • dist-es matching dist-es/index.js while dist-* does is bad (npm 10)
  • dist-* and dist-es not matching dist-es/index.js is ok (maybe what you're proposing for npm 11?)
  • dist-*/** and dist-es/** both matching dist-es/index.js is ok (maybe what you're proposing for npm 11?)

everett1992 avatar May 13 '24 22:05 everett1992

Yes those last two would be what npm 11 does.

wraithgar avatar May 14 '24 14:05 wraithgar

Will those changes apply to .gitignore and .npmignore?

So .npmignore build won't exclude every file under build in npm 11?

File patterns follow a similar syntax to .gitignore docs.npmjs.com

everett1992 avatar May 14 '24 18:05 everett1992

build should absolutely match both build and build/**/*, just like it does for git.

ljharb avatar May 14 '24 18:05 ljharb

So will there be different pattern rules for .npmignore and files? dist-* in .gitignore or .npmignore will exclude everything under dist-cjs/, while dist-* in package.json files would only include a literal file like dist-other.js while not including dist-cjs/ directories contents?

I much prefer npm 8's and .gitignore's "if a pattern matches a directory, it's files are matched" behavior to the proposed "a files pattern must exactly match the path to a file".

everett1992 avatar May 14 '24 19:05 everett1992

These questions show just how wanting we are for a proper spec here. I don't have answers today. The way it works now is not working. That's the only answer I have today.

wraithgar avatar May 14 '24 19:05 wraithgar

If you look through the history of especially map-workspaces and npm-packlist you'll see just how buggy the changes from glob have made npm. We have to work around the new behavior in every instance now, and doing it like that is almost guaranteed to produce bugs.

wraithgar avatar May 14 '24 19:05 wraithgar

I'm happy to help out with spec/doc/test writing or discussion.

Not sure if this falls under the same umbrella, but I found that symlinks are treated differently between 8 and 10. LMK if I should create a different issue.

├── link -> /tmp/target
│   ├── dist
│   │   └── index.js
│   └── index.js
~/test-npm-pack '["link"]' '["link/dist"]' '["link/*"]' | column -t -s '   ' -o ' | '
8 (["link"])       | 10 (["link"])      | both
                   |                    | package.json

Neither 8 or 10 will recursively include files from a symlink that matches the pattern.

8 (["link/dist"])  | 10 (["link/dist"]) | both
link/dist/index.js |                    |
                   |                    | package.json

npm 8 will match files inside a link, as long as the pattern reaches past the link, but npm 10 won't.

8 (["link/*"])     | 10 (["link/*"])    | both
link/dist/index.js |                    |
link/index.js      |                    |
                   |                    | package.json

npm 8 will also work with patterns, I think this is the same behavior as before.

8 (["link/dist/index.js"]) | 10 (["link/dist/index.js"]) | both
link/dist/index.js         |                             |
                           |                             | package.json

npm 10 won't match a file under a symlink even when the exact path is given.

The behavior is the same with an internal link, like link -> ./dist-es. I'm less sure about what the right behavior is here.

everett1992 avatar May 14 '24 23:05 everett1992

A co worker found an odd behavior with directories named cvs. In this case I can't explain how either npm 8 or 10 is behaving. I understand the CVS is ignored by default, but it's not on the list of files that can't be included (docs)

  • npm 10 will include the cvs directory with "cvs" patterns, but 8 won't.
  • Neither version will include dist/cvs directory with the dist/cvs pattern.
  • npm 8 will include dist/cvs/index.js with dist/cvs/index.js pattern but npm 10 won't.
  • Everything seems case insensitive and CVS will match cvs and vice versa.
.
├── cvs
│   └── index.js
├── CVS
│   └── index.js
├── dist
│   ├── cvs
│   │   └── index.js
│   └── CVS
│       └── index.js
└── package.json
8 (["cvs"])               | 10 (["cvs"])               | both
                          | cvs/index.js               |
                          | CVS/index.js               |
                          |                            | package.json
8 (["CVS"])               | 10 (["CVS"])               | both
                          | cvs/index.js               |
                          | CVS/index.js               |
                          |                            | package.json
8 (["dist/cvs"])          | 10 (["dist/cvs"])          | both
                          |                            | package.json
8 (["dist/CVS"])          | 10 (["dist/CVS"])          | both
                          |                            | package.json
8 (["cvs/index.js"])      | 10 (["cvs/index.js"])      | both
                          |                            | cvs/index.js
                          |                            | CVS/index.js
                          |                            | package.json
8 (["CVS/index.js"])      | 10 (["CVS/index.js"])      | both
                          |                            | cvs/index.js
                          |                            | CVS/index.js
                          |                            | package.json
8 (["dist/cvs/index.js"]) | 10 (["dist/cvs/index.js"]) | both
dist/cvs/index.js         |                            |
dist/CVS/index.js         |                            |
                          |                            | package.json
8 (["dist/CVS/index.js"]) | 10 (["dist/CVS/index.js"]) | both
dist/cvs/index.js         |                            |
dist/CVS/index.js         |                            |
                          |                            | package.json

npm-debug.log is also inconsistent, and differently than cvs.

.
├── dist
│   └── npm-debug.log
│       └── index.js
├── npm-debug.log
│   ├── index.js
│   └── npm-debug.log
└── package.json
8 (["npm-debug.log"])       | 10 (["npm-debug.log"])      | both
                            |                             | npm-debug.log/index.js
npm-debug.log/npm-debug.log |                             |
                            |                             | package.json
8 (["dist/npm-debug.log"])  | 10 (["dist/npm-debug.log"]) | both
dist/npm-debug.log/index.js |                             |
                            |                             | package.json
8 (["**/npm-debug.log"])    | 10 (["**/npm-debug.log"])   | both
dist/npm-debug.log/index.js |                             |
npm-debug.log/index.js      |                             |
npm-debug.log/npm-debug.log |                             |
                            |                             | package.json

everett1992 avatar May 17 '24 18:05 everett1992

https://github.com/npm/npm-packlist/blob/cb4a823cd42d50475a8e1e7582b95b15766f5ca2/lib/index.js#L25

wraithgar avatar May 17 '24 19:05 wraithgar

It's on the default ignore list, but it's not on the "can't include" list.

Why doesn't **/CVS/** include dist/CVS/index.js in npm 10? Why doesn't dist/CVS include dist/CVS/index.js in npm 10?

.
├── CVS
│   └── index.js
├── dist
│   └── CVS
│       └── index.js
└── package.json
8 (["**/CVS/**"]) | 10 (["**/CVS/**"]) | both
                  |                    | CVS/index.js
dist/CVS/index.js |                    |
                  |                    | package.json
8 (["CVS"])       | 10 (["CVS"])       | both
                  | CVS/index.js       |
                  |                    | package.json
8 (["dist/CVS"])  | 10 (["dist/CVS"])  | both
                  |                    | package.json

everett1992 avatar May 17 '24 19:05 everett1992