[BUG] Shebang is not valid for project dir and npm pack
Is there an existing issue for this?
- [X] I have searched the existing issues
This issue exists in the latest npm version
- [X] I am using the latest npm
Current Behavior
mkdir '/tmp/#'
cd '/tmp/#'
npm pack
Could not read package.json: Error: ENOENT: no such file or directory, open '/tmp/package.json'
It doesn't matter how long dir name will be, if it contains shebang than npm pack tries to read package.json from parent dir.
Expected Behavior
All printable ascii symbols expect / should be valid for project dir.
Steps To Reproduce
No response
Environment
- npm: 10.8.1
- Node.js: v20.16.0
- OS Name: Ubuntu 24.04.1 LTS
PS also it is not possible to work with nested dirs containing shebang:
cd '/tmp/#/b'
npm pack
Could not read package.json: Error: ENOENT: no such file or directory, open '/tmp/package.json'
'#' is valid name for dir. If you have a special list of invalid symbols for dir names please publish it. Thank you.
I've found that # and ? is not valid for any nested dir name. But when I've excluded it I've found another issue.
cd ' !"$%&'\''()*+,-.0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'
npm pack
4 silly config load:file:/tmp/ !"$%&'()*+,-.0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~/.npmrc
5 silly config load:file:/home/puchuu/.npmrc
6 silly config load:file:/home/puchuu/.local/share/fnm/node-versions/v20.16.0/installation/etc/npmrc
7 verbose title npm pack
8 verbose argv "pack"
9 verbose logfile logs-max:10 dir:/home/puchuu/.npm/_logs/2024-09-12T12_18_03_676Z-
10 verbose logfile /home/puchuu/.npm/_logs/2024-09-12T12_18_03_676Z-debug-0.log
11 silly logfile start cleaning logs, removing 1 files
12 verbose stack URIError: URI malformed
12 verbose stack at decodeURIComponent (<anonymous>)
12 verbose stack at fromFile (/home/puchuu/.local/share/fnm/node-versions/v20.16.0/installation/lib/node_modules/npm/node_modules/npm-package-arg/lib/npa.js:279:22)
It looks like npm-package-arg is broken. Developers of that package is trying to work with path names using encode and decode URI methods. Of course, it is not possible.
Ok, so we can see that npm is actively misusing URI: regular path is converted to URI without any reasons or validations.
Well, I need a reliable way to detect what is valid alphabet for path and path component. Let's do it:
const printableAlphabet = new Array(95)
.fill(undefined)
.map((_value, index) => String.fromCharCode(32 + index))
.join('');
const pathAlphabet = printableAlphabet.split('').filter((symbol) => encodeURI(symbol) === symbol).join('');
const pathComponentAlphabet = printableAlphabet.split('').filter((symbol) => encodeURIComponent(symbol) === symbol).join('');
Result:
pathAlphabet: !#$&'()*+,-./0123456789:;=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~
pathComponentAlphabet: !'()*-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~
The complete workaround looks as follows:
export const PRINTABLE_ALPHABET = new Array(95)
.fill(undefined)
.map((_symbol, index) => String.fromCharCode(32 + index))
.join('');
export const PATH_ALPHABET = PRINTABLE_ALPHABET;
export const PATH_COMPONENT_ALPHABET = PATH_ALPHABET.replaceAll('/', '');
export const URI_ALPHABET = PATH_ALPHABET.split('')
.filter((symbol) => encodeURI(symbol) === symbol)
.join('');
export const URI_COMPONENT_ALPHABET = PATH_COMPONENT_ALPHABET.split('')
.filter((symbol) => encodeURIComponent(symbol) === symbol)
.join('');
// Many programs such as NPM may misuse URI for working with regular path and path components.
// So it is recommended to use URI alphabets for path and path components.
export const RECOMMENDED_PATH_ALPHABET = URI_ALPHABET;
export const RECOMMENDED_PATH_COMPONENT_ALPHABET = URI_COMPONENT_ALPHABET;
I hope it will help to workaround this issue.
Hey @andrew-aladjev 👋
- I don't believe this is an issue with pack / naming folders
- This is happening because you don't have a package.json
- There is no "not valid project dir"
- If you run
npm init -ythe folder name needs to be a valid package name, because the package.name is set to the folder (thats the only important time a folder name matters) - I tested with a folder named "#/index.js" within the pack as long as you have a valid package.json the pack should work
- feel free to repopen
I've tested and it fails.
mkdir 'a#'
cd 'a#'
npm init
cat package.json
{
"name": "test",
"version": "1.0.0",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"author": "",
"license": "ISC",
"description": ""
}
npm pack
npm error Invalid package, must have name and version
cd ..
mv 'a#' ab
cd ab
npm pack
test-1.0.0.tgz
Please reopen this issue. Maybe you didn't catch it. I mean shebang is not valid for project dir. Project dir is a storage dir for your current project, not some folder inside project.
Ah I see whats happening now, the # is being parsed as a url fragment here https://github.com/npm/npm-package-arg/blob/main/lib/npa.js#L247
This code shouldn't be parsing paths as urls.
I believe we support file paths as url to abide by a web standards that fit better with the inner workings of couch or s3 to consistently store and retrieve packages.
The "file" URI Scheme specification defined in RFC 8089 doesn't list out all characters not allowed in the path
https://datatracker.ietf.org/doc/html/rfc8089#appendix-B
File systems typically assign an operational meaning to special characters, such as the "/", "", ":", "[", and "]" characters, and to special device names like ".", "..", "...", "aux", "lpt", etc. In some cases, merely testing for the existence of such a name will cause the operating system to pause or invoke unrelated system calls, leading to significant security concerns regarding denial of service and unintended data transfer. It would not be possible for this specification to list all such significant characters and device names. Implementers should research the reserved names and characters for the types of storage devices that may be attached to their application and restrict the use of data obtained from URI components accordingly.
You can check if a path is valid with a simple function like this:
const checkPath = path => new URL('file:', `file://${path}`).pathname === path
I'd be open to having a better error message here as opposed to jumping up one directory looking for a package.json and ignoring the invalid directory name with the #. But I'm afraid the underlying behavior is apart of the spec.
The URL spec doesn't have any bearing on file paths, though - file paths are their own (better) thing, and npm works with files.
Hello everyone. Honestly speaking, URL or URI has nothing to do with file system pathes.
File system section in RFC 8089 is related to remote file systems, for example: ftp://dir/file, smb://dir/file.txt, etc.
I think, it is especially true for situation we have today: NPM should not use URL or URI for handling arguments.
This is fixed by PR https://github.com/npm/cli/pull/8115