cli icon indicating copy to clipboard operation
cli copied to clipboard

[BUG] Shebang is not valid for project dir and npm pack

Open andrew-aladjev opened this issue 1 year ago • 9 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

This issue exists in the latest npm version

  • [X] I am using the latest npm

Current Behavior

mkdir '/tmp/#'
cd '/tmp/#'
npm pack

Could not read package.json: Error: ENOENT: no such file or directory, open '/tmp/package.json'

It doesn't matter how long dir name will be, if it contains shebang than npm pack tries to read package.json from parent dir.

Expected Behavior

All printable ascii symbols expect / should be valid for project dir.

Steps To Reproduce

No response

Environment

  • npm: 10.8.1
  • Node.js: v20.16.0
  • OS Name: Ubuntu 24.04.1 LTS

andrew-aladjev avatar Sep 12 '24 11:09 andrew-aladjev

PS also it is not possible to work with nested dirs containing shebang:

cd '/tmp/#/b'
npm pack

Could not read package.json: Error: ENOENT: no such file or directory, open '/tmp/package.json'

'#' is valid name for dir. If you have a special list of invalid symbols for dir names please publish it. Thank you.

andrew-aladjev avatar Sep 12 '24 11:09 andrew-aladjev

I've found that # and ? is not valid for any nested dir name. But when I've excluded it I've found another issue.

cd ' !"$%&'\''()*+,-.0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'
npm pack
4 silly config load:file:/tmp/ !"$%&'()*+,-.0123456789:;<=>@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~/.npmrc
5 silly config load:file:/home/puchuu/.npmrc
6 silly config load:file:/home/puchuu/.local/share/fnm/node-versions/v20.16.0/installation/etc/npmrc
7 verbose title npm pack
8 verbose argv "pack"
9 verbose logfile logs-max:10 dir:/home/puchuu/.npm/_logs/2024-09-12T12_18_03_676Z-
10 verbose logfile /home/puchuu/.npm/_logs/2024-09-12T12_18_03_676Z-debug-0.log
11 silly logfile start cleaning logs, removing 1 files
12 verbose stack URIError: URI malformed
12 verbose stack     at decodeURIComponent (<anonymous>)
12 verbose stack     at fromFile (/home/puchuu/.local/share/fnm/node-versions/v20.16.0/installation/lib/node_modules/npm/node_modules/npm-package-arg/lib/npa.js:279:22)

It looks like npm-package-arg is broken. Developers of that package is trying to work with path names using encode and decode URI methods. Of course, it is not possible.

andrew-aladjev avatar Sep 12 '24 12:09 andrew-aladjev

Ok, so we can see that npm is actively misusing URI: regular path is converted to URI without any reasons or validations.

Well, I need a reliable way to detect what is valid alphabet for path and path component. Let's do it:

const printableAlphabet = new Array(95)
  .fill(undefined)
  .map((_value, index) => String.fromCharCode(32 + index))
  .join('');

const pathAlphabet = printableAlphabet.split('').filter((symbol) => encodeURI(symbol) === symbol).join('');
const pathComponentAlphabet = printableAlphabet.split('').filter((symbol) => encodeURIComponent(symbol) === symbol).join('');

Result:

pathAlphabet: !#$&'()*+,-./0123456789:;=?@ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~
pathComponentAlphabet: !'()*-.0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~

The complete workaround looks as follows:

export const PRINTABLE_ALPHABET = new Array(95)
  .fill(undefined)
  .map((_symbol, index) => String.fromCharCode(32 + index))
  .join('');

export const PATH_ALPHABET = PRINTABLE_ALPHABET;
export const PATH_COMPONENT_ALPHABET = PATH_ALPHABET.replaceAll('/', '');

export const URI_ALPHABET = PATH_ALPHABET.split('')
  .filter((symbol) => encodeURI(symbol) === symbol)
  .join('');
export const URI_COMPONENT_ALPHABET = PATH_COMPONENT_ALPHABET.split('')
  .filter((symbol) => encodeURIComponent(symbol) === symbol)
  .join('');

// Many programs such as NPM may misuse URI for working with regular path and path components.
// So it is recommended to use URI alphabets for path and path components.

export const RECOMMENDED_PATH_ALPHABET = URI_ALPHABET;
export const RECOMMENDED_PATH_COMPONENT_ALPHABET = URI_COMPONENT_ALPHABET;

I hope it will help to workaround this issue.

andrew-aladjev avatar Sep 12 '24 13:09 andrew-aladjev

Hey @andrew-aladjev 👋

  • I don't believe this is an issue with pack / naming folders
  • This is happening because you don't have a package.json
  • There is no "not valid project dir"
  • If you run npm init -y the folder name needs to be a valid package name, because the package.name is set to the folder (thats the only important time a folder name matters)
  • I tested with a folder named "#/index.js" within the pack as long as you have a valid package.json the pack should work
  • feel free to repopen

reggi avatar Sep 20 '24 15:09 reggi

I've tested and it fails.

mkdir 'a#'
cd 'a#'
npm init
cat package.json
{
  "name": "test",
  "version": "1.0.0",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "description": ""
}
npm pack
npm error Invalid package, must have name and version
cd ..
mv 'a#' ab
cd ab
npm pack
test-1.0.0.tgz

Please reopen this issue. Maybe you didn't catch it. I mean shebang is not valid for project dir. Project dir is a storage dir for your current project, not some folder inside project.

andrew-aladjev avatar Sep 20 '24 16:09 andrew-aladjev

Ah I see whats happening now, the # is being parsed as a url fragment here https://github.com/npm/npm-package-arg/blob/main/lib/npa.js#L247

This code shouldn't be parsing paths as urls.

reggi avatar Sep 20 '24 19:09 reggi

I believe we support file paths as url to abide by a web standards that fit better with the inner workings of couch or s3 to consistently store and retrieve packages.

The "file" URI Scheme specification defined in RFC 8089 doesn't list out all characters not allowed in the path

https://datatracker.ietf.org/doc/html/rfc8089#appendix-B

File systems typically assign an operational meaning to special characters, such as the "/", "", ":", "[", and "]" characters, and to special device names like ".", "..", "...", "aux", "lpt", etc. In some cases, merely testing for the existence of such a name will cause the operating system to pause or invoke unrelated system calls, leading to significant security concerns regarding denial of service and unintended data transfer. It would not be possible for this specification to list all such significant characters and device names. Implementers should research the reserved names and characters for the types of storage devices that may be attached to their application and restrict the use of data obtained from URI components accordingly.

You can check if a path is valid with a simple function like this:

const checkPath = path => new URL('file:', `file://${path}`).pathname === path

I'd be open to having a better error message here as opposed to jumping up one directory looking for a package.json and ignoring the invalid directory name with the #. But I'm afraid the underlying behavior is apart of the spec.

reggi avatar Sep 23 '24 14:09 reggi

The URL spec doesn't have any bearing on file paths, though - file paths are their own (better) thing, and npm works with files.

ljharb avatar Sep 23 '24 15:09 ljharb

Hello everyone. Honestly speaking, URL or URI has nothing to do with file system pathes.

File system section in RFC 8089 is related to remote file systems, for example: ftp://dir/file, smb://dir/file.txt, etc.

I think, it is especially true for situation we have today: NPM should not use URL or URI for handling arguments.

andrew-aladjev avatar Sep 23 '24 16:09 andrew-aladjev

This is fixed by PR https://github.com/npm/cli/pull/8115

milaninfy avatar Feb 19 '25 17:02 milaninfy