node icon indicating copy to clipboard operation
node copied to clipboard

add `path.mimeType()` and `path.charset()` to `node:path`

Open aralroca opened this issue 1 year ago • 2 comments
trafficstars

What is the problem this feature will solve?

Adding path.mimeType() and path.charset will solve the dependency on an external library, mime-types, to get the content type of a file based on its extension. This can be done with Node.js itself, just as Bun.js does with Bun.file(file).type.

What is the feature you are proposing to solve the problem?

I am proposing to add two new methods to the path module:

  • mimeType - returns the MIME type of a file based on its extension.
  • charset - returns the charset of a MIME type.
path.mimeType('file.js'); // returns 'application/javascript'
// or:
path.mimeType(path.extname('file.js')); // returns 'application/javascript'

And:

path.charset('file.js'); // returns 'utf-8'
// or:
path.charset(path.mimeType('file.js')); // returns 'utf-8'

What alternatives have you considered?

Alternatively, instead of being in node:path, it could be in fs.stat, but I think that in the end it can be extracted from the path without having to parse the file, besides it is better to use functions for this and the work is done only if they are executed individually.

CC: @vdeturckheim

aralroca avatar Aug 27 '24 18:08 aralroca

For path.mimeType(...) a couple of questions...

  1. Would you expect the actual path to be resolved and for the mime type to be sniffed from the contents or based entirely on the file extension?
  2. Which master list of mime types to file extensions do you propose we use and what is the source of that? How is it maintained.

For path.charset('...') ... the only way to accomplish this is to actually try reading the file. I don't believe that belongs in the path module but possibly in fs. Even then it can be a bit problematic to reliably determine the encoding of a file. Would it be based entirely on the presence of a BOM? Would it be limited to just the charset encodings we already know of (e.g. utf8, utf16le, etc)? Given cases where the charset is ambiguous (e.g. utf8 without a bom that only contains ascii characters could be utf8, ascii, or latin1), what would be the expected result?

jasnell avatar Aug 27 '24 18:08 jasnell

Personally, I'm -1 on adding this to core.

Mimetypes can be easily done in userland using a Map<extension, mime-type>

Charsets require reading the file, which is not something the path module does, and I don't think it is something the path module should.

avivkeller avatar Aug 27 '24 18:08 avivkeller

  1. Would you expect the actual path to be resolved and for the mime type to be sniffed from the contents or based entirely on the file extension?
  2. Which master list of mime types to file extensions do you propose we use and what is the source of that? How is it maintained.

For path.charset('...') ... the only way to accomplish this is to actually try reading the file. I don't believe that belongs in the path module but possibly in fs. Even then it can be a bit problematic to reliably determine the encoding of a file. Would it be based entirely on the presence of a BOM? Would it be limited to just the charset encodings we already know of (e.g. utf8, utf16le, etc)? Given cases where the charset is ambiguous (e.g. utf8 without a bom that only contains ascii characters could be utf8, ascii, or latin1), what would be the expected result?

@jasnell It is not clear to me that it is necessary to look at the contents of the file. The chatset can be extracted from the mimeType, and the mimeType from the file extension. Nowadays the libraries that support this make use of this JSON reference: https://github.com/jshttp/mime-db/blob/master/db.json

Personally, I'm -1 on adding this to core.

Mimetypes can be easily done in userland using a Map<extension, mime-type>

Charsets require reading the file, which is not something the path module does, and I don't think it is something the path module should.

@RedYetiDev as I commented the current libraries to do this extract the charset depending on the mimeType without looking the internal content. This is very useful for frameworks to serve dynamic assets and put the Content-Type in a fast way. If the file has to be read it would lose efficiency if you want to serve the file in streaming.

And about the mapping, yes, can be done via a Map, but this Map is huge, and it doesn't make sense to depend on a library to maintain this JSON, it is better that it is part of the runtime itself IMO, besides that in the end it is adding dependencies for basic things.

aralroca avatar Aug 30 '24 14:08 aralroca

(CC @nodejs/path)


(While I am not a core collaborator) I still strongly disagree with the addition of this API into the core of Node.js. While I do greatly appreciate this issue being opened, I have some major concerns with this API. Here are a few of them:

1. Maintainability

If Node.js were to implement this feature, the project would need to take on the significant responsibility of maintaining a constantly evolving and extensive map from file extensions to character sets and MIME types. Additionally, the project would have to handle edge cases, conflicts, and potential ambiguities that may arise due to overlapping file extensions or varying interpretations of MIME type specifications across different platforms and software.

2. Redundancy with Existing Ecosystem

The Node.js ecosystem already includes well-established modules like mime and mime-types that effectively manage MIME type and charset detection. These modules are actively maintained by the community and provide flexibility and extensibility beyond what a core implementation could offer. Introducing path.mimeType() and path.charset() would duplicate functionality that is already readily available and well-supported, leading to redundancy and possibly fragmenting the ecosystem.

3. Performance Concerns

Implementing and maintaining a reliable MIME type and charset detection mechanism within the Node.js core could introduce performance overhead, particularly in cases where the detection process is complex or requires frequent updates to a large mapping table. Even if the performance impact is minimal, the added complexity in the core could have unintended side effects on the overall performance of the Node.js runtime.

avivkeller avatar Aug 30 '24 14:08 avivkeller

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the https://github.com/nodejs/node/labels/never-stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document.

github-actions[bot] avatar Feb 27 '25 01:02 github-actions[bot]

By the way, if it helps in the future when this conversation comes up again, mime-db doesn't strictly follow semver since the records can change (see https://github.com/jshttp/mime-db/issues/331, https://github.com/jshttp/mime-db/pull/330) and break certain applications.

Just my two cents, but even though other runtimes have it implemented, I don't think Node.js should implement it.

bjohansebas avatar Mar 06 '25 01:03 bjohansebas

The only thing that makes me think mime-db should be internalized into node's api is that you can't actually write an http server that works correctly with modern browsers without adding a content-type header, and this will require either having our own db, or using npm. Other than that, a core principle of node.http seems to have been, for at least a while now, to be a kitchen sink. A mime db is one of the most glaring omissions, and the only one I'm aware of.

sdegutis avatar Apr 21 '25 16:04 sdegutis

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the https://github.com/nodejs/node/labels/never-stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document.

github-actions[bot] avatar Oct 19 '25 01:10 github-actions[bot]

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

github-actions[bot] avatar Nov 19 '25 01:11 github-actions[bot]