file-type
Allow disabling unsafe types?
There are a few file types that are marked as "unsafe", and looking at the detection signatures I assume it's because the check is somewhat simple/naive. I have a glTF file that is being detected as an ico, for instance.
Would it be possible to either:
a) Be able to disable the unsafe types (fromBuffer(buf, {allowUnsafe: false}) or similar)
b) Return an unsafe: true in the return value for the unsafe types
c) Expose an array or similar that declares the unsafe types so we can exclude them on the consumer side
The issue with c) is that some of the unsafe file types do have more "safe" routes (like mpg).
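To make options a) and b) concrete, here is a minimal sketch of how they could behave. It is a toy detector, not file-type's actual code: the `SIGNATURES` table, the `detect` function and the `unsafe` field are hypothetical names used only for illustration, and `allowUnsafe` is the option name proposed above, assumed here to default to `true`.

```javascript
// Hypothetical signature table, for illustration only (not file-type's internals).
const SIGNATURES = [
  // Long, distinctive signature: unlikely to match by accident.
  {ext: 'png', mime: 'image/png', magic: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A], unsafe: false},
  // Short signature made of very common bytes: prone to false positives.
  {ext: 'ico', mime: 'image/x-icon', magic: [0x00, 0x00, 0x01, 0x00], unsafe: true},
];

// True when `buf` begins with the bytes in `magic`.
function startsWith(buf, magic) {
  return buf.length >= magic.length && magic.every((byte, i) => buf[i] === byte);
}

// Option a): `allowUnsafe: false` skips the signatures flagged as unsafe.
// Option b): the result carries an `unsafe` flag so the caller can decide.
function detect(buf, {allowUnsafe = true} = {}) {
  for (const sig of SIGNATURES) {
    if (sig.unsafe && !allowUnsafe) {
      continue;
    }
    if (startsWith(buf, sig.magic)) {
      return {ext: sig.ext, mime: sig.mime, unsafe: sig.unsafe};
    }
  }
  return undefined;
}

const icoLike = Uint8Array.from([0x00, 0x00, 0x01, 0x00, 0x01, 0x00]);
console.log(detect(icoLike)); // matched as 'ico', flagged unsafe
console.log(detect(icoLike, {allowUnsafe: false})); // no match reported
```

With b) alone the consumer still gets a result and has to check the flag; with a) the weak match is never reported at all.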
I did not immediately understand what you meant by unsafe signatures, but I get it now: they are the signatures beyond this point:
https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L1179
These are much more likely to be false positives.
Yep, apologies for the vagueness. I've updated the original issue with a link.
Would you be willing to accept a PR? Any preferred approach?
I had a look at the MPEG-1 detection based on your feedback.
https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L1181-L1189
some of the unsafe file types do have more "safe" routes (like mpg).
I don't think there is an alternative detection for that present in the code.
I actually looked into making it safer. From the unofficial specs, it looks like the current filter is actually already too narrow (at least on the bytes it is testing). It is currently too specific on the stream-id.
a) Be able to disable the unsafe types (fromBuffer(buf, {allowUnsafe: false}) or similar)
I think it makes sense to optionally exclude likely false positives (the so-called unsafe signatures). I would prefer includeUnsafe.
b) Return an unsafe: true in the return value for the unsafe types
The risk of false positives exists for every signature, and is higher for some than for others. But it also makes sense to indicate when the likelihood of a false positive is significantly higher.
c) Expose an array or similar that declares the unsafe types so we can exclude them on the consumer side
I am not a big fan of exposing potential outcomes as part of the API to begin with.
Additionally:
- We could try these file type (signature) recognition
- We could move the unsafe signatures further down, to give precedence to the safer, longer signatures.
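The reordering idea can be sketched as a sort over a signature table. This is only an illustration under assumptions: file-type checks signatures in hand-written code rather than from a table like this, and the `signatures` array, its `unsafe` flag and the `byPrecedence` comparator are hypothetical names.

```javascript
// Hypothetical signature table, deliberately listed with weak signatures first.
const signatures = [
  {ext: 'ico', magic: [0x00, 0x00, 0x01, 0x00], unsafe: true},
  {ext: 'mpg', magic: [0x00, 0x00, 0x01, 0xBA], unsafe: true},
  {ext: 'glb', magic: [0x67, 0x6C, 0x54, 0x46], unsafe: false}, // ASCII "glTF"
  {ext: 'png', magic: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A], unsafe: false},
];

// Safe signatures sort before unsafe ones; within each group,
// longer (more specific) signatures sort before shorter ones.
function byPrecedence(a, b) {
  if (a.unsafe !== b.unsafe) {
    return a.unsafe ? 1 : -1;
  }
  return b.magic.length - a.magic.length;
}

// Copy before sorting so the original table is left untouched.
const ordered = [...signatures].sort(byPrecedence);
console.log(ordered.map(sig => sig.ext));
```

Ordering only changes the outcome when more than one signature matches the same input, so it complements an opt-out for unsafe signatures and does not replace one.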
Related to unsafe types: https://github.com/Borewit/peek-readable/issues/356#issuecomment-902900156
pointing out that it is also guessing the file type here:
https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L176-L180