Allow disabling unsafe types?

Open rexxars opened this issue 4 years ago • 4 comments

There are a few file types that are marked as "unsafe", and looking at the detection signatures I assume it's because the check is somewhat simple/naive. For instance, I have a glTF file that is being detected as an ico.

Would it be possible to either:

a) Be able to disable the unsafe types (fromBuffer(buf, {allowUnsafe: false}) or similar)
b) Return an unsafe: true in the return value for the unsafe types
c) Expose an array or similar that declares the unsafe types so we can exclude them on the consumer side

The issue with c) is that some of the unsafe file types do have more "safe" routes (like mpg).
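To illustrate option c) and its drawback, here is a minimal sketch of consumer-side filtering. The `unsafeTypes` set and the `dropUnsafe` wrapper are hypothetical names for illustration; nothing like them is confirmed to exist in file-type.

```javascript
// Hypothetical list a library could expose for option c); the name and
// contents are assumptions for illustration only.
const unsafeTypes = new Set(['ico', 'mpg']);

// Consumer-side wrapper: discard any result whose extension is on the list.
// `result` has the shape {ext, mime}, or is undefined when nothing matched.
function dropUnsafe(result) {
	if (result && unsafeTypes.has(result.ext)) {
		return undefined;
	}

	return result;
}

// The wrapper only sees the extension, not which signature produced it,
// so an 'mpg' result is dropped even if a stricter check had matched.
console.log(dropUnsafe({ext: 'ico', mime: 'image/x-icon'})); // undefined
console.log(dropUnsafe({ext: 'png', mime: 'image/png'})); // the png result, unchanged
```

Because the filter works on the extension alone, it cannot tell a weak match from a strong one, which is the limitation described above.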

Aug 16 '21 13:08 rexxars

I did not immediately understand what you meant by unsafe signatures, but I get it now. They are the signatures beyond this point:

https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L1179

these are much more likely to be false positives.
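As a rough illustration of why short signatures are prone to false positives, the sketch below compares a four-byte check with an eight-byte one. The `startsWith` helper is hypothetical. The bytes `00 00 01 00` are the standard ICO header (reserved field 0, image type 1), used here on the assumption that the library's check is about that short.

```javascript
// Hypothetical helper: true when `data` begins with the given bytes.
function startsWith(data, signature) {
	return signature.every((byte, i) => data[i] === byte);
}

const icoSignature = [0x00, 0x00, 0x01, 0x00]; // ICO header: reserved = 0, type = 1
const pngSignature = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];

// Arbitrary binary data that is not an icon but happens to start with the
// same four bytes: a little-endian 32-bit integer with the value 65536.
const notAnIcon = new Uint8Array(16);
new DataView(notAnIcon.buffer).setUint32(0, 65536, true);

console.log(startsWith(notAnIcon, icoSignature)); // true (a false positive)
console.log(startsWith(notAnIcon, pngSignature)); // false
```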

Aug 16 '21 15:08 Borewit

> I did not immediately understand what you meant by unsafe signatures, but I get it now. They are the signatures beyond this point:
>
> https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L1179
>
> these are much more likely to be false positives.

Yep, apologies for the vagueness. I've updated the original issue with a link.

Would you be willing to accept a PR? Any preferred approach?

Aug 17 '21 09:08 rexxars

I had a look at the MPEG-1 detection based on your feedback.

https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L1181-L1189

> some of the unsafe file types do have more "safe" routes (like mpg).

I don't think there is an alternative detection for that present in the code.

I actually looked into making it safer. From the unofficial specs, it looks like the current filter is already too narrow (at least on the bytes it tests). It is currently too specific about the stream ID.
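For context, MPEG-1 streams mark their structures with a start code: the prefix `00 00 01` followed by one byte that identifies what comes next. The sketch below classifies that fourth byte using commonly documented values. The `describeStartCode` function is hypothetical and is not how file-type performs its check.

```javascript
// Hypothetical helper, not file-type's check: name the MPEG start code at
// the beginning of `data`, or return undefined if the prefix is absent.
function describeStartCode(data) {
	if (data.length < 4 || data[0] !== 0x00 || data[1] !== 0x00 || data[2] !== 0x01) {
		return undefined;
	}

	const id = data[3];
	if (id === 0xBA) return 'pack header';
	if (id === 0xBB) return 'system header';
	if (id === 0xB3) return 'video sequence header';
	if (id >= 0xC0 && id <= 0xDF) return 'audio stream';
	if (id >= 0xE0 && id <= 0xEF) return 'video stream';
	return 'other start code';
}

console.log(describeStartCode(Uint8Array.from([0x00, 0x00, 0x01, 0xBA]))); // 'pack header'
console.log(describeStartCode(Uint8Array.from([0x00, 0x00, 0x01, 0xE0]))); // 'video stream'
```

A check that accepts only one or two of these values rejects streams that begin with a different one, while a check on the three-byte prefix alone would match far more unrelated data. That may be the trade-off behind the remark above.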

> a) Be able to disable the unsafe types (fromBuffer(buf, {allowUnsafe: false}) or similar)

I think it makes sense to optionally exclude likely false positives (the so-called unsafe signatures). I would prefer includeUnsafe.

> b) Return an unsafe: true in the return value for the unsafe types

The risk of false positives also exists for the other signatures, and for some it is more likely to occur than for others. But it also makes sense to indicate when the likelihood of a false positive is significantly higher.
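A minimal sketch of how options a) and b) might look together, using a toy detector rather than file-type's real code. The `includeUnsafe` option name follows the preference stated above. The `detect` function, the rule table, and the `unsafe` property on the result are assumptions for illustration.

```javascript
// Toy rule table (assumption): a rule is marked unsafe when its signature
// is short enough to match unrelated data easily.
const rules = [
	{ext: 'png', bytes: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A], unsafe: false},
	{ext: 'ico', bytes: [0x00, 0x00, 0x01, 0x00], unsafe: true},
];

// Hypothetical detector. With includeUnsafe: false, unsafe rules are skipped
// (option a). Any match reports whether it came from an unsafe rule (option b).
function detect(data, {includeUnsafe = true} = {}) {
	for (const rule of rules) {
		if (rule.unsafe && !includeUnsafe) {
			continue;
		}

		if (rule.bytes.every((byte, i) => data[i] === byte)) {
			return {ext: rule.ext, unsafe: rule.unsafe};
		}
	}

	return undefined;
}

const data = Uint8Array.from([0x00, 0x00, 0x01, 0x00, 0x01, 0x00]);
console.log(detect(data)); // { ext: 'ico', unsafe: true }
console.log(detect(data, {includeUnsafe: false})); // undefined
```

Defaulting `includeUnsafe` to `true` in this sketch keeps the existing behavior for callers that pass no options.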

> c) Expose an array or similar that declares the unsafe types so we can exclude them on the consumer side

I am not a big fan of exposing potential outcomes as part of the API to begin with.

Additionally:

  • We could try these file type (signature) recognition
  • We could move the unsafe signatures further down, to give more precedence to safe, longer signatures.
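The second suggestion can be sketched as a sort over a rule table: safe rules first, and longer signatures before shorter ones within each group. The rule table and the `orderRules` function are hypothetical and are not taken from file-type.

```javascript
// Hypothetical rule table, deliberately listed in a poor order.
const rules = [
	{ext: 'ico', bytes: [0x00, 0x00, 0x01, 0x00], unsafe: true},
	{ext: 'gif', bytes: [0x47, 0x49, 0x46], unsafe: false},
	{ext: 'png', bytes: [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A], unsafe: false},
];

// Safe rules first; within each group, longer signatures first.
// Returns a sorted copy and leaves the input array untouched.
function orderRules(list) {
	return [...list].sort((a, b) =>
		(Number(a.unsafe) - Number(b.unsafe)) || (b.bytes.length - a.bytes.length));
}

console.log(orderRules(rules).map(rule => rule.ext)); // [ 'png', 'gif', 'ico' ]
```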

Aug 17 '21 10:08 Borewit

Related to unsafe types: https://github.com/Borewit/peek-readable/issues/356#issuecomment-902900156

pointing out that it is also guessing the file type here:

https://github.com/sindresorhus/file-type/blob/c037ba7ed901bd5efe57bbba707b607035265eaf/core.js#L176-L180

Aug 20 '21 19:08 Borewit