Incorrectly identifies some binary files as code or text

Open noway opened this issue 1 year ago • 3 comments

I've generated 2435 files that magika classifies as text or code, although manual inspection shows that they are binary.

Here is the repo with the generated files: https://github.com/noway/magika-binary-misclassification-as-text

I've used fuzzing. More info here.
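
For readers who want to reproduce the "manual inspection" step in bulk, here is a minimal sketch of a binary-vs-text heuristic. It is an assumption about how such a check could be done, not the procedure actually used for this report; `looks_binary`, `random_blob`, and the 30% threshold are hypothetical names and values chosen for illustration, and the sketch does not call magika at all.

```python
import random

# Bytes that commonly appear in plain ASCII text: tab, LF, CR, printable ASCII.
TEXT_BYTES = frozenset({9, 10, 13} | set(range(32, 127)))


def looks_binary(data: bytes, threshold: float = 0.30) -> bool:
    """Heuristic check (hypothetical helper, not part of magika).

    Treats the data as binary if it contains a NUL byte or if more than
    `threshold` of its bytes fall outside the usual ASCII text range.
    Non-ASCII UTF-8 text counts toward the non-text ratio, so this is
    only a rough filter.
    """
    if not data:
        return False
    if b"\x00" in data:
        return True
    non_text = sum(1 for byte in data if byte not in TEXT_BYTES)
    return non_text / len(data) > threshold


def random_blob(size: int, seed: int) -> bytes:
    """Generate a reproducible random byte stream, a stand-in for fuzzer output."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(size))


if __name__ == "__main__":
    print(looks_binary(b"def main():\n    return 0\n"))
    print(looks_binary(random_blob(4096, seed=0)))
```

Files that a heuristic like this flags as binary, but that the classifier labels as text or code, would be candidates for the kind of misclassification described here.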

Cheers!

noway, Feb 18 '24 05:02

The .bin file type is not listed among the supported file types.

devilkadabra69, Feb 19 '24 08:02

@devilkadabra69 magika should classify those files as Unknown binary data (unknown). It's in the table you've provided, see index 100.

noway, Feb 19 '24 23:02

Yep, a random stream of bytes should fall under unknown. Thanks @noway for reporting and for the fuzzing, interesting stuff. I've marked this issue with the proper tags so that we remember it when we plan the next iterations. Thanks!

reyammer, Feb 20 '24 10:02