Tar archive containing a PDF is detected as a PDF
I know this library is unmaintained, but opening this for a future maintainer :)
this file: test.tar.zip (zipped to prevent github complaining)
file --mime-type:
application/x-tar
tmagic:
application/pdf
@phiresky This is intentional. Quote from README:
Unlike the typical approach that libmagic and file(1) uses, this loads all the file types in a tree based on subclasses. (EX: application/vnd.openxmlformats-officedocument.wordprocessingml.document (MS Office 2007) subclasses application/zip which subclasses application/octet-stream) Then, instead of checking the file against every file type, it can traverse down the tree and only check the file types that make sense to check. (After all, the fastest check is the check that never gets run.)
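To make the README's description concrete, here is a minimal sketch of that kind of subclass-tree traversal. It is written in Python rather than the crate's Rust, and it is not tree_magic's actual code: `Node`, `detect`, `detect_all`, the sibling order, and the simplified match rules (including the assumed 1024-byte search window for `%PDF-`) are all hypothetical. The only format fact relied on is that a ustar header carries the string `ustar` at byte offset 257.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One MIME type in a hypothetical subclass tree."""
    mime: str
    matches: Callable[[bytes], bool]
    children: List["Node"] = field(default_factory=list)


def detect(node: Node, data: bytes) -> Optional[str]:
    """Depth-first search that returns the first most-specific match."""
    if not node.matches(data):
        return None  # pruned: none of this node's children are checked
    for child in node.children:
        found = detect(child, data)
        if found is not None:
            return found  # the first matching sibling wins
    return node.mime


def detect_all(node: Node, data: bytes) -> List[str]:
    """Like detect, but collects the most-specific match of every branch."""
    if not node.matches(data):
        return []
    found = [m for child in node.children for m in detect_all(child, data)]
    return found or [node.mime]


# Toy rules (assumptions, not the real MIME database).
PDF = Node("application/pdf", lambda d: b"%PDF-" in d[:1024])
TAR = Node("application/x-tar", lambda d: d[257:262] == b"ustar")
DOCX = Node(
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    lambda d: b"word/" in d,
)
ZIP = Node("application/zip", lambda d: d[:4] == b"PK\x03\x04", [DOCX])
TREE = Node("application/octet-stream", lambda d: True, [PDF, TAR, ZIP])

# Fake tar: a 512-byte header with "ustar" at offset 257, then a PDF payload.
header = bytearray(512)
header[257:262] = b"ustar"
sample = bytes(header) + b"%PDF-1.4\n"

print(detect(TREE, sample))
print(detect_all(TREE, sample))
```

With the PDF node listed before the tar node, `detect` prints `application/pdf` for the fake tar, because both sibling rules match and the traversal stops at the first one. Listing the tar node first would give `application/x-tar` instead, and `detect_all` reports both candidates. Whether tree_magic orders or prioritizes siblings this way is an assumption of the sketch.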
tbh I don't see how that explains the misdetection. Why would traversing a tree lead to a wrong result? Even if the answer is ambiguous, it could output either the more likely type or all possibilities.
The README also says tree_magic is designed to be more efficient and to have fewer false positives than the older approach used by libmagic.