
Undetected PDF that can still be opened in a PDF reader

Open simmisj opened this issue 5 years ago • 0 comments

Hi. I recently received a PDF document that was not corrupt and opened fine in a PDF reader, but Mime-Detective did not detect it as a PDF. The PDF standard says that a document should start with the magic number followed by a version number. See 'Technical overview - File structure' here: https://en.wikipedia.org/wiki/PDF

The document I received instead started with a newline and the characters òÀ, followed by the magic number and version number. You can reproduce this by taking any working PDF document and adding those characters to the beginning of the file in a text editor. Setting the PDF type's offset to 4 makes Mime-Detective detect the file as a PDF, since that skips the added gibberish.

My question is: since PDF readers can safely open such documents, shouldn't Mime-Detective detect them as valid PDFs? The problem seems to be in the GetFileMatchingCount method of the MimeTypes class. It expects the header to be the first thing it sees and breaks out immediately if it is not.

Cheers!
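
To make the suggestion concrete, here is a minimal Python sketch of a more lenient check. It is not Mime-Detective's code (the library is C#), and the names `find_pdf_header`, `looks_like_pdf` and `SCAN_WINDOW` are hypothetical. Instead of requiring the magic number at offset 0, it searches for `%PDF-` anywhere in a leading window of the file. The 1024-byte window is an assumption: as far as I know it mirrors what Adobe's viewers are documented to tolerate, but other readers may behave differently. The byte values used for òÀ (0xF2, 0xC0) and the CRLF newline are also guesses, chosen so that the junk prefix is 4 bytes long, matching the offset mentioned above.

```python
# Sketch only: a lenient PDF header check that tolerates leading junk.
# Assumption: the header may appear anywhere in the first 1024 bytes.

PDF_MAGIC = b"%PDF-"
SCAN_WINDOW = 1024  # assumed tolerance, not taken from Mime-Detective


def find_pdf_header(data: bytes, window: int = SCAN_WINDOW) -> int:
    """Return the offset of the PDF magic number within the first
    `window` bytes of `data`, or -1 if it does not occur there."""
    return data[:window].find(PDF_MAGIC)


def looks_like_pdf(data: bytes, window: int = SCAN_WINDOW) -> bool:
    """True if the magic number is found inside the scan window."""
    return find_pdf_header(data, window) != -1


if __name__ == "__main__":
    strict = b"%PDF-1.4\n% rest of the file"
    # Hypothetical reconstruction of the problem file: CRLF plus two
    # stray bytes (guessed values) in front of a normal header.
    padded = b"\r\n\xf2\xc0" + strict

    print(find_pdf_header(strict))  # 0
    print(find_pdf_header(padded))  # 4
    print(looks_like_pdf(b"plain text, no header"))  # False
```

The trade-off is that a windowed search is looser than a fixed-offset match, so a non-PDF file that happens to contain `%PDF-` near its start would also match. Keeping the window small limits that risk.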

simmisj, Dec 12 '19 14:12