puremagic icon indicating copy to clipboard operation
puremagic copied to clipboard

imghdr matches in PureMagic?

Open NebularNerd opened this issue 9 months ago • 4 comments

I've re-titled this as it got a bit off-track from the original purpose of the issue.

@cclauss wanted to know if PureMagic could provide a 1:1 replacement of imghdr, to avoid crosstalk I've moved the original content to #69

NebularNerd avatar May 11 '24 13:05 NebularNerd

Did imghdr have tests we should attempt to run here?

cclauss avatar May 11 '24 14:05 cclauss

Looking at the imghdr source it runs a very primitive test, and can only match 0x01da / b'\001\332'.

https://github.com/python/cpython/blob/af8db2b6817e1ae25cbee98e67d1f4bac9d6af9c/Lib/imghdr.py#L73-L76

The current magic in PureMagic would not pick this up as it currently has a longer variant specific match, adding a short match like above is fine, but it can lead to low confidence scores and potential clashes if another file uses the same header. The best approach for good match scores and a definitive solid match is to use the longest header and multi-match possible.

I'll likely run up a PR using the same conventions as I did for PCX, this issue is more about a thought about naming conventions and potential better presentation of the results.

Also, as an aside, SGI would make a great example of a rule-based detection as the header is quite complicated with about 6 unique fixed fingerprints.

NebularNerd avatar May 11 '24 14:05 NebularNerd

I meant all the tests in that file for all the format, not just rgb.

cclauss avatar May 11 '24 14:05 cclauss

Looking at the source again, a quick run-down:

  • .jpg ✅
  • .png ✅
  • .gif ✅
  • .tif/.tiff ✅
  • .rgb ❌ As discussed above
  • .pbm ✅
  • .pgm ✅
  • Sun Raster ✅
  • X Bitmap ❌
  • .bmp ✅
  • .webp ✅
  • OpenEXR ✅

X Bitmap looks to not be present, SGI we know needs some love, the other formats are supported to at least the same degree the imghdr uses (there are a couple that need tidying and testing). If I do a PR then it would include the missing formats, plus improvements to the existing ones where I think PureMagic can now do a better job.

NebularNerd avatar May 11 '24 14:05 NebularNerd

I'll close this as my PR #75 was merged but it did not close this.

NebularNerd avatar Jul 03 '24 14:07 NebularNerd