Another Good Image Hash Method

Open mrlimbic opened this issue 3 years ago • 4 comments

Just thought you might also like this image hash method paper.

"AN IMAGE SIGNATURE FOR ANY KIND OF IMAGE" by H. Chi Wong, Marshall Bern, and David Goldberg

http://www.cs.cmu.edu/~hcwong/Pdfs/icip02.ps

mrlimbic avatar May 02 '22 13:05 mrlimbic

The file should be a PDF according to the website, but it is a “.ps” file. Are you sure it's clean? In a hex editor, you can see that it contains some kind of strange commands/codes.

BigPanda97 avatar May 02 '22 14:05 BigPanda97

It's in a folder with other PDFs, but yes, that particular file is PostScript. On macOS, Preview converts it to a PDF when you open it. I think it's legit, as it's hosted on an academic site.

I came across it because it is referenced in a Python image matching tool as the "goldberg" hash. Apparently it works very well.

https://github.com/ProvenanceLabs/image-match/blob/master/image_match/goldberg.py

I'm interested because I want to match frames in video. In my tests so far, most image hash methods don't work well for this: one video frame is usually so similar to the previous and next frames that matches happen far too often.

One of the things that algorithm does is apply a moiré filter before hashing. That helps if you need to distinguish very similar images, as I do: a slight movement shouldn't match as strongly.

mrlimbic avatar May 02 '22 15:05 mrlimbic

Have you already seen https://github.com/facebook/ThreatExchange? It was developed by Facebook and contains an image hashing algorithm (PDQ) and a video hashing algorithm (TMK), and they plan to add another way of hashing videos (vPDQ) soon.

Explained in detail here: https://github.com/facebook/ThreatExchange/blob/main/hashing/hashing.pdf

BigPanda97 avatar May 02 '22 18:05 BigPanda97

I'll test how noisy PDQ is compared to other hashes. My simple "hash noise" test just compares the hash of the current frame to the hash of the previous frame.
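A minimal sketch of that kind of frame-to-frame "hash noise" test, for illustration only. It assumes frames are already decoded into small grayscale grids, and it uses a toy average hash as a stand-in; it is not PDQ and not the JImageHash perceptive hash. The names `average_hash`, `hamming` and `hash_noise` are hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale grid, rows of pixel values in 0..255.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> int:
    """Toy stand-in hash: one bit per pixel, set when the pixel is above the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def hash_noise(frames: Sequence[Frame]) -> List[int]:
    """Hamming distance between each frame's hash and the previous frame's hash."""
    hashes = [average_hash(f) for f in frames]
    return [hamming(prev, cur) for prev, cur in zip(hashes, hashes[1:])]


# Synthetic 2x2 "frames": b is a slightly changed copy of a, c simulates a cut.
a = [[10, 200], [10, 200]]
b = [[12, 198], [11, 201]]
c = [[200, 10], [200, 10]]
print(hash_noise([a, b, c]))  # [0, 4]
```

With a real hash on real video, the distances between consecutive frames within a shot are typically small but nonzero, and that spread is the "noise" being measured; swapping `average_hash` for another hash function lets the same loop compare algorithms.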

You can see how noisy the perceptive hash from this JImageHash library is. The genuinely different frames stand out (where there is a cut to a different shot), but scene detection is not enough for me. I need much less noise between frames that are similar but not identical.

https://drive.google.com/file/d/1TBwNw1Ymh_iRSTsRAOBz9BSv8VnL8BWm/view?usp=sharing

mrlimbic avatar May 02 '22 19:05 mrlimbic