scala-phash
scala-phash copied to clipboard
Image comparison by hash codes
Scala pHash
Scala fork of pHash library. This library identifies whether images are similar. You can try it at demo page.
Original pHash uses CImg library for image processing but I could not find CImg for jvm. Therefore I use java.awt
and self-made functions for image processing. Consequently, results of my library is different from original phash.
How to use
My library implements three Perceptual Hashing algorithms: Radial Hash, DCT hash and Marr hash. More info about it.
sbt dependencies
libraryDependencies += "com.github.poslegm" %% "scala-phash" % "1.2.2"
API
There is three functions for each hashing algorithm. Let's consider them by example of DCT hash:
-
def dctHash(image: BufferedImage): Either[Throwable, DCTHash]
― compute image's hash; -
def unsafeDctHash(image: BufferedImage): DCTHash
― compute image's hash unsafely (danger of exception); -
def dctHashDistance(hash1: DCTHash, hash2: DCTHash): Long
― compare hashes of two images.
Similar functions written for Marr and Radial Hash algorithms.
All public api with scaladocs decsribed in object PHash
.
Example
import scalaphash.PHash._
import javax.imageio.ImageIO
val img1 = ImageIO.read(new File("img1.jpg"))
val img2 = ImageIO.read(new File("img2.jpg"))
val radialDistance: Either[Throwable, Double] = for {
img1rad <- radialHash(img1)
img2rad <- radialHash(img2)
} yield radialHashDistance(img1rad, img2rad)
radialDistance.foreach {
case distance if distance > 0.95 => println("similar")
case _ => println("not similar")
}
radialDistance.left.foreach(e => println(e.getMessage))
Radial distance is more when images are similar. DCT and Marr distances are less when images are similar. Recommended to make a decision on image similarity when at least two hashes pass thresholds.


radial: 0.9508017124330319
dct: 13
marr: 0.5052083333333334


radial: 0.3996241672331173
dct: 41
marr: 0.4704861111111111
Warning
My results is not compatible with original pHash. Use original library if you have an opportunity.
Also, it works much slower than c++ version (about 5-7 times).