dhash-vips icon indicating copy to clipboard operation
dhash-vips copied to clipboard

Comparison quality loss, probably due to libvips update

Open Nakilon opened this issue 2 years ago • 12 comments

Hello @jcupitt. I believe that with some of the libvips updates my gem has lost a comparison quality a bit but it was tedious to install different versions of the library (and I don't really have a machine right now to make it compile things a lot) so I didn't figure out when exactly it happened or if the quality would return if I switch back. But I decided to inform you anyway. (I'll switch this Issue to Discussions tab later)

It may seem to be difficult to reproduce (and partially that's why I'm noting this only after a year or so) but I suppose that somewhere between vips-8.9.1 and vips-8.11.3 there was a change/tweak in the interpolation algorithm IIRC and it made my perceptual fingerprints worse. Specifically for the DHash in this line https://github.com/Nakilon/dhash-vips/commit/b6e0945b44d47b46b7df3a4b9e8cd479b813650d#diff-a7bce69e0d9f02a55d36f1b84429134d052557c4aaaeaf5caa6a0ca73d72c437R8 it means that two images had the difference 2 between them but became as close as 0, and another pair is just some b/w filter that was making the distance of 4 but now it's 6. For the IDHash "the most distant among the least distant" image pairs became of the same distance as "the least distant among the most distant" image pairs https://github.com/Nakilon/dhash-vips/commit/b6e0945b44d47b46b7df3a4b9e8cd479b813650d#diff-a7bce69e0d9f02a55d36f1b84429134d052557c4aaaeaf5caa6a0ca73d72c437R46 and so the test reveals that it became harder to find a good threshold and so IDHash has almost reached the Dhash gem quality: https://github.com/Nakilon/dhash-vips/commit/b6e0945b44d47b46b7df3a4b9e8cd479b813650d#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R152

Nakilon avatar Mar 01 '22 18:03 Nakilon

Oh, I forgot that also have a Github Action here that runs the benchmark, so it reveals the difference too, between my macOS:

ruby 2.3.8p459 (2018-10-18 revision 65136) [x86_64-darwin18]
vips-8.11.3-Wed Aug 11 09:29:27 UTC 2021
Version: ImageMagick 6.9.12-23 Q16 x86_64 2021-09-18 https://imagemagick.org
Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
                                          
            Fingerprint  Compare  1/FMI^2 
  Phamilie        4.473    0.674    4.000 
     Dhash        5.142    1.162    1.375 
     DHash        0.410    1.090    1.778 
    IDHash        0.342    0.150    1.306 

and the server Linux:

ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux]
Version: 8.7.4-1+deb10u1
Version: 8:6.9.10.23+dfsg-2.1+deb10u1
model name	: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz

            Fingerprint  Compare  1/FMI^2 
  Phamilie       19.115    0.960    4.000 
     Dhash        4.658    1.408    1.306 
     DHash        0.519    1.429    1.556 
    IDHash        0.506    0.258    1.250 

(see the last column, lower numbers are the "better quality")

The libvips version is probably determined here that is just apt install libvips42 on ruby:2.7.2-slim: https://github.com/Nakilon/dhash-vips/blob/master/.github/workflows/benchmark.yaml#L10

Nakilon avatar Mar 01 '22 18:03 Nakilon

Hi @Nakilon,

There was a change in 8.10 to improve compatibility -- previously, resize shifted the image by 0.5 pixels compared to imagemagick and this made a lot of people complain. We made centre-based sampling the default and removed the centre option. It might have changed rounding behaviour too, I forget.

Could that be it?

jcupitt avatar Mar 02 '22 10:03 jcupitt

These tickets? https://github.com/libvips/libvips/pull/1592 https://github.com/libvips/libvips/pull/1769 https://github.com/libvips/libvips/pull/1872 They sound like it was a good change. I'm confused why it had a bad effect for me, hmm.

Nakilon avatar Mar 02 '22 15:03 Nakilon

It might be something else, that was just the change I thought of. Do you see better results if you undo those patches?

jcupitt avatar Mar 02 '22 18:03 jcupitt

It may also be caused by https://github.com/libvips/libvips/discussions/2047. If you have issues with that, you could disable the vector path with the VIPS_NOVECTOR=1 environment variable or you could use this in Ruby:

Vips.vector_set false

kleisauke avatar Apr 16 '22 12:04 kleisauke

After applying these patches (to fix various runtime errors):

Details
LoadError: /usr/lib64/gems/ruby/rmagick-2.16.0/RMagick2.so: undefined symbol: R__Bool_to_C__Bool - /usr/lib64/gems/ruby/rmagick-2.16.0/RMagick2.so
--- a/Gemfile
+++ b/Gemfile
@@ -1,7 +1,7 @@
 source "https://rubygems.org"
 
 gem "rake"
-gem "rmagick", "~>2.16"
+gem "rmagick"
 gem "dhash"
 gem "phamilie"
 gem "get_process_mem"
[DEPRECATION] requiring "RMagick" is deprecated. Use "rmagick" instead
NameError: uninitialized constant Magick::Rec601LumaColorspace
Did you mean?  Magick::Rec601YCbCrColorspace
--- a/usr/share/gems/gems/dhash-0.0.2/lib/dhash.rb
+++ b/usr/share/gems/gems/dhash-0.0.2/lib/dhash.rb
@@ -1,5 +1,5 @@
 require 'dhash/version'
-require 'RMagick'
+require 'rmagick'
 
 module Dhash extend self
   def hamming(a, b)
@@ -8,7 +8,7 @@ module Dhash extend self
 
   def calculate(file, hash_size = 8)
     image = Magick::Image.read(file).first
-    image = image.quantize(256, Magick::Rec601LumaColorspace, Magick::NoDitherMethod, 8)
+    image = image.quantize(256, Magick::GRAYColorspace, Magick::NoDitherMethod, 8)
     image = image.resize!(hash_size + 1, hash_size)
 
     difference = []

(according to https://imagemagick.org/script/porting.php#colorspace, Magick::Rec601LumaColorspace should be Magick::GRAYColorspace)

NoMethodError: undefined method `distance3_c' for DHashVips::IDHash:Module
Did you mean?  distance3
               distance
--- a/Rakefile
+++ b/Rakefile
@@ -263,7 +263,6 @@ task :compare_speed do
       [DHashVips::DHash, :hamming],
       [DHashVips::IDHash, :distance],
       [DHashVips::IDHash, :distance3_ruby],
-      [DHashVips::IDHash, :distance3_c],
       [DHashVips::IDHash, :distance, 4],
     ].zip(hashes) do |(m, dm, power, ii), hs|
       bm.report "#{m.is_a?(Module) ? m : m.class} #{dm} #{power}" do
@@ -328,7 +327,7 @@ task :benchmark do
     [phamilie, :distance, nil, filenames.flatten],
     [Dhash, :hamming],
     [DHashVips::DHash, :hamming],
-    [DHashVips::IDHash, :distance3_c],
+    [DHashVips::IDHash, :distance3],
   ].zip(hashes).map do |(m, dm, power, ii), hs|
     Benchmark.realtime do
       _ = ii || hs

(not sure about this one)

I see:

$ VIPS_NOVECTOR=1 rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.12.2-Tue Jan 25 09:34:32 UTC 2022
Version: ImageMagick 6.9.12-44 Q16 x86_64 17145 https://imagemagick.org
model name	: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
                                          
            Fingerprint  Compare  1/FMI^2 
  Phamilie        2.078    0.622    4.000 
     Dhash        2.878    0.900    1.250 
     DHash        0.259    0.906    1.633 
    IDHash        0.240    1.528    1.125 
                                          
(lower numbers are better)

And with PR https://github.com/libvips/libvips/pull/1769:

$ VIPS_NOVECTOR=1 rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.13.0
Version: ImageMagick 6.9.12-44 Q16 x86_64 17145 https://imagemagick.org
model name	: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
                                          
            Fingerprint  Compare  1/FMI^2 
  Phamilie        2.051    0.630    4.000 
     Dhash        2.792    0.893    1.250 
     DHash        0.264    0.908    1.556 
    IDHash        0.245    1.590    1.250 
                                          
(lower numbers are better)

I guess I should look at the last column of the "DHash" row? If so, it seems that the above PR produces a lower number (1.556) than with libvips 8.12.2 (1.633).

kleisauke avatar Apr 16 '22 14:04 kleisauke

@kleisauke thank you for diving in!

I never could figure out what to do with that rmagick colorscheme error. Now with newer rmagick gem version the dhash gem works better.

The distance3_c error means the gem didn't compile native extension due to lack of the needed Ruby source files. It falls back to Ruby implementation of the comparison function and should not affect the quality benchmark column.

Yes, the 1.556 1.250 are exactly what it was before. Though here I like the 1.633 1.125 more, because it means IDHash will work better than dhash and DHash, hehe.

Do I understand right that Vips.vector_set false disables some libvips improvement that is good according to other tests? If so then I suppose it either appeared to be worse for "resizing to 8x8" or my tests are not precise enough, i.e. need more test images, hmm.

Nakilon avatar Apr 17 '22 10:04 Nakilon

A bit more detailed reason why IDHash quality estimation number went up (with setting the libvips parameter) and dhash went down (with switching the colorscheme) is here: https://github.com/Nakilon/dhash-vips/commit/5810193b38a5f4d99b6ec2984f974f87926b616b#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L110-R111 that means that the "Similar images" and "Different images" distance ranges should not overlap too much because that make it unable to find the border between the two classes of distances. The overlap for IDHash went from 17..22 to 21..22

Nakilon avatar Apr 17 '22 11:04 Nakilon

I tried to port sharp's dHash function to this library: https://github.com/lovell/sharp/blob/985e881e7a764d9cb317938b36ec1f118808787b/test/fixtures/index.js#L12-L36 https://github.com/lovell/sharp/blob/985e881e7a764d9cb317938b36ec1f118808787b/src/operations.cc#L57-L93

See for example this patch:

Details
--- a/lib/dhash-vips.rb
+++ b/lib/dhash-vips.rb
@@ -72,6 +72,51 @@ module DHashVips
       ((a ^ b) & (a | b) >> 2 * size_a * size_a).to_s(2).count "1"
     end
 
+    def normalize image
+      # Get original colorspace
+      type_before_normalize = image.interpretation
+
+      # Need to tag generic RGB interpretation as sRGB before libvips 8.9.0
+      type_before_normalize = :srgb if type_before_normalize == :rgb
+
+      # Convert to LAB colorspace
+      lab = image.colourspace :lab
+
+      # Extract luminance
+      luminance = lab[0]
+
+      # Find luminance range
+      stats = luminance.stats
+
+      min = stats.getpoint(0, 0)[0]
+      max = stats.getpoint(1, 0)[0]
+
+      return image if min == max
+
+      # Extract chroma
+      chroma = lab.extract_band 1, n: 2
+
+      # Calculate multiplication factor and addition
+      f = 100.0 / (max - min)
+      a = -(min * f)
+
+      # Scale luminance, join to chroma, convert back to original colorspace
+      normalized = luminance.linear(f, a)
+        .bandjoin(chroma)
+        .colourspace(type_before_normalize)
+
+      # Attach original alpha channel, if any
+      if image.has_alpha?
+        # Extract original alpha channel
+        alpha = image[image.bands - 1]
+
+        # Join alpha channel to normalised image
+        normalized.bandjoin(alpha)
+      end
+
+      normalized
+    end
+
     @median = lambda do |array|
       h = array.size / 2
       return array[h] if array[h] != array[h - 1]
@@ -94,8 +139,19 @@ module DHashVips
 
     def fingerprint filename, power = 3
       size = 2 ** power
-      image = Vips::Image.new_from_file filename, access: :sequential
-      image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height)).colourspace("b-w").flatten
+      image = Vips::Image.thumbnail filename, size, height: size, size: :force
+
+      # Need to copy to memory, we have to stay seq
+      image = image.copy_memory
+
+      # Apply normalisation - stretch luminance to cover full dynamic range
+      image = normalize image
+
+      # Flatten it to mid-gray before generating a fingerprint
+      image = image.flatten(background: [128, 128, 128]) if image.has_alpha?
+
+      # Convert to greyscale, extract first band
+      image = image.colourspace(:b_w)[0]
 
       array = image.to_enum.map &:flatten
       d1, i1, d2, i2 = [array, array.transpose].flat_map do |a|

This thumbnails with shrink-on-load enabled and normalizes the image by stretching the luminance to cover full dynamic range. With that, I see:

$ rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.12.2-Tue Jan 25 09:34:32 UTC 2022
Version: ImageMagick 6.9.12-44 Q16 x86_64 17145 https://imagemagick.org
model name	: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
gem ruby-vips version 2.1.4
gem rmagick version 4.2.5
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
                                          
            Fingerprint  Compare  1/FMI^2 
  Phamilie        2.079    0.597    4.000 
     Dhash        2.875    0.893    1.250 
     DHash        0.268    0.899    1.633 
    IDHash        0.205    1.596    1.000 
                                          
(lower numbers are better)

So, I guess this is the best we could get? (Or perhaps I made a mistake somewhere?)

Do I understand right that Vips.vector_set false disables some libvips improvement that is good according to other tests?

It disables the vector path of reducev, which should produce better quality images at the expense of performance. Hopefully a future libvips version would fix this as well, as I'm currently in the process of porting the Pillow-SIMD resizer (which does not have this quality issue) to libvips.

kleisauke avatar Apr 17 '22 12:04 kleisauke

Wow, looks like the trick is in putting the flatten before converting to b_w. Maybe it's a bit slower but we already have some improvement by using the thumbnail method (I hesitated to use it earlier, saw no point, but looks like I totally missed the fact that it has the "shrink-on-load").

     def self.fingerprint filename, power = 3
       size = 2 ** power
-      image = Vips::Image.new_from_file filename, access: :sequential
-      image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height)).colourspace("b-w").flatten
-
+      image = Vips::Image.thumbnail(filename, size, height: size, size: :force).flatten.colourspace("b-w")[0]
       array = image.to_enum.map &:flatten
       d1, i1, d2, i2 = [array, array.transpose].flat_map do |a|
         d = a.zip(a.rotate(1)).flat_map{ |r1, r2| r1.zip(r2).map{ |i, j| i - j } }

It is in master.

Now probably the last thing to check would be that https://github.com/libvips/libvips/pull/1769 does not break the 1.000 score.

Nakilon avatar Apr 18 '22 12:04 Nakilon

Great! PR https://github.com/libvips/libvips/pull/1769 does break the 1.000 score of IDHash.

$ rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.13.0
Version: ImageMagick 6.9.12-45 Q16 x86_64 17184 https://imagemagick.org
model name	: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
gem ruby-vips version 2.1.4
gem rmagick version 4.2.5
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
                                          
            Fingerprint  Compare  1/FMI^2 
  Phamilie        2.070    0.690    4.000 
     Dhash        3.997    0.955    1.250 
     DHash        0.181    0.934    1.250 
    IDHash        0.154    2.341    1.125 
                                          
(lower numbers are better)

(although, the DHash quality seems to have improved)

I'm not sure why this is happening. That PR seems like a good overall improvement, there are some unit tests added in that PR that would otherwise fail: https://github.com/libvips/libvips/blob/a301a5c9b774421ec6a1c01de223964284a2643c/test/test-suite/test_resample.py#L118-L137

For example, resizing a 1600x1000 image to 10 pixels wide now generates a 10x6 image.

$ vips black x.jpg 1600 1000
$ vipsthumbnail x.jpg -s 10x -o vips.jpg
$ convert x.jpg -thumbnail 10x magick.jpg
$ vipsheader vips.jpg magick.jpg
vips.jpg: 10x6 uchar, 1 band, b-w, jpegload
magick.jpg: 10x6 uchar, 1 band, b-w, jpegload

Previous this would result in a 10x7 image.

kleisauke avatar Apr 19 '22 09:04 kleisauke

Increasing the reducing gap here to 3.0: https://github.com/kleisauke/libvips/blob/1a219366a0a667f48510afbd2d622ad4279268b4/libvips/resample/resize.c#L160-L163 (which would ensure that always the final 300% is done by vips_reduce{h,v})

Produces:

$ rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.13.0
Version: ImageMagick 6.9.12-45 Q16 x86_64 17184 https://imagemagick.org
model name	: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
gem ruby-vips version 2.1.4
gem rmagick version 4.2.5
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
                                          
            Fingerprint  Compare  1/FMI^2 
  Phamilie        2.056    0.607    4.000 
     Dhash        3.116    0.923    1.250 
     DHash        0.177    0.926    1.250 
    IDHash        0.151    2.199    1.000 
                                          
(lower numbers are better)

This behavior is also confirmed by Pillow, on which this logic is somewhat based. https://pillow.readthedocs.io/en/stable/releasenotes/7.0.0.html#new-argument-reducing-gap-for-image-resize-and-image-thumbnail-methods

With reducing_gap greater or equal to 3.0, the result is indistinguishable from fair resampling.

kleisauke avatar Apr 21 '22 11:04 kleisauke

@kleisauke I just realised I didn't know that .flatten is dumb -- I thought it really checks for alpha. Only spotted it now in another project where (for arrayjoin) I supposed that flattening all the images would make them have the same bands count... Now I use your ifs, gladly the IDHash didn't get worse, only DHash for some reason.

(meanwhile I also decided to increment the third gem semver component when the change affects the numbers)

Nakilon avatar Nov 11 '22 06:11 Nakilon

only DHash for some reason.

Oh, curious. I assume that was this change?: https://github.com/Nakilon/dhash-vips/commit/9e38ae98d63cf35c83230b5916b44cfd370d243e#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R26

Which libvips version was that tested it on? The last time I tried this, the DHash score decreased to 1.250, but for some reason IDHash score increased to 1.125, when not using a gap of 3.0.

Now that I think about this further, I think shrink-on-load could possibly(?) affect that score as well, since it uses a block shrink for the first resize step. Although, at least a factor of two would still be done by the final resize step.

kleisauke avatar Nov 11 '22 13:11 kleisauke

Which libvips version was that tested it on?

$ vips -v
vips-8.11.3-Wed Aug 11 09:29:27 UTC 2021

Nakilon avatar Nov 11 '22 13:11 Nakilon

Ah, that version does not include the improvements landed via PR https://github.com/libvips/libvips/pull/1769, available since libvips 8.13.

So, it seems that flattening only alpha images increased the DHash score from 1.778 to 1.796, see for example your first comment: https://github.com/Nakilon/dhash-vips/issues/16#issuecomment-1055753274 (which was also tested with libvips 8.11.3).

FWIW, here's the updated patch that I used to produces the scores mentioned in https://github.com/Nakilon/dhash-vips/issues/16#issuecomment-1105081861.

Details
--- a/lib/dhash-vips.rb
+++ b/lib/dhash-vips.rb
@@ -12,11 +12,12 @@ module DHashVips
 
     def pixelate input, hash_size
       image = if input.is_a? Vips::Image
-        input.thumbnail_image(hash_size + 1, height: hash_size, size: :force)
+        input
       else
-        Vips::Image.thumbnail(input, hash_size + 1, height: hash_size, size: :force)
+        Vips::Image.new_from_file input, access: :sequential
       end
-      (image.has_alpha? ? image.flatten : image).colourspace("b-w")[0]
+      image = image.resize((hash_size + 1).fdiv(image.width), vscale: hash_size.fdiv(image.height), gap: 3.0)
+      image.flatten.colourspace("b-w")[0]
     end
 
     def calculate file, hash_size = 8
@@ -94,11 +95,13 @@ module DHashVips
     def self.fingerprint input, power = 3
       size = 2 ** power
       image = if input.is_a? Vips::Image
-        input.thumbnail_image(size, height: size, size: :force)
+        input
       else
-        Vips::Image.thumbnail(input, size, height: size, size: :force)
+        Vips::Image.new_from_file input, access: :sequential
       end
-      array = (image.has_alpha? ? image.flatten : image).flatten.colourspace("b-w")[0].to_enum.map &:flatten
+      image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height), gap: 3.0)
+      image = image.flatten.colourspace("b-w")[0]
+      array = image.to_enum.map &:flatten
       d1, i1, d2, i2 = [array, array.transpose].flat_map do |a|
         d = a.zip(a.rotate(1)).flat_map{ |r1, r2| r1.zip(r2).map{ |i, j| i - j } }
         m = median d.map(&:abs).sort

So, that forcefully disables shrink-on-load, flattens on every image and uses a reducing gap of 3.0. Anything other than that would increase the IDHash/DHash score, which is curious. Here are the various things I tried w/ libvips 8.13.3:

  • Remove that gap argument (as that is only available in libvips >= 8.13) - causes the default value of 2.0 to be used;
  • Only do the flattening on alpha images (which is good, but somehow increases the score);
  • Re-enable shrink-on-load by using Vips::Image.thumbnail instead (should not make much difference in theory).

kleisauke avatar Nov 13 '22 14:11 kleisauke

Sigh... now in another project while using the gem for testing I discovered that it produced the hash 0xffff...0000.... for the image https://storage.yandexcloud.net/gems.nakilon.pro/dhash-vips/alpha_only.png

It appears that I misunderstood the .flatten even more, because it wasn't supposed to just cut the alpha channel off but in fact was converting the image to just zeros. So I used the composite+:screen:

https://github.com/Nakilon/dhash-vips/commit/354daf197846cffb53cdaf6d07470dae9bea85e6#diff-06bfa1add88aa00af234039a463c42fdd10e629d95926ed4d091b46d9696adaaR103

I hope I understand it correctly this time? ..D

I also added an explicit test for this image to avoid such bug in future.

I've also added one more test image pair and now the score is at 1.111 on 8.7.4 and 8.10.5 (ruby:slim docker). I'm okay with that it's not 1.0, I'll close the issue for now and release 0.2.2.0 since I guess we've localised the things, and several bugs were fixed since then, but you should be able to comment if have any more related updates or ideas. Or if I messed up the alpha channel thing again ) though I doubt now.

Nakilon avatar Dec 08 '22 13:12 Nakilon