dhash-vips
dhash-vips copied to clipboard
Comparison quality loss, probably due to libvips update
Hello @jcupitt. I believe that with some of the libvips updates my gem has lost a comparison quality a bit but it was tedious to install different versions of the library (and I don't really have a machine right now to make it compile things a lot) so I didn't figure out when exactly it happened or if the quality would return if I switch back. But I decided to inform you anyway. (I'll switch this Issue to Discussions tab later)
It may seem to be difficult to reproduce (and partially that's why I'm noting this only after a year or so) but I suppose that somewhere between vips-8.9.1 and vips-8.11.3 there was a change/tweak in the interpolation algorithm IIRC and it made my perceptual fingerprints worse. Specifically for the DHash in this line https://github.com/Nakilon/dhash-vips/commit/b6e0945b44d47b46b7df3a4b9e8cd479b813650d#diff-a7bce69e0d9f02a55d36f1b84429134d052557c4aaaeaf5caa6a0ca73d72c437R8 it means that two images had the difference 2 between them but became as close as 0, and another pair is just some b/w filter that was making the distance of 4 but now it's 6. For the IDHash "the most distant among the least distant" image pairs became of the same distance as "the least distant among the most distant" image pairs https://github.com/Nakilon/dhash-vips/commit/b6e0945b44d47b46b7df3a4b9e8cd479b813650d#diff-a7bce69e0d9f02a55d36f1b84429134d052557c4aaaeaf5caa6a0ca73d72c437R46 and so the test reveals that it became harder to find a good threshold and so IDHash has almost reached the Dhash gem quality: https://github.com/Nakilon/dhash-vips/commit/b6e0945b44d47b46b7df3a4b9e8cd479b813650d#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R152
Oh, I forgot that also have a Github Action here that runs the benchmark, so it reveals the difference too, between my macOS:
ruby 2.3.8p459 (2018-10-18 revision 65136) [x86_64-darwin18]
vips-8.11.3-Wed Aug 11 09:29:27 UTC 2021
Version: ImageMagick 6.9.12-23 Q16 x86_64 2021-09-18 https://imagemagick.org
Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
Fingerprint Compare 1/FMI^2
Phamilie 4.473 0.674 4.000
Dhash 5.142 1.162 1.375
DHash 0.410 1.090 1.778
IDHash 0.342 0.150 1.306
and the server Linux:
ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux]
Version: 8.7.4-1+deb10u1
Version: 8:6.9.10.23+dfsg-2.1+deb10u1
model name : Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Fingerprint Compare 1/FMI^2
Phamilie 19.115 0.960 4.000
Dhash 4.658 1.408 1.306
DHash 0.519 1.429 1.556
IDHash 0.506 0.258 1.250
(see the last column, lower numbers are the "better quality")
The libvips version is probably determined here that is just apt install libvips42
on ruby:2.7.2-slim
: https://github.com/Nakilon/dhash-vips/blob/master/.github/workflows/benchmark.yaml#L10
Hi @Nakilon,
There was a change in 8.10 to improve compatibility -- previously, resize shifted the image by 0.5 pixels compared to imagemagick and this made a lot of people complain. We made centre-based sampling the default and removed the centre
option. It might have changed rounding behaviour too, I forget.
Could that be it?
These tickets? https://github.com/libvips/libvips/pull/1592 https://github.com/libvips/libvips/pull/1769 https://github.com/libvips/libvips/pull/1872 They sound like it was a good change. I'm confused why it had a bad effect for me, hmm.
It might be something else, that was just the change I thought of. Do you see better results if you undo those patches?
It may also be caused by https://github.com/libvips/libvips/discussions/2047. If you have issues with that, you could disable the vector path with the VIPS_NOVECTOR=1
environment variable or you could use this in Ruby:
Vips.vector_set false
After applying these patches (to fix various runtime errors):
Details
LoadError: /usr/lib64/gems/ruby/rmagick-2.16.0/RMagick2.so: undefined symbol: R__Bool_to_C__Bool - /usr/lib64/gems/ruby/rmagick-2.16.0/RMagick2.so
--- a/Gemfile
+++ b/Gemfile
@@ -1,7 +1,7 @@
source "https://rubygems.org"
gem "rake"
-gem "rmagick", "~>2.16"
+gem "rmagick"
gem "dhash"
gem "phamilie"
gem "get_process_mem"
[DEPRECATION] requiring "RMagick" is deprecated. Use "rmagick" instead
NameError: uninitialized constant Magick::Rec601LumaColorspace
Did you mean? Magick::Rec601YCbCrColorspace
--- a/usr/share/gems/gems/dhash-0.0.2/lib/dhash.rb
+++ b/usr/share/gems/gems/dhash-0.0.2/lib/dhash.rb
@@ -1,5 +1,5 @@
require 'dhash/version'
-require 'RMagick'
+require 'rmagick'
module Dhash extend self
def hamming(a, b)
@@ -8,7 +8,7 @@ module Dhash extend self
def calculate(file, hash_size = 8)
image = Magick::Image.read(file).first
- image = image.quantize(256, Magick::Rec601LumaColorspace, Magick::NoDitherMethod, 8)
+ image = image.quantize(256, Magick::GRAYColorspace, Magick::NoDitherMethod, 8)
image = image.resize!(hash_size + 1, hash_size)
difference = []
(according to https://imagemagick.org/script/porting.php#colorspace, Magick::Rec601LumaColorspace
should be Magick::GRAYColorspace
)
NoMethodError: undefined method `distance3_c' for DHashVips::IDHash:Module
Did you mean? distance3
distance
--- a/Rakefile
+++ b/Rakefile
@@ -263,7 +263,6 @@ task :compare_speed do
[DHashVips::DHash, :hamming],
[DHashVips::IDHash, :distance],
[DHashVips::IDHash, :distance3_ruby],
- [DHashVips::IDHash, :distance3_c],
[DHashVips::IDHash, :distance, 4],
].zip(hashes) do |(m, dm, power, ii), hs|
bm.report "#{m.is_a?(Module) ? m : m.class} #{dm} #{power}" do
@@ -328,7 +327,7 @@ task :benchmark do
[phamilie, :distance, nil, filenames.flatten],
[Dhash, :hamming],
[DHashVips::DHash, :hamming],
- [DHashVips::IDHash, :distance3_c],
+ [DHashVips::IDHash, :distance3],
].zip(hashes).map do |(m, dm, power, ii), hs|
Benchmark.realtime do
_ = ii || hs
(not sure about this one)
I see:
$ VIPS_NOVECTOR=1 rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.12.2-Tue Jan 25 09:34:32 UTC 2022
Version: ImageMagick 6.9.12-44 Q16 x86_64 17145 https://imagemagick.org
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
Fingerprint Compare 1/FMI^2
Phamilie 2.078 0.622 4.000
Dhash 2.878 0.900 1.250
DHash 0.259 0.906 1.633
IDHash 0.240 1.528 1.125
(lower numbers are better)
And with PR https://github.com/libvips/libvips/pull/1769:
$ VIPS_NOVECTOR=1 rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.13.0
Version: ImageMagick 6.9.12-44 Q16 x86_64 17145 https://imagemagick.org
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
Fingerprint Compare 1/FMI^2
Phamilie 2.051 0.630 4.000
Dhash 2.792 0.893 1.250
DHash 0.264 0.908 1.556
IDHash 0.245 1.590 1.250
(lower numbers are better)
I guess I should look at the last column of the "DHash" row? If so, it seems that the above PR produces a lower number (1.556
) than with libvips 8.12.2 (1.633
).
@kleisauke thank you for diving in!
I never could figure out what to do with that rmagick colorscheme error. Now with newer rmagick gem version the dhash gem works better.
The distance3_c
error means the gem didn't compile native extension due to lack of the needed Ruby source files. It falls back to Ruby implementation of the comparison function and should not affect the quality benchmark column.
Yes, the 1.556 1.250
are exactly what it was before. Though here I like the 1.633 1.125
more, because it means IDHash will work better than dhash and DHash, hehe.
Do I understand right that Vips.vector_set false
disables some libvips improvement that is good according to other tests? If so then I suppose it either appeared to be worse for "resizing to 8x8" or my tests are not precise enough, i.e. need more test images, hmm.
A bit more detailed reason why IDHash quality estimation number went up (with setting the libvips parameter) and dhash went down (with switching the colorscheme) is here: https://github.com/Nakilon/dhash-vips/commit/5810193b38a5f4d99b6ec2984f974f87926b616b#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L110-R111 that means that the "Similar images" and "Different images" distance ranges should not overlap too much because that make it unable to find the border between the two classes of distances. The overlap for IDHash went from 17..22 to 21..22
I tried to port sharp's dHash function to this library: https://github.com/lovell/sharp/blob/985e881e7a764d9cb317938b36ec1f118808787b/test/fixtures/index.js#L12-L36 https://github.com/lovell/sharp/blob/985e881e7a764d9cb317938b36ec1f118808787b/src/operations.cc#L57-L93
See for example this patch:
Details
--- a/lib/dhash-vips.rb
+++ b/lib/dhash-vips.rb
@@ -72,6 +72,51 @@ module DHashVips
((a ^ b) & (a | b) >> 2 * size_a * size_a).to_s(2).count "1"
end
+ def normalize image
+ # Get original colorspace
+ type_before_normalize = image.interpretation
+
+ # Need to tag generic RGB interpretation as sRGB before libvips 8.9.0
+ type_before_normalize = :srgb if type_before_normalize == :rgb
+
+ # Convert to LAB colorspace
+ lab = image.colourspace :lab
+
+ # Extract luminance
+ luminance = lab[0]
+
+ # Find luminance range
+ stats = luminance.stats
+
+ min = stats.getpoint(0, 0)[0]
+ max = stats.getpoint(1, 0)[0]
+
+ return image if min == max
+
+ # Extract chroma
+ chroma = lab.extract_band 1, n: 2
+
+ # Calculate multiplication factor and addition
+ f = 100.0 / (max - min)
+ a = -(min * f)
+
+ # Scale luminance, join to chroma, convert back to original colorspace
+ normalized = luminance.linear(f, a)
+ .bandjoin(chroma)
+ .colourspace(type_before_normalize)
+
+ # Attach original alpha channel, if any
+ if image.has_alpha?
+ # Extract original alpha channel
+ alpha = image[image.bands - 1]
+
+ # Join alpha channel to normalised image
+ normalized.bandjoin(alpha)
+ end
+
+ normalized
+ end
+
@median = lambda do |array|
h = array.size / 2
return array[h] if array[h] != array[h - 1]
@@ -94,8 +139,19 @@ module DHashVips
def fingerprint filename, power = 3
size = 2 ** power
- image = Vips::Image.new_from_file filename, access: :sequential
- image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height)).colourspace("b-w").flatten
+ image = Vips::Image.thumbnail filename, size, height: size, size: :force
+
+ # Need to copy to memory, we have to stay seq
+ image = image.copy_memory
+
+ # Apply normalisation - stretch luminance to cover full dynamic range
+ image = normalize image
+
+ # Flatten it to mid-gray before generating a fingerprint
+ image = image.flatten(background: [128, 128, 128]) if image.has_alpha?
+
+ # Convert to greyscale, extract first band
+ image = image.colourspace(:b_w)[0]
array = image.to_enum.map &:flatten
d1, i1, d2, i2 = [array, array.transpose].flat_map do |a|
This thumbnails with shrink-on-load enabled and normalizes the image by stretching the luminance to cover full dynamic range. With that, I see:
$ rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.12.2-Tue Jan 25 09:34:32 UTC 2022
Version: ImageMagick 6.9.12-44 Q16 x86_64 17145 https://imagemagick.org
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
gem ruby-vips version 2.1.4
gem rmagick version 4.2.5
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
Fingerprint Compare 1/FMI^2
Phamilie 2.079 0.597 4.000
Dhash 2.875 0.893 1.250
DHash 0.268 0.899 1.633
IDHash 0.205 1.596 1.000
(lower numbers are better)
So, I guess this is the best we could get? (Or perhaps I made a mistake somewhere?)
Do I understand right that
Vips.vector_set false
disables some libvips improvement that is good according to other tests?
It disables the vector path of reducev
, which should produce better quality images at the expense of performance. Hopefully a future libvips version would fix this as well, as I'm currently in the process of porting the Pillow-SIMD resizer (which does not have this quality issue) to libvips.
Wow, looks like the trick is in putting the flatten
before converting to b_w
. Maybe it's a bit slower but we already have some improvement by using the thumbnail
method (I hesitated to use it earlier, saw no point, but looks like I totally missed the fact that it has the "shrink-on-load").
def self.fingerprint filename, power = 3
size = 2 ** power
- image = Vips::Image.new_from_file filename, access: :sequential
- image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height)).colourspace("b-w").flatten
-
+ image = Vips::Image.thumbnail(filename, size, height: size, size: :force).flatten.colourspace("b-w")[0]
array = image.to_enum.map &:flatten
d1, i1, d2, i2 = [array, array.transpose].flat_map do |a|
d = a.zip(a.rotate(1)).flat_map{ |r1, r2| r1.zip(r2).map{ |i, j| i - j } }
It is in master.
Now probably the last thing to check would be that https://github.com/libvips/libvips/pull/1769 does not break the 1.000 score.
Great! PR https://github.com/libvips/libvips/pull/1769 does break the 1.000 score of IDHash.
$ rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.13.0
Version: ImageMagick 6.9.12-45 Q16 x86_64 17184 https://imagemagick.org
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
gem ruby-vips version 2.1.4
gem rmagick version 4.2.5
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
Fingerprint Compare 1/FMI^2
Phamilie 2.070 0.690 4.000
Dhash 3.997 0.955 1.250
DHash 0.181 0.934 1.250
IDHash 0.154 2.341 1.125
(lower numbers are better)
(although, the DHash quality seems to have improved)
I'm not sure why this is happening. That PR seems like a good overall improvement, there are some unit tests added in that PR that would otherwise fail: https://github.com/libvips/libvips/blob/a301a5c9b774421ec6a1c01de223964284a2643c/test/test-suite/test_resample.py#L118-L137
For example, resizing a 1600x1000 image to 10 pixels wide now generates a 10x6 image.
$ vips black x.jpg 1600 1000
$ vipsthumbnail x.jpg -s 10x -o vips.jpg
$ convert x.jpg -thumbnail 10x magick.jpg
$ vipsheader vips.jpg magick.jpg
vips.jpg: 10x6 uchar, 1 band, b-w, jpegload
magick.jpg: 10x6 uchar, 1 band, b-w, jpegload
Previous this would result in a 10x7 image.
Increasing the reducing gap here to 3.0
:
https://github.com/kleisauke/libvips/blob/1a219366a0a667f48510afbd2d622ad4279268b4/libvips/resample/resize.c#L160-L163
(which would ensure that always the final 300% is done by vips_reduce{h,v}
)
Produces:
$ rake benchmark
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
vips-8.13.0
Version: ImageMagick 6.9.12-45 Q16 x86_64 17184 https://imagemagick.org
model name : Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
gem ruby-vips version 2.1.4
gem rmagick version 4.2.5
image groups sizes: [1, 2, 2, 2, 2, 2, 2, 2, 2, 1]
step 1 / 3 (fingerprinting)
step 2 / 3 (comparing fingerprints)
step 3 / 3 (looking for the best threshold)
Fingerprint Compare 1/FMI^2
Phamilie 2.056 0.607 4.000
Dhash 3.116 0.923 1.250
DHash 0.177 0.926 1.250
IDHash 0.151 2.199 1.000
(lower numbers are better)
This behavior is also confirmed by Pillow, on which this logic is somewhat based. https://pillow.readthedocs.io/en/stable/releasenotes/7.0.0.html#new-argument-reducing-gap-for-image-resize-and-image-thumbnail-methods
With
reducing_gap
greater or equal to 3.0, the result is indistinguishable from fair resampling.
@kleisauke
I just realised I didn't know that .flatten
is dumb -- I thought it really checks for alpha. Only spotted it now in another project where (for arrayjoin) I supposed that flattening all the images would make them have the same bands count...
Now I use your if
s, gladly the IDHash didn't get worse, only DHash for some reason.
(meanwhile I also decided to increment the third gem semver component when the change affects the numbers)
only DHash for some reason.
Oh, curious. I assume that was this change?: https://github.com/Nakilon/dhash-vips/commit/9e38ae98d63cf35c83230b5916b44cfd370d243e#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R26
Which libvips version was that tested it on? The last time I tried this, the DHash
score decreased to 1.250
, but for some reason IDHash
score increased to 1.125
, when not using a gap
of 3.0
.
Now that I think about this further, I think shrink-on-load could possibly(?) affect that score as well, since it uses a block shrink for the first resize step. Although, at least a factor of two would still be done by the final resize step.
Which libvips version was that tested it on?
$ vips -v
vips-8.11.3-Wed Aug 11 09:29:27 UTC 2021
Ah, that version does not include the improvements landed via PR https://github.com/libvips/libvips/pull/1769, available since libvips 8.13.
So, it seems that flattening only alpha images increased the DHash
score from 1.778
to 1.796
, see for example your first comment: https://github.com/Nakilon/dhash-vips/issues/16#issuecomment-1055753274 (which was also tested with libvips 8.11.3).
FWIW, here's the updated patch that I used to produces the scores mentioned in https://github.com/Nakilon/dhash-vips/issues/16#issuecomment-1105081861.
Details
--- a/lib/dhash-vips.rb
+++ b/lib/dhash-vips.rb
@@ -12,11 +12,12 @@ module DHashVips
def pixelate input, hash_size
image = if input.is_a? Vips::Image
- input.thumbnail_image(hash_size + 1, height: hash_size, size: :force)
+ input
else
- Vips::Image.thumbnail(input, hash_size + 1, height: hash_size, size: :force)
+ Vips::Image.new_from_file input, access: :sequential
end
- (image.has_alpha? ? image.flatten : image).colourspace("b-w")[0]
+ image = image.resize((hash_size + 1).fdiv(image.width), vscale: hash_size.fdiv(image.height), gap: 3.0)
+ image.flatten.colourspace("b-w")[0]
end
def calculate file, hash_size = 8
@@ -94,11 +95,13 @@ module DHashVips
def self.fingerprint input, power = 3
size = 2 ** power
image = if input.is_a? Vips::Image
- input.thumbnail_image(size, height: size, size: :force)
+ input
else
- Vips::Image.thumbnail(input, size, height: size, size: :force)
+ Vips::Image.new_from_file input, access: :sequential
end
- array = (image.has_alpha? ? image.flatten : image).flatten.colourspace("b-w")[0].to_enum.map &:flatten
+ image = image.resize(size.fdiv(image.width), vscale: size.fdiv(image.height), gap: 3.0)
+ image = image.flatten.colourspace("b-w")[0]
+ array = image.to_enum.map &:flatten
d1, i1, d2, i2 = [array, array.transpose].flat_map do |a|
d = a.zip(a.rotate(1)).flat_map{ |r1, r2| r1.zip(r2).map{ |i, j| i - j } }
m = median d.map(&:abs).sort
So, that forcefully disables shrink-on-load, flattens on every image and uses a reducing gap of 3.0
. Anything other than that would increase the IDHash
/DHash
score, which is curious. Here are the various things I tried w/ libvips 8.13.3:
- Remove that
gap
argument (as that is only available in libvips >= 8.13) - causes the default value of2.0
to be used; - Only do the flattening on alpha images (which is good, but somehow increases the score);
- Re-enable shrink-on-load by using
Vips::Image.thumbnail
instead (should not make much difference in theory).
Sigh... now in another project while using the gem for testing I discovered that it produced the hash 0xffff...0000....
for the image https://storage.yandexcloud.net/gems.nakilon.pro/dhash-vips/alpha_only.png
It appears that I misunderstood the .flatten
even more, because it wasn't supposed to just cut the alpha channel off but in fact was converting the image to just zeros. So I used the composite
+:screen
:
https://github.com/Nakilon/dhash-vips/commit/354daf197846cffb53cdaf6d07470dae9bea85e6#diff-06bfa1add88aa00af234039a463c42fdd10e629d95926ed4d091b46d9696adaaR103
I hope I understand it correctly this time? ..D
I also added an explicit test for this image to avoid such bug in future.
I've also added one more test image pair and now the score is at 1.111 on 8.7.4 and 8.10.5 (ruby:slim docker). I'm okay with that it's not 1.0, I'll close the issue for now and release 0.2.2.0 since I guess we've localised the things, and several bugs were fixed since then, but you should be able to comment if have any more related updates or ideas. Or if I messed up the alpha channel thing again ) though I doubt now.