Faster resize implementation
@qarmin Sorry to say something so completely off-topic but, with issues turned off and no apparent contact options enabled on your profile...
I find image::imageops::sample::resize taking up 45% of my flamegraph using this. The fact that you've disabled issues suggests, to me, that you may be trying to funnel feature patches back to the upstream project... but I actually want something where there's an apparent chance of getting my contributions released on crates.io.
If I make time to add support for faster implementations like resize or fast_image_resize, would you prefer I open a PR here or publish a fork of your fork?
Originally posted by @ssokolow in https://github.com/qarmin/img_hash/issues/3#issuecomment-1317917894
@ssokolow performance improvements, especially with fast_image_resize, look very promising, so if the crate remains easy to use without major changes to the public API, I will happily merge this.
OK. I first need to do my due diligence to audit fast_image_resize (within what I'm capable of), which may take some time, but I'll put it on the horizon.
I'll probably experiment with designs where there's a method you can call to set a custom resize callback since that'd give people freedom to use any resize implementation they want without getting upstream approval while not changing the experience or dependency list for existing use-cases.
(It'd also avoid (or at least pass the buck on) the risk of making it difficult/impossible to play with the feature flags to support situations like my AMD CPU from 2012 that doesn't have either of those SIMD ISA extensions since it'd be the integrating crate that'd be handling the dependency specification for the resizing crate rather than img_hash.)
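For illustration, the custom resize callback idea could look roughly like the sketch below. Everything in it is hypothetical: `Gray`, `ResizeFn`, `HasherConfig`, `resize_with`, and the nearest-neighbour default are made-up names and stand-ins, not the img_hash API, and the 8x8 hash size and mean-hash step are assumptions chosen only to keep the example small.

```rust
// Hypothetical types for illustration only; none of these names come from img_hash.

/// A minimal 8-bit grayscale image.
#[derive(Clone, Debug, PartialEq)]
pub struct Gray {
    pub width: usize,
    pub height: usize,
    pub pixels: Vec<u8>,
}

/// Signature of a pluggable resize step: (source, target width, target height).
pub type ResizeFn = Box<dyn Fn(&Gray, usize, usize) -> Gray>;

/// Default resize: nearest-neighbour sampling, standing in for the built-in path.
pub fn nearest(src: &Gray, w: usize, h: usize) -> Gray {
    let mut pixels = Vec::with_capacity(w * h);
    for y in 0..h {
        for x in 0..w {
            let sx = x * src.width / w;
            let sy = y * src.height / h;
            pixels.push(src.pixels[sy * src.width + sx]);
        }
    }
    Gray { width: w, height: h, pixels }
}

pub struct HasherConfig {
    hash_width: usize,
    hash_height: usize,
    resize: ResizeFn,
}

impl HasherConfig {
    /// Assumed 8x8 hash size with the default resize function.
    pub fn new() -> Self {
        Self { hash_width: 8, hash_height: 8, resize: Box::new(nearest) }
    }

    /// Swap in any resize implementation; the hashing crate itself gains no new dependency.
    pub fn resize_with(mut self, f: impl Fn(&Gray, usize, usize) -> Gray + 'static) -> Self {
        self.resize = Box::new(f);
        self
    }

    /// Mean hash: one bit per pixel, set when the pixel is at or above the mean.
    pub fn hash(&self, img: &Gray) -> Vec<bool> {
        let small = (self.resize)(img, self.hash_width, self.hash_height);
        let sum: u64 = small.pixels.iter().map(|&p| p as u64).sum();
        let mean = sum / small.pixels.len() as u64;
        small.pixels.iter().map(|&p| p as u64 >= mean).collect()
    }
}
```

With a shape like this, the integrating crate would call something like `HasherConfig::new().resize_with(my_resize)` and pick its own resize dependency and feature flags, while existing users who never call `resize_with` would keep the default path.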
Looking into this myself, because resizing from a 1080p video was taking up ~45% of my runtime.
I tried scaling the images before passing them to image_hasher, and got good results performance-wise - image_hasher dropped to 0.3% for 32x32 prescale, and 0.1% for 8x8 (I assumed that's the default hasher config), and overall performance was definitely much faster.
In my case, I was reading frames using ffmpeg_next and already using their "scaling context" to change the pixel format to grayscale. All I had to do was change the output resolution too. In theory, this should also apply to any other resize algorithm like fast_image_resize.
It does seem to affect the results of the hash. I'm still very new to image hashing; hopefully I'll have some concrete data to share once I get more experience with this.
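As a rough sketch of the prescaling idea without any particular decoder or resize crate, here is a plain box-filter downscale of a grayscale buffer. The function name `box_downscale` and the choice of a box filter are assumptions for illustration; this is not what ffmpeg's scaling context or image_hasher does internally, which is one reason the resulting hashes can differ.

```rust
/// Downscale an 8-bit grayscale buffer by averaging the source pixels
/// that fall into each output cell (a simple box filter).
/// `src` is row-major with dimensions `sw` x `sh`; output is `dw` x `dh`.
pub fn box_downscale(src: &[u8], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<u8> {
    assert_eq!(src.len(), sw * sh, "buffer size must match dimensions");
    assert!(dw > 0 && dh > 0 && dw <= sw && dh <= sh, "only downscaling is supported");

    let mut out = Vec::with_capacity(dw * dh);
    for dy in 0..dh {
        // Source rows covered by this output row.
        let y0 = dy * sh / dh;
        let y1 = ((dy + 1) * sh / dh).max(y0 + 1);
        for dx in 0..dw {
            // Source columns covered by this output column.
            let x0 = dx * sw / dw;
            let x1 = ((dx + 1) * sw / dw).max(x0 + 1);

            let mut sum = 0u64;
            for y in y0..y1 {
                for x in x0..x1 {
                    sum += src[y * sw + x] as u64;
                }
            }
            let count = ((y1 - y0) * (x1 - x0)) as u64;
            out.push((sum / count) as u8);
        }
    }
    out
}
```

A 1920x1080 frame would be reduced with `box_downscale(&frame, 1920, 1080, 32, 32)` before being handed to the hasher, so the hasher's own resize only has to handle a 32x32 input.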
Any news on this front?
Not on my part. Other things came up and the project I use img_hash in hasn't yet reached the "make it fast" part of "make it work, make it right, make it fast".
With my fragile image modification skills, I tried to implement this, and the processing time is indeed shorter. The table shows the results of processing 120 256px images with all resize/calculate algorithm combinations (not very representative, but it should be enough for now):
| build profile | default | fast_resize | sse41 | avx2 |
|---|---|---|---|---|
| debug | 4.76s | 2.2s | 2.2s | 2.2s |
| release | 82ms | 49ms | 51ms | 49ms |
However, I have one big problem in https://github.com/qarmin/img_hash/pull/24: the algorithms give different results than image-rs. Maybe it's my fault, but I don't see the Triangle algorithm in fast_image_resize, even nearest is not 100% identical to image-rs, and fast_resize provides three kinds of algorithms (Lanczos3, Gaussian, and CatmullRom), none of which gives results identical to the original.
So in my opinion there are several possible solutions to this:
- changing/improving/adding missing/incorrect algorithms to fast_image_resize
- leaving the library behind the optional feature and marking that fast_image_resize gives different results for most combinations
- not implementing it because it gives different results than expected
In https://github.com/qarmin/img_hash/pull/24 in version 3.0.0 I added experimental support for faster resizing, so feel free to test it.
As I wrote before, it gives slightly different results than image-rs. I reported the issue to the fast_image_resize repo, but I don't know whether 100% compatibility with image-rs is a goal of that project: https://github.com/Cykooz/fast_image_resize/issues/42
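One way to put a number on how much two resize paths disagree is to hash the same image both ways and count the differing bits. The helper below is a minimal sketch under the assumption that both hashes are available as raw byte slices; `hamming_distance` is a hypothetical name, not a function from either crate.

```rust
/// Number of differing bits between two hashes given as raw bytes.
/// Returns `None` when the hashes have different lengths and cannot be compared.
pub fn hamming_distance(a: &[u8], b: &[u8]) -> Option<u32> {
    if a.len() != b.len() {
        return None;
    }
    // XOR leaves a 1 bit wherever the two hashes disagree.
    Some(a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum())
}
```

A distance of 0 would mean the two resize paths produced identical hashes for that image; small non-zero distances may still be acceptable for similarity search, but they would break exact-match comparisons against hashes stored from the image-rs path.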