mpicbg
Slightly speed up SIFT matching
Profiling a downstream match derivation process that uses the MPICBG implementation of SIFT revealed that the single largest chunk of runtime was spent computing the distance between the derived features. By unrolling the corresponding for-loop, I was able to speed up the match derivation a little: ~30% with descriptor size parameter 4, ~40% with descriptor size parameter 7. (As far as I can see, the actual descriptor size is simply proportional to the descriptor size parameter and is always divisible by 4.)
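To illustrate the idea for reviewers, here is a minimal sketch of a 4-way unrolled squared-difference loop next to the plain version. The class and method names are hypothetical and are not the code in this PR or the MPICBG API; the use of four independent accumulators is an assumption about one common way to unroll such a loop.

```java
// Hypothetical sketch, not the MPICBG implementation.
final class DescriptorDistanceSketch {

    // Baseline: one squared difference accumulated per iteration.
    static double distanceSimple(final float[] a, final float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; ++i) {
            final double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Unrolled by 4 with independent accumulators (an assumption),
    // which shortens the dependency chain on a single sum variable.
    static double distanceUnrolled(final float[] a, final float[] b) {
        final int n = a.length;
        final int limit = n - (n % 4);
        double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
        int i = 0;
        for (; i < limit; i += 4) {
            final double d0 = a[i] - b[i];
            final double d1 = a[i + 1] - b[i + 1];
            final double d2 = a[i + 2] - b[i + 2];
            final double d3 = a[i + 3] - b[i + 3];
            s0 += d0 * d0;
            s1 += d1 * d1;
            s2 += d2 * d2;
            s3 += d3 * d3;
        }
        // Tail loop; never entered when the length is divisible by 4.
        for (; i < n; ++i) {
            final double d = a[i] - b[i];
            s0 += d * d;
        }
        return Math.sqrt(s0 + s1 + s2 + s3);
    }
}
```

Because the unrolled version adds the terms in a different order, its result can differ from the plain loop in the last few bits. With a single shared accumulator the summation order, and therefore the result, would stay identical.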
Even though the whole thing could be sped up even further (by a low single-digit percentage) by using float instead of double throughout the computation, I decided not to do that in order not to lose any accuracy.
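For comparison, a float-only variant could look like the sketch below. The names are again hypothetical, and whether float accumulation is accurate enough depends on the descriptor value range and length, which this sketch does not attempt to settle.

```java
// Hypothetical sketch of the float-only alternative that this PR does NOT apply.
final class FloatAccumulationSketch {

    // Accumulates squared differences in double precision.
    static double squaredDistanceDouble(final float[] a, final float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; ++i) {
            final double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Accumulates in single precision: every multiply and add is
    // rounded to float, so rounding error can build up over long descriptors.
    static float squaredDistanceFloat(final float[] a, final float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; ++i) {
            final float d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

Running both methods on real descriptor pairs and comparing the resulting match sets would be one way to check whether the difference matters in practice.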
Let me know what you think, @axtimwalde & @StephanPreibisch! In particular, can the distance computation also be done with floats?
Edit: the speedup of about 30% seems to be consistent across architectures. I tested this on x64 (Intel Xeon Gold) and Apple Silicon (M4 Max).