opencv_contrib
EMD L1 interpreting every input matrix as 1-dimensional data
I've switched from the OpenCV EMD method (https://github.com/opencv/opencv/blob/4.x/modules/imgproc/src/emd.cpp) to the EMDL1 implementation because of the massive speed benefit promised by the paper it is based on.
At first it looked good, because I got a 1000x speed increase on the same data (comparison of 20x32 matrices).
But then I realised that the descriptions of the input parameters differ slightly. EMD's description says: 'First signature, a \f$\texttt{size1}\times \texttt{dims}+1\f$ floating-point matrix. Each row stores the point weight followed by the point coordinates.' while EMDL1's says: 'First signature, a single column floating-point matrix. Each row is the value of the histogram in each bin.'
So the same signature will be interpreted differently by the two functions, which I think is already a bug, or at least very unintuitive.
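To make the difference concrete, here is a minimal sketch of the two layouts as I read them from the documentation quoted above. It uses plain `std::vector` instead of `cv::Mat` so that it stands alone, and the helper names `toEmdLayout` and `toEmdL1Layout` are hypothetical, not part of OpenCV. It also assumes that the bins of a 2-D histogram are listed in row-major order, which neither description states.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): EMD-style layout as described
// in the EMD documentation. One row per bin: { weight, row coord, col coord }.
std::vector<std::vector<float>> toEmdLayout(const std::vector<std::vector<float>>& hist)
{
    std::vector<std::vector<float>> sig;
    for (std::size_t r = 0; r < hist.size(); ++r)
        for (std::size_t c = 0; c < hist[r].size(); ++c)
            sig.push_back({hist[r][c], static_cast<float>(r), static_cast<float>(c)});
    return sig;
}

// Hypothetical helper (not part of OpenCV): EMDL1-style layout as described
// in the EMDL1 documentation. A single column holding only the bin values;
// row-major order is an assumption made for this sketch.
std::vector<float> toEmdL1Layout(const std::vector<std::vector<float>>& hist)
{
    std::vector<float> sig;
    for (const auto& row : hist)
        for (float v : row)
            sig.push_back(v);
    return sig;
}
```

For a 20x32 matrix the first helper yields 640 rows of 3 values each, the second 640 single values. In the second layout the bin coordinates are gone, so the function cannot tell from the signature alone that the data was 2-dimensional.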
But given the implementation of EMDL1, it clearly has the capability to work with 2- or 3-dimensional data.
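Why the dimensionality matters can be shown with a small reference sketch. It is not the EMDL1 code; `emd1d` is a hypothetical helper that uses the well-known 1-D shortcut: for two histograms of equal total mass and ground distance |i - j|, the EMD is the sum of the absolute running differences. I assume here that a 2-D matrix treated as 1-dimensional data is read in row-major order.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not OpenCV code): EMD between two 1-D
// histograms of equal total mass with ground distance |i - j|.
// In 1-D this equals the sum of absolute differences of the running sums.
double emd1d(const std::vector<double>& a, const std::vector<double>& b)
{
    double carried = 0.0; // surplus mass that still has to move to later bins
    double cost = 0.0;
    for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
        carried += a[i] - b[i];
        cost += std::fabs(carried); // moving the surplus one bin further
    }
    return cost;
}
```

Take a 2x3 matrix with a unit mass at (0,0) and a second one with the unit mass at (1,0). In 2-D the L1 ground distance between these bins is 1. Flattened row-major, the masses sit at indices 0 and 3, and `emd1d({1,0,0,0,0,0}, {0,0,0,1,0,0})` gives 3. Under the same assumption, vertically adjacent bins of a 20x32 matrix would end up 32 bins apart.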
A simple change in line 64 of https://github.com/opencv/opencv_contrib/blob/4.x/modules/shape/src/emdL1.cpp to correctly determine the dimensionality like
'''
if (!initBaseTrees((int) sig1.at
'''
does the trick for me, but I don't know whether it is OK to assume that the signature has the right format to read out the last entry, or whether it is efficient to read it out like that. I will still propose a PR once I find the time.
I would really appreciate some discussion about this.
I also found that maxIterations is not exposed. The results were sometimes the same as EMD(sig1, sig2, DIST_L1) and sometimes not, seemingly at random, so I would like to find a clean way to expose setMaxIterations.
My solution as proposed is not correct for signatures of 3D matrices, so it should be revised anyway.
Max Iterations is not documented