Calculation in katz_fd seems to be incorrect
The current implementation of katz_fd in antropy/fractal.py looks like the following:
def katz_fd(x, axis=-1):
x = np.asarray(x)
dists = np.abs(np.diff(x, axis=axis))
ll = dists.sum(axis=axis)
ln = np.log10(ll / dists.mean(axis=axis))
aux_d = x - np.take(x, indices=[0], axis=axis)
d = np.max(np.abs(aux_d), axis=axis)
kfd = np.squeeze(ln / (ln + np.log10(d / ll)))
if not kfd.ndim:
kfd = kfd.item()
return kfd
However, both in the documentation and in the original paper the distance (dist and d) is defined as the Euclidean distance between the given points, which is calculated differently.
A BASIC implementation was also provided in the Appendix of the original paper:
The important differences are in lines 130 and 150. (Note: the exponential operator ^ probably missing due to scanning/printing issues)
With this and by looking at another implementation in MATLAB, I think the above code for katz_fd should be changed to represent the original definition. However, I do not know if this change would break existing works using the current (apparently incorrect) functionality, so I propose the following change:
- Do not remove the current implementation, but include a flag to switch between the "legacy" version and the new version.
- Add a warning if the legacy version is used
- Change the
distsanddcalculations to something like the following (Note: this was only tested with single channel data)
ind = np.arange(len(x)-1)
A = np.stack((ind,x[:-1]))
B = np.stack((ind+1,x[1:]))
dists = np.linalg.norm(B-A,axis=0)
ind = np.arange(len(x))
A = np.stack((ind,x))
first = np.reshape([0,x[0]],(2,1))
aux_d = np.linalg.norm(A-first,axis=0)
d = np.max(aux_d)
If there are any comments or suggestions, please let me know. (Also this is my first contribution to an open source project, so I am a bit unsure about the etiquette/conventions)
Hi @Khaktos,
Thank you for the deep dive and catching this issue ! My opinion is that if the current implementation is incorrect, then we should just go ahead and release a new version of antropy with the correction. Users can decide to downgrade (or not upgrade) their version if they really want to maintain to the old behavior.
If that sounds ok, could you please submit a PR, ideally including unit tests against the Matlab implementation (assuming that it is correct)?
Thanks again, Raphael