dfa
Getting a NaN / 0 error and mean of empty slice — possible fix?
For my time series (where the deviations were very small and sometimes negligible), I was getting the following error:
/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
And my alpha component was nan (the chart was still being built, but the last value of fluct[e] was NaN).
So I changed the code in dfa.py to work around this problem in the following way (it simply reuses the previous value instead of making it 0 or nan):
if not np.isnan(np.sqrt(np.mean(calc_rms(y, sc)**2))):
    fluct[e] = np.sqrt(np.mean(calc_rms(y, sc)**2))
    prevalue = fluct[e]
else:
    fluct[e] = prevalue
Do you think this is a valid way of addressing this issue? I see that calc_rms is not being calculated, which I believe is what causes the problem.
Here's the data I use (pandas exported as a CSV): https://www.dropbox.com/s/h1yhd45ht1fwe78/xsens_74.csv?dl=0
Hmm, your solution looks like cheating to me. Why would you reuse the previous fluctuation value for a scale that you cannot compute?
I think that instead you need to play with the scale_lim and scale_dens parameters. Alternatively, try to dig into what causes the empty slice problem.
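One way to dig into the empty slice warning is sketched below. It rests on an assumption about dfa.py that has not been checked here: that calc_rms cuts the series into non-overlapping windows of length sc, so a scale longer than the series yields zero windows and np.mean is then taken over an empty array. The helper names n_windows and usable_scales are hypothetical and not part of the library.

```python
import warnings
import numpy as np

def n_windows(n_samples, scale):
    """Number of full, non-overlapping windows of length `scale` (assumed windowing)."""
    return n_samples // scale

def usable_scales(n_samples, scales):
    """Keep only the scales for which at least one full window fits in the series."""
    scales = np.asarray(scales, dtype=int)
    return scales[(scales >= 1) & (scales <= n_samples)]

# The mean of an empty array is NaN and triggers "Mean of empty slice".
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    empty_mean = np.mean(np.array([]))

# Hypothetical example: a series of 100 samples and four candidate scales.
candidate_scales = [16, 64, 128, 256]
kept = usable_scales(100, candidate_scales)
for sc in candidate_scales:
    print(sc, "->", n_windows(100, sc), "window(s)")
print("usable scales:", kept)
```

If the largest scales turn out to have zero windows, lowering the upper bound of scale_lim so that every scale fits in the series should remove the NaN without inventing a fluctuation value.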
I am not too sure how DFA is implemented here, but judging from lines 75-76, we have
for e, sc in enumerate(scales):
    fluct[e] = np.sqrt(np.mean(calc_rms(y, sc)**2))
You cannot avoid this problem. If your data varies too little, then DFA will simply not work.
In my own MFDFA code I have instead used np.power() and np.float_power(), which might perform slightly better for powers of zero. You can try that if you wish.
In the end, you are only getting a warning, so the DFA is still performed. From the point of view of physical interpretation, I would discourage your solution: copying the previous fluctuation will create a spurious "scaling" in your time series.
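If a scale cannot be computed, one alternative to copying a neighbouring value is to leave that scale out of the fit altogether. The sketch below is not the dfa.py implementation. It assumes alpha is the slope of a straight-line fit of log fluctuation against log scale (the usual DFA definition), and fit_alpha is a hypothetical helper name.

```python
import numpy as np

def fit_alpha(scales, fluct):
    """Fit the log-log slope using only scales whose fluctuation is finite and positive."""
    scales = np.asarray(scales, dtype=float)
    fluct = np.asarray(fluct, dtype=float)
    with np.errstate(invalid="ignore"):
        valid = np.isfinite(fluct) & (fluct > 0)
    if valid.sum() < 2:
        raise ValueError("need at least two valid scales to fit a slope")
    # Slope of log2(fluct) against log2(scale); the log base does not change the slope.
    coeff = np.polyfit(np.log2(scales[valid]), np.log2(fluct[valid]), 1)
    return coeff[0], valid

# Synthetic example: an exact power law with exponent 0.7 and one failed scale.
scales = 2 ** np.arange(4, 9)          # 16, 32, 64, 128, 256
fluct = scales.astype(float) ** 0.7
fluct[-1] = np.nan                     # the scale that could not be computed
alpha, valid = fit_alpha(scales, fluct)
print(alpha, valid)
```

Dropping the scale keeps the fit limited to values that were actually measured, at the cost of a shorter scaling range. If most scales fail, that is a sign the data or the scale range is unsuitable, not something the fit should hide.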