Getting a NaN / 0 error and mean of empty slice — possible fix?

Open deemeetree opened this issue 3 years ago • 2 comments

For my time series (where the deviations were very small and sometimes negligible), I was getting the following warnings:

/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py:3335: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)

My alpha component was NaN: the chart was still being built, but the last value of fluct[e] was NaN.

So I changed the code in dfa.py to work around this problem as follows (it simply reuses the previous value instead of leaving it as 0 or NaN):

    rms = np.sqrt(np.mean(calc_rms(y, sc)**2))  # compute once per scale
    if not np.isnan(rms):
        fluct[e] = rms
        prevalue = fluct[e]
    else:
        fluct[e] = prevalue  # reuse the last valid fluctuation

Do you think this is a valid way of addressing the issue? It looks like calc_rms is not being computed for that scale, which I believe is the source of the problem.

Here's the data I use (pandas exported as a CSV): https://www.dropbox.com/s/h1yhd45ht1fwe78/xsens_74.csv?dl=0

deemeetree avatar Nov 09 '20 16:11 deemeetree

Hmm, your solution looks like cheating to me. Why would you take the previous fluctuation value for a scale that you cannot compute?

I think that instead you need to play with the scale_lim and scale_dens parameters. Alternatively, try to dig into what causes the empty slice problem.
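
One way to look into the empty slice warning is a small check like the sketch below. It assumes (this is not confirmed anywhere in this thread) that the RMS helper splits the series into len(x) // scale non-overlapping windows, so a scale longer than the series would leave zero windows to average. The names count_windows and find_empty_scales are hypothetical, as are the sample length and scales.

```python
import warnings
import numpy as np


def count_windows(n_samples, scale):
    """Number of full, non-overlapping windows of length `scale` (assumed scheme)."""
    return n_samples // scale


def find_empty_scales(n_samples, scales):
    """Return the scales that would produce zero windows under the assumed scheme."""
    return [int(sc) for sc in scales if count_windows(n_samples, sc) == 0]


# np.mean of an empty array returns NaN and emits "Mean of empty slice".
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    empty_mean = np.mean(np.array([]))

# Hypothetical example: a series of 100 samples and scales from 16 to 256.
example_scales = np.array([16, 32, 64, 128, 256])
too_large = find_empty_scales(100, example_scales)
print(np.isnan(empty_mean), too_large)
```

If some scales are flagged this way for your data, that would point toward adjusting scale_lim rather than patching the fluctuation values.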

dokato avatar Nov 11 '20 14:11 dokato

I am not too sure how the DFA is implemented here, but from what I can judge from lines 75-76, we have

    for e, sc in enumerate(scales):
        fluct[e] = np.sqrt(np.mean(calc_rms(y, sc)**2))

You cannot avoid this problem. If your data varies too little, then DFA will simply not work. In my own MFDFA code I have instead used np.power() and np.float_power(), which might behave slightly better for powers of zero. You can try that if you wish.

In the end, you are only getting a warning, so the DFA is still performed. From the point of view of the physical interpretation, I would discourage your solution: copying the previous fluctuation will create a spurious "scaling" in your time series.
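
To illustrate the alternative to copying values, here is a minimal sketch that leaves out the scales whose fluctuation could not be computed and fits the slope over the remaining ones. This is not how dfa.py handles it and not a fix proposed in this thread; fit_alpha is a hypothetical helper, and the data below are synthetic.

```python
import numpy as np


def fit_alpha(scales, fluct):
    """Fit the log-log slope using only scales with a finite, positive fluctuation."""
    scales = np.asarray(scales, dtype=float)
    fluct = np.asarray(fluct, dtype=float)
    valid = np.isfinite(fluct) & (fluct > 0)
    if valid.sum() < 2:
        raise ValueError("need at least two valid scales to fit a slope")
    # np.polyfit returns [slope, intercept] for a degree-1 fit.
    slope, _ = np.polyfit(np.log2(scales[valid]), np.log2(fluct[valid]), 1)
    return slope, valid


# Synthetic values with a slope of 0.5 and a NaN at the largest scale.
scales = np.array([16, 32, 64, 128, 256])
fluct = np.sqrt(scales).astype(float)
fluct[-1] = np.nan
alpha, used = fit_alpha(scales, fluct)
print(alpha, used)
```

Dropping a scale shortens the fitting range instead of inventing a data point, so the slope is estimated only from values that were actually computed. Whether enough scales remain for a meaningful fit still depends on the data.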

LRydin avatar Nov 11 '20 14:11 LRydin