pywt icon indicating copy to clipboard operation
pywt copied to clipboard

Proof of concept on how to fix issues #531, #535 and #570

Open amanita-citrina opened this issue 5 years ago • 5 comments

This pull request is intended as a proof of concept to fix several issues of the CWT:

  • #531 Precision of CWT
  • #535 Complex Morlet wavelet not symmetric at fractional scales
  • #570 Allow user to set precision in CWT

The issues are due to some pecularities of the CWT implementation:

  • The wavelet function is sampled at 2**precision points. To compute the CWT, actual sampling instants are replaced by the values at the nearest (truncated) precomputed sampling instants. This yields inaccurate sampling values, which in turn results in all kinds of artefacts, as experienced in #531 and #570.
  • The computation includes an integration step (using np.cumsum), followed later on by a differentiation step (using np.diff). This results in a loss of accuracy due to the accumulation of values (see #570). As far as I understand, the integration/differentiation step is not necessary at all and can be omitted.
  • There is no provision to symmetrize sampling instants (see #535).

This pull request is intended as a starting point to remedy these problems. The attached plots show that the problems described in #531 and #535 are no longer visible.

  • The integration/differentation step has been removed.
  • Sample values are taken at the correct sampling instants.
  • If the range of the wavelet includes 0, samples are chosen such that 0 is a sampling instant.

Nevertheless, there are some open issues that have to be addressed:

  • Currently, the samples are interpolated from 2**precision samples of the wavelet, which is a computationally efficient solution. There seems to be no need to increase precision beyond 10. However, this creates a dependency on scipy.interpolate due to the cubic spline interpolation. There are at least two alternatives:
    • Apply linear interpolation with np.interp, which might impact accuracy
    • Evaluate the original wavelet, which likely increases computation time
  • The scaling seems correct, but is it?
  • Sampling instants might exclude upper and/or lower boundary points.
  • One pywt test fails (test_cwt_small_scales). Issue_531 Issue_535

amanita-citrina avatar Oct 12 '20 12:10 amanita-citrina

Thanks for looking at this @amanita-citrina. I will ping @OverLordGoldDragon, who has also been looking at CWT recently (e.g. #570), for awareness.

In general, it will be good for PyWavelets to have additional contributors working on CWT because I am personally more familiar with the DWT and SWT code.

There are probably other test cases comparing to legacy Matlab outputs that will fail here, but I don't know that we need to maintain that compatibility long-term going forward. Matlab's own CWT since 2015 or so has defaulted to a newer implementation and no longer matches the behavior that those cases tested against.

grlee77 avatar Oct 12 '20 17:10 grlee77

I'll take a closer look later - for now, I'm opposed to ridding of the integrated wavelet; it confers advantages over scipy's direct wavelet recomputing approach. In-depth here and here.

OverLordGoldDragon avatar Oct 12 '20 18:10 OverLordGoldDragon

Had a look; this looks more like a PR for scipy's CWT. Have a look at more comparisons here, also against another CWT I'm working on right now which may beat all (pywt, scipy, MATLAB) and suit your needs.

Low scales are indeed challenging, and pywt's integrated wavelet handles them better than scipy's recomputed approach; detailed comparison here. The singles-length wavelets for low scales exacerbate this problem. I haven't looked into fixing symmetry, but there'll be challenges at low scales due to sampling limitations; forcing symmetry may forbid some scales entirely (ssqueezepy's doesn't have this problem).

The interpolation idea for resampling seems to have potential, so I wouldn't toss the idea - just need to rework the resampling length. Behavior at low scales should be verified with test signals, for example as here (code provided); I haven't done it for your PR.

Whatever the case, I'd suggest against ridding of integrated wavelet, as it's a unique alternative to other implementations that works well and may be further improved. (And no, it's not quite undone by diff; also addressed here).

OverLordGoldDragon avatar Oct 14 '20 23:10 OverLordGoldDragon

My tentative pull request addressed three different issues, which in retrospect was not a great idea.

In my opinion, the most important issue is the artefacts at low scales (high frequencies). These are due to incorrect sampling instants (issue #531), easy to notice in a plot and very confusing. In fact, I gave up on pywt for my application because I did not trust the results.

There are several ways to get rid of the artefacts:

  • Increase precision, i.e., compute more samples of the wavelet to reduce the sampling offsets
  • Resample the original wavelet at the correct sampling instants
  • Interpolate from the precomputed samples

I favour interpolation (perhaps combined with someewhat increased precision) because it addresses the root problem - incorrect sampling instants - in a low-complexity way. Also, it is very easy to integrate in the current implementation. I pushed a branch in my fork (https://github.com/amanita-citrina/pywt/tree/Issue_531_Interpolation) to sketch how to do it. To avoid a dependency on scipy, I used linear interpolation (numpy.interp). With precision=10, this eliminates the visible artefacts of #531.

Some quick remarks on the other issues addressed in my pull request:

  • Integration of wavelet: This seems to be a deeper issues than I thought. I tried to understand the material @OverLordGoldDragon provided, but failed at some point. Therefore, I cannot argue against the current implementation with cumsum/diff.
  • Symmetry of sampling instants around 0: This might be more of an aesthetic problem when you convolve a symmetric wavelet with a Dirac function (impulse). Any regular sampling should be OK as long as there is no aliasing, i.e., the scale is sufficiently large. In any case, #535 should not distract from the main issue - incorrect sampling instants.

amanita-citrina avatar Oct 19 '20 14:10 amanita-citrina

@amanita-citrina Have you tried cwt_fwd? It excels at low scales and crushes pywt and scipy.

But yeah, tackling multiple issues in one PR isn't a safe bet.

OverLordGoldDragon avatar Oct 19 '20 21:10 OverLordGoldDragon