
PPG_findpeaks detection failure with single spike artifact

Open HotDog702 opened this issue 2 years ago • 4 comments

If the PPG signal contains just a single (but big) spike, `ppg_findpeaks` returns only a single value: the position of that spike. The problem is in the threshold calculation: `thr1 = ma_beat + beatoffset * np.mean(sqrd)`, and `np.mean(sqrd)` is very sensitive to outliers.
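A minimal synthetic illustration of that sensitivity (this is not NeuroKit code; the clipped-and-squared `sqrd` here only approximates what the detector computes internally):

```python
import numpy as np

t = np.arange(0, 10, 0.01)                    # 10 s of PPG at 100 Hz
ppg = np.sin(2 * np.pi * 1.2 * t)             # clean ~72 bpm pulse wave
spiked = ppg.copy()
spiked[500] += 50.0                           # one large spike artifact

# Roughly what the detector squares after clipping negative values to zero
sqrd = np.square(np.clip(ppg, 0, None))
sqrd_spiked = np.square(np.clip(spiked, 0, None))

print(np.mean(sqrd), np.mean(sqrd_spiked))    # the single spike inflates the mean
                                              # by an order of magnitude, lifting
                                              # thr1 above every normal beat
```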

HotDog702 avatar Nov 30 '21 12:11 HotDog702

Hi 👋 Thanks for reaching out and opening your first issue here! We'll try to come back to you as soon as possible. ❤️ kenobi

welcome[bot] avatar Nov 30 '21 12:11 welcome[bot]

A first idea is to divide the signal into N parts and take the minimum of `np.mean(sqrd)` across those parts; another is to use a moving average with a big window (2 s or more).
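A sketch of the first idea, assuming the same `sqrd` array as above (the function name and chunk count are just illustrative):

```python
import numpy as np

def chunked_pedestal(sqrd, n_chunks=10):
    """Minimum of per-chunk means, so one bad chunk cannot inflate the pedestal."""
    chunks = np.array_split(sqrd, n_chunks)
    return min(np.mean(chunk) for chunk in chunks)

# thr1 = ma_beat + beatoffset * chunked_pedestal(sqrd)   # in place of np.mean(sqrd)
```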

HotDog702 avatar Nov 30 '21 12:11 HotDog702

I see what you mean. It's true that our PPG module only has one peak-detection algorithm available, so it might not be the best for all use cases like yours.

We would like to expand our PPG processing tools to support more methods for peak detection. However, our current approach is to add mostly algorithms that are published/implemented elsewhere.

If you know of other algorithms for PPG peak detection that have been published, validated, or implemented elsewhere, or if you have your own that you've internally benchmarked and tested, then we'd be happy if you could make a PR to add it as a new method!

DominiqueMakowski avatar Dec 02 '21 00:12 DominiqueMakowski

This issue has been automatically marked as inactive because it has not had recent activity. It will eventually be closed if no further activity occurs.

stale[bot] avatar Jun 10 '22 17:06 stale[bot]

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Sep 08 '22 17:09 stale[bot]

Hmm, two years later I'm facing this issue again. The problem is still the threshold's pedestal, which is calculated over the whole signal, so a problem somewhere at the edge of the recording (lost contact with the PPG sensor, for example) breaks the whole algorithm. Why not change the mean to the median, which is more robust to outliers? We could also analyze only `sqrd > 0` to ignore the negative portion of the signal: `thr1 = ma_beat + beatoffset * np.median(sqrd[sqrd > 0])`. My earlier idea of splitting the signal into chunks and taking the minimum of the means is also effective. If you don't want to change the algorithm, could you add any of the above as a keyword (`**kwargs`) option? Thank you
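A sketch of that median-based pedestal, with the empty-array edge case handled (variable names follow the discussion; this is not the shipped NeuroKit code):

```python
import numpy as np

def robust_pedestal(sqrd):
    """Median of the positive squared samples; far less sensitive to a lone spike."""
    positive = sqrd[sqrd > 0]
    return np.median(positive) if positive.size else 0.0

# thr1 = ma_beat + beatoffset * robust_pedestal(sqrd)
```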

HotDog702 avatar Oct 04 '23 12:10 HotDog702

Rather than changing existing algorithms, I'm wondering whether it wouldn't be easy to allow for a "scipy" method of peak detection that would simply call https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.find_peaks.html, and to which people can pass the kwargs they want. This would allow some flexibility without the need to create a custom pipeline.
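A rough sketch of what such a wrapper could look like (the function name, default `distance`, and signature are hypothetical, not NeuroKit's current API):

```python
import scipy.signal

def ppg_findpeaks_scipy(ppg_cleaned, sampling_rate=1000, **kwargs):
    # Default to a ~0.3 s refractory period between beats, but let callers
    # override anything that scipy.signal.find_peaks accepts.
    kwargs.setdefault("distance", max(1, int(0.3 * sampling_rate)))
    peaks, _ = scipy.signal.find_peaks(ppg_cleaned, **kwargs)
    return peaks

# e.g. peaks = ppg_findpeaks_scipy(signal, sampling_rate=100, prominence=0.5)
```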

@danibene what do you think?

DominiqueMakowski avatar Oct 04 '23 13:10 DominiqueMakowski

@DominiqueMakowski I think a scipy method would be fine to add but I'm not sure if it would solve the issue here?

Seems like it is more of a question of when we want to:

  1. Add parameters to modify existing algorithms
  2. Add new methods for modifications of existing algorithms

And then there's the question of what the "threshold" should be for adding or changing algorithms: if a method is not published, perhaps some demonstration (e.g., an evaluation on an open dataset) showing that it improves on an existing one?

danibene avatar Oct 04 '23 19:10 danibene

Well, the user can play with the scipy parameters to try to get something better than the existing methods for that specific signal, but I agree that the initial question is not really about that.

In general, I'm at this stage not super keen on adding new parameters to tweak existing methods just because there is one particular case where it fails. It feels arbitrary and would imo require validation studies.

Hence the idea of enabling a generic peak-detection method like scipy's, so that users can eventually compare it with signal-specific methods.

@HotDog702 to answer your issue in particular: TBH, if I were you I would probably 1) identify the "bad" region (where the artifact is), 2) replace it with NaNs, 3) interpolate over it to remove the spike, and then proceed as normal.
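A minimal sketch of that recipe (the MAD-based spike criterion and the threshold of 5 are illustrative assumptions; use whatever reliably flags your artifact):

```python
import numpy as np
import pandas as pd

def remove_spikes(ppg, z_thresh=5.0):
    ppg = np.asarray(ppg, dtype=float)
    mad = np.median(np.abs(ppg - np.median(ppg))) + 1e-12
    z = (ppg - np.median(ppg)) / mad
    cleaned = pd.Series(ppg)
    cleaned[np.abs(z) > z_thresh] = np.nan                         # 1-2) mark bad samples as NaN
    return cleaned.interpolate(limit_direction="both").to_numpy()  # 3) fill the gap linearly

# cleaned = remove_spikes(raw_ppg)  # then run ppg_findpeaks on `cleaned` as usual
```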

DominiqueMakowski avatar Oct 05 '23 07:10 DominiqueMakowski

Actually, identifying the "bad" region is another challenge in signal analysis, and not an easy one either. `scipy.signal.find_peaks` is good and can be used instead of `ppg_findpeaks`, but its result is also not ideal.

About the main question: the origin of the problem is a nonstationary signal. In my case its amplitude can change dramatically, so deriving amplitude thresholds from the whole signal is not correct. I understand your concern about unpublished changes to the algorithm. As a compromise, I suggest using a moving average with some big (symmetric) window instead of the mean of the whole record: `thr1 = ma_beat + beatoffset * _moving_average(sqrd, window)`. When `window >= len(signal)`, we get back the original algorithm; otherwise, we get a useful modification that is less sensitive to local artifacts. P.S. `_moving_average(.)` above is pseudocode to illustrate the idea.
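One way to make that pseudocode concrete (a sketch only; with the whole-record window it only approximately reduces to the global mean because of edge effects):

```python
import numpy as np

def _moving_average(x, window):
    """Centred (symmetric) moving average of x with the given window length."""
    window = max(1, min(int(window), len(x)))
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")   # same length as x

# thr1 = ma_beat + beatoffset * _moving_average(sqrd, window=2 * sampling_rate)
```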

HotDog702 avatar Oct 05 '23 12:10 HotDog702