NeuroKit
`ecg_delineate` with `peak` method fails with `ValueError: cannot convert float NaN to integer`
Describe the bug
See the attached dataset: `ecg_delineate` fails with a ValueError. I don't know exactly what is going on, but two things seem crucial to trigger the bug:
- clean the data with `engzeemod2012`
- delineate with method `peak`
See the sample code below.
/usr/local/lib/python3.9/site-packages/neurokit2/ecg/ecg_delineate.py in ecg_delineate(ecg_cleaned, rpeaks, sampling_rate, method, show, show_type, check)
118 method = method.lower() # remove capitalised letters
119 if method in ["peak", "peaks", "derivative", "gradient"]:
--> 120 waves = _ecg_delineator_peak(ecg_cleaned, rpeaks=rpeaks, sampling_rate=sampling_rate)
121 elif method in ["cwt", "continuous wavelet transform"]:
122 waves = _ecg_delineator_cwt(ecg_cleaned, rpeaks=rpeaks, sampling_rate=sampling_rate)
/usr/local/lib/python3.9/site-packages/neurokit2/ecg/ecg_delineate.py in _ecg_delineator_peak(ecg, rpeaks, sampling_rate)
685 except ImportError:
686 raise ImportError(
--> 687 "NeuroKit error: ecg_delineator(): the 'PyWavelets' module is required for this method to run. ",
688 "Please install it first (`pip install PyWavelets`).",
689 )
/usr/local/lib/python3.9/site-packages/neurokit2/ecg/ecg_segment.py in ecg_segment(ecg_cleaned, rpeaks, sampling_rate, show)
56 rpeaks=rpeaks, sampling_rate=sampling_rate, desired_length=len(ecg_cleaned)
57 )
---> 58 heartbeats = epochs_create(
59 ecg_cleaned, rpeaks, sampling_rate=sampling_rate, epochs_start=epochs_start, epochs_end=epochs_end
60 )
/usr/local/lib/python3.9/site-packages/neurokit2/epochs/epochs_create.py in epochs_create(data, events, sampling_rate, epochs_start, epochs_end, event_labels, event_conditions, baseline_correction)
121 # Find the maximum numbers of samples in an epoch
122 parameters["duration"] = list(np.array(parameters["end"]) - np.array(parameters["start"]))
--> 123 epoch_max_duration = int(max((i * sampling_rate for i in parameters["duration"])))
124
125 # Extend data by the max samples in epochs * NaN (to prevent non-complete data)
ValueError: cannot convert float NaN to integer
To Reproduce
import neurokit2 as nk
import pandas as pd
ecg = pd.read_csv('sample_ecg.csv')['channel'].values
clean_ecg = nk.ecg_clean(ecg, sampling_rate=250, method="engzeemod2012")
rp, rpeaks = nk.ecg_peaks(clean_ecg, sampling_rate=250)
wv, waves_peak = nk.ecg_delineate(clean_ecg, rpeaks["ECG_R_Peaks"], sampling_rate=250, show=True, method="peak")
Expected behaviour
I expected an array with the detected waves, but the call fails with a ValueError instead.
System Specifications
>>> nk.version()
- OS: Darwin ( 64bit)
- Python: 3.9.6
- NeuroKit2: 0.1.2
- NumPy: 1.19.5
- Pandas: 1.2.3
- SciPy: 1.6.1
- sklearn: 0.24.2
- matplotlib: 3.4.2
Hi, can you update NeuroKit to the latest version (0.1.4.1) and try again? Thanks
Hi @timvlaer
I was able to replicate your errors with the latest version of NeuroKit. However, I noticed a problem with the ECG signal you provided.
import neurokit2 as nk
import pandas as pd
import matplotlib.pyplot as plt

ecg = pd.read_csv('sample_ecg.csv')['channel'].values
clean_ecg_engz = nk.ecg_clean(ecg, sampling_rate=250, method="engzeemod2012")
clean_ecg_default = nk.ecg_clean(ecg, sampling_rate=250)

# Overlay the raw signal and both cleaned versions for comparison
fig = plt.figure()
plt.plot(ecg, label='raw')
plt.plot(clean_ecg_engz, label='engzeemod2012 cleaning')
plt.plot(clean_ecg_default, label='neurokit cleaning')
plt.legend()
The ECG signal here is simply too noisy to be cleaned, as you can see in the outputs of both the `engzeemod2012` and `neurokit` methods.
clean_ecg_engz = nk.ecg_clean(ecg, sampling_rate=250, method="engzeemod2012")
rp, rpeaks_engz = nk.ecg_peaks(clean_ecg_engz, sampling_rate=250)
Note that passing `rpeaks_engz` as input causes all delineation methods to fail, because the problem lies upstream. Only 3 R-peaks were detected, which is too few to estimate a reliable heart rate, and thus `nk.ecg_segment()` returns NaNs. The deeper problem is that the detected R-peaks may not be reliable at all, given the state of the signal.
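The failing expression from the traceback can be reproduced without any ECG data. Below is a minimal sketch, assuming every epoch duration that reaches `epochs_create` is NaN. The helper `epoch_max_duration` is hypothetical and not part of NeuroKit; it only illustrates how an explicit NaN check could turn the crash into a clearer message.

```python
import math


def epoch_max_duration(durations, sampling_rate):
    """Hypothetical guarded version of the failing expression (not NeuroKit API)."""
    scaled = [d * sampling_rate for d in durations]
    if any(math.isnan(s) for s in scaled):
        raise ValueError("epoch durations contain NaN; check the R-peaks passed upstream")
    return int(max(scaled))


# Assumption: all epoch durations are NaN
nan_durations = [float("nan")] * 3

# The unguarded expression fails with the ValueError shown in the traceback
try:
    int(max(d * 250 for d in nan_durations))
except ValueError as err:
    unguarded_message = str(err)

# The guarded helper fails earlier, with a message that points upstream
try:
    epoch_max_duration(nan_durations, 250)
except ValueError as err:
    guarded_message = str(err)
```

With finite durations the helper behaves like the original expression and returns the longest epoch length in samples.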
Even though using the `neurokit` cleaning method might not cause an error in the delineation, I don't think the output can be used reliably.
clean_ecg_default = nk.ecg_clean(ecg, sampling_rate=250)
rp, rpeaks_default = nk.ecg_peaks(clean_ecg_default, sampling_rate=250)
wv, waves_peak = nk.ecg_delineate(clean_ecg_default, rpeaks_default["ECG_R_Peaks"], sampling_rate=250, show=True, method="peak")
Hi @Tam-Pham, thanks for reviewing. I share your thoughts, the signal is indeed complete rubbish. I was surprised to see the code crash on this particular example, as I expected an empty result in this case (no waves at all). Does that reasoning make sense?
I'm reporting this bug to make the internals of the library robust against these weird cases. I'm fine with closing this ticket as 'won't fix' as this is a corner case.
I will definitely check the signals before trying to find peaks with NeuroKit.
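One way to run such a check before calling `nk.ecg_delineate` is to reject R-peak arrays that are too short or that imply an implausible heart rate. Below is a minimal sketch. The helper name `rpeaks_look_usable` and its thresholds (`min_peaks`, `bpm_range`) are assumptions for illustration, not part of NeuroKit, and the peak indices are made up.

```python
def rpeaks_look_usable(rpeak_indices, sampling_rate, min_peaks=4, bpm_range=(30.0, 220.0)):
    """Hypothetical sanity check on R-peak sample indices (not NeuroKit API)."""
    peaks = sorted(int(p) for p in rpeak_indices)
    if len(peaks) < min_peaks:
        return False  # too few beats to estimate a heart rate
    # R-R intervals in seconds
    intervals = [(b - a) / sampling_rate for a, b in zip(peaks, peaks[1:])]
    if any(iv <= 0 for iv in intervals):
        return False  # duplicated peak indices
    mean_bpm = 60.0 / (sum(intervals) / len(intervals))
    return bpm_range[0] <= mean_bpm <= bpm_range[1]


# Made-up indices: three peaks only, as in the noisy recording discussed above
too_few = rpeaks_look_usable([100, 900, 2400], sampling_rate=250)

# Made-up indices: one peak per second at 250 Hz, i.e. 60 bpm
regular = rpeaks_look_usable([0, 250, 500, 750, 1000], sampling_rate=250)
```

If the check returns False, the delineation call can be skipped and an empty result returned, which is the behaviour I expected here.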
Hello, the same issue is happening to me.