NeuroKit Warning: ecg_hrv(): Correlation Dimension. Error: NeuroKit warning: complexity_entropy_multiscale(): Signal might be to short to compute SampEn for scale factors > 0. Setting max_scale_factor to 0.

Open Sanjay1995 opened this issue 6 years ago • 23 comments

I am passing the data as a 1D numpy array, one column at a time, taken from the i, ii, iii, v1, v2, v3, v4, v5, v6, avr, avl, avf features of the PTB database. I call nk.bio_process(ecg=ecg_signal[:,i], ecg_quality_model=None). It works on several columns but gives the error after 2 iterations. I have also checked that the data looks fine in all columns. Can anyone please help resolve the issue?

Sanjay1995 avatar Apr 14 '18 11:04 Sanjay1995

Hi @Sanjay1995, could you provide an example of your dataset? We'll try to fix the error. It might be related to a recent change in HRV computation (#58). Also linking @gattia just in case :)

DominiqueMakowski avatar Apr 14 '18 11:04 DominiqueMakowski

My dataset is the PTB ECG database. You can find it at https://www.physionet.org/physiobank/database/ptbdb/

Sanjay1995 avatar Apr 14 '18 12:04 Sanjay1995

Please read the description first. I want to know how to convert these twelve columns (i, ii, iii, avr, avl, avf, v1, v2, v3, v4, v5, v6) into features for ECG classification using your library NeuroKit. I would be very thankful.

Sanjay1995 avatar Apr 14 '18 12:04 Sanjay1995

@DominiqueMakowski

Sanjay1995 avatar Apr 14 '18 12:04 Sanjay1995

Based on the print out it seems that the sample entropy is having problems on the very first pass at the full resolution scale for the multi-scale analysis (this should be the same as just running sample entropy on the full data). This shouldn't be a problem from anything done to the multiscale entropy function recently.
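For context, a minimal sketch of the coarse-graining step used in the standard multiscale entropy procedure (this is an illustration, not NeuroKit's own code; the function name `coarse_grain` and the numbers are made up). It shows that scale 1 leaves the series unchanged, and that the series shrinks to roughly N / scale points at higher scales, which is why a short input leaves too few points for sample entropy:

```python
def coarse_grain(series, scale):
    """Average consecutive, non-overlapping windows of `scale` samples."""
    if scale < 1:
        raise ValueError("scale must be >= 1")
    n_windows = len(series) // scale
    return [
        sum(series[i * scale:(i + 1) * scale]) / scale
        for i in range(n_windows)
    ]


# Made-up values, for illustration only.
series = [812.0, 798.0, 805.0, 820.0, 790.0, 801.0]

scale_1 = coarse_grain(series, 1)  # same values as the input
scale_3 = coarse_grain(series, 3)  # only 2 points left

print(len(scale_1), len(scale_3))
```

If the warning already appears at the first scale, the series reaching the entropy function is probably very short to begin with.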

If I were debugging it, I'd be interested in what the data being passed to complexity_entropy_multiscale() looks like: what its shape and min/max values are, how it looks when graphed out, etc.
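A rough sketch of such a check, using only the standard library so it works on a plain list as well as on a 1D numpy column (the helper name `describe_signal` and the example values are hypothetical):

```python
import math


def describe_signal(signal):
    """Return basic sanity statistics for a 1D sequence of numbers."""
    values = [float(v) for v in signal]
    finite = [v for v in values if math.isfinite(v)]
    return {
        "length": len(values),
        "n_non_finite": len(values) - len(finite),  # NaN or +/-inf samples
        "min": min(finite) if finite else None,
        "max": max(finite) if finite else None,
        "is_flat": bool(finite) and min(finite) == max(finite),
    }


# Made-up samples, one of them NaN.
example = [0.1, 0.4, float("nan"), -0.2, 1.3]
summary = describe_signal(example)
print(summary)
```

A very small length, any non-finite samples, or a flat signal would each be worth looking at before blaming the entropy function.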

gattia avatar Apr 14 '18 12:04 gattia

@Sanjay1995 As I understand it, you're basically trying to run the ECG processing routine on all of the ECG leads. However, the routine first attempts to extract R peaks, then computes several indices based on these R peaks (heart rate, HRV, and so on). The default cardiac complex segmenter works preferentially with LEAD 1 (i in your data), so I believe it is quite normal that it doesn't work with the other signals. It seems that you're trying to compute the same features from different leads, which are not appropriate for the traditional segmenting.
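To see exactly which leads fail and with what message, one option is to wrap each call in a try/except. This is only a sketch: `run_per_lead` and `fake_process` are hypothetical names, and `fake_process` stands in for the real call (for example a function wrapping nk.bio_process(ecg=signal, ecg_quality_model=None) as in the original report):

```python
def run_per_lead(leads, process):
    """Apply `process` to every lead, collecting results and error messages."""
    results, errors = {}, {}
    for name, signal in leads.items():
        try:
            results[name] = process(signal)
        except Exception as exc:  # broad on purpose: we only log which lead failed
            errors[name] = f"{type(exc).__name__}: {exc}"
    return results, errors


def fake_process(signal):
    # Stand-in for the real processing call; it fails on empty input.
    return signal[0]


# Made-up data: lead "ii" is empty to trigger a failure.
leads = {"i": [0.1, 0.2], "ii": [], "v1": [0.3]}
results, errors = run_per_lead(leads, fake_process)
print(sorted(results), errors)
```

The loop keeps going after a failure, so a single run lists every problematic lead instead of stopping at the first one.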

I am not sure what your end goal is, but neurokit's ecg routine currently works preferentially with LEAD 1 data (for extracting features and then using them for whatever else), not for comparing different leads with each other. That being said, you could try changing the default segmenter (ecg_segmenter = "hamilton", "gamboa", "engzee", "christov" or "ssf"). Critically, check whether the R peaks were detected correctly. Also, try using "ecg_preprocess()" to simplify debugging.
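As a rough way to check the detected R peaks, here is a sketch that assumes you already have the peak positions as a list of sample indices (how you pull them out of the processing output is not shown). The function name `check_r_peaks` and the 30 to 220 bpm bounds are arbitrary assumptions, not values from neurokit:

```python
def check_r_peaks(peak_indices, sampling_rate, min_bpm=30.0, max_bpm=220.0):
    """Flag RR intervals whose implied heart rate looks implausible."""
    if len(peak_indices) < 2:
        raise ValueError("need at least two R peaks to compute an RR interval")
    rr_seconds = [
        (later - earlier) / sampling_rate
        for earlier, later in zip(peak_indices, peak_indices[1:])
    ]
    if any(rr <= 0 for rr in rr_seconds):
        raise ValueError("peak indices must be strictly increasing")
    bpm = [60.0 / rr for rr in rr_seconds]
    suspicious = [i for i, rate in enumerate(bpm) if not min_bpm <= rate <= max_bpm]
    return {"mean_bpm": sum(bpm) / len(bpm), "suspicious_intervals": suspicious}


# Made-up peak positions (in samples); 1730 -> 1790 is far too close together.
peaks = [120, 920, 1730, 1790, 2600]
report = check_r_peaks(peaks, sampling_rate=1000)
print(report["suspicious_intervals"])
```

An empty or near-empty peak list, or many flagged intervals, suggests the segmenter is not suited to that lead.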

I hope this was useful. Let me know of your progress,

@gattia thanks :)

DominiqueMakowski avatar Apr 14 '18 13:04 DominiqueMakowski

screenshot from 2018-04-14 18-23-07

Sanjay1995 avatar Apr 14 '18 13:04 Sanjay1995

This is what one column of my data, lead I (i, as I mentioned above), looks like.

Sanjay1995 avatar Apr 14 '18 13:04 Sanjay1995

@gattia

Sanjay1995 avatar Apr 14 '18 13:04 Sanjay1995

Thanks @DominiqueMakowski, it is really useful. But do ecg_preprocess() and bio_process() work alike in my case?

Sanjay1995 avatar Apr 14 '18 13:04 Sanjay1995

@Sanjay1995 yes, bio_process is just a wrapper for processing multiple signals (ECG, EDA, EMG etc.) at once. Using bio_process with only ECG is similar to using ecg_process. However, ecg_process itself uses the ecg_preprocess function, which only does low-level preprocessing (mainly extracting R peaks, without computing more complex indices such as HRV).

DominiqueMakowski avatar Apr 14 '18 13:04 DominiqueMakowski

As I have already mentioned, I am using the PTB ECG dataset, which your library also incorporates. But ecg_preprocess() also fails on some signals with the error (index 0 is out of bounds for axis 0), and I don't know why.

Sanjay1995 avatar Apr 14 '18 13:04 Sanjay1995

If I only process LEAD 1 (i in my data), would it give features ('T_Waves', 'Cardiac_Cycles', 'P_Waves', 'Q_Waves', 'HRV', 'R_Peaks') that help me classify the heart disease class?

Sanjay1995 avatar Apr 14 '18 13:04 Sanjay1995

Yes, you should use only the column of the dataset corresponding to i. I used the full PTB dataset only to create a machine learning model that automatically classifies the provided lead signal and returns the probability of correct classification (a proxy of signal quality). But for investigating ECG features, using only LEAD I is sufficient.

DominiqueMakowski avatar Apr 14 '18 14:04 DominiqueMakowski

@DominiqueMakowski thanks.

Sanjay1995 avatar Apr 14 '18 15:04 Sanjay1995

@DominiqueMakowski I am getting the same "index 0 is out of bounds" error from ecg_process, even though my signal has a length greater than 1.

waleedkaimkhani avatar Apr 14 '18 17:04 waleedkaimkhani

@waleedkaimkhani could you provide a sample of your data? thanks

DominiqueMakowski avatar Apr 14 '18 17:04 DominiqueMakowski

My dataset is the PTB ECG database. You can find it at https://www.physionet.org/physiobank/database/ptbdb/

waleedkaimkhani avatar Apr 14 '18 17:04 waleedkaimkhani

haha alright;

  1. did you correctly select a one-dimensional array OR a single pandas DataFrame column (corresponding to LEAD 1)?
  2. if yes, could you save this single column or array (as txt, csv or json) and attach it here, so I can check directly with the exact input you provide to neurokit's routines? Thanks 😅
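A minimal sketch of saving one lead to a CSV file with only the standard library. The helper name `save_column`, the file name, and the sample values are made up; the header "i" is chosen so the column can later be read back by name:

```python
import csv
import os
import tempfile


def save_column(values, path, header="i"):
    """Write a single column of samples to a CSV file with one header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([header])
        for value in values:
            writer.writerow([value])


# Made-up samples; in practice this would be the LEAD 1 column.
path = os.path.join(tempfile.gettempdir(), "lead_i.csv")
save_column([0.12, 0.15, 0.11], path)
print(path)
```

The resulting file has one value per line and can be attached to the issue as is.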

DominiqueMakowski avatar Apr 14 '18 17:04 DominiqueMakowski

screenshot from 2018-04-14 22 35 00

waleedkaimkhani avatar Apr 14 '18 17:04 waleedkaimkhani

or send it to me [email protected]

DominiqueMakowski avatar Apr 14 '18 17:04 DominiqueMakowski

@DominiqueMakowski I have sent you an email

waleedkaimkhani avatar Apr 14 '18 17:04 waleedkaimkhani

@waleedkaimkhani your code should look like this:

import neurokit as nk
import pandas as pd

df = pd.read_csv("file.csv")
ecg_processed = nk.ecg_process(ecg=df["i"], sampling_rate=1000)

DominiqueMakowski avatar Apr 17 '18 13:04 DominiqueMakowski